Software:CAM

From Atmospheric and Oceanic Science
Revision as of 22:36, 10 September 2021 by Ambrish (talk | contribs)

This page describes how to run CAM (the Community Atmosphere Model) on the Beluga cluster operated by Calcul Québec.

Get CAM

  1. Once connected to Beluga, get a copy of the CAM repository by running:

git clone https://github.com/ESCOMP/CAM

  1. The previous command creates the CAM directory. Change to that directory:

cd CAM

  1. We want one specific version of CAM, which is tagged in git. Check out that version for the runs:

git checkout cam6_3_000

  1. CAM also depends on external modules that must be downloaded. From the CAM directory, run [1]:

./manage_externals/checkout_externals
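
Before moving on, it can help to confirm that the externals were actually fetched. The following is a minimal sketch, not part of CAM: check_cam_checkout is a hypothetical helper, and it assumes that a complete checkout contains the cime/scripts/create_newcase script used in the next section.

```sh
#!/bin/sh
# Sketch: confirm that the externals were fetched into a CAM checkout.
# Assumption: a complete checkout contains cime/scripts/create_newcase,
# the script used to create a case in the next section.
check_cam_checkout() {
    cam_dir=${1:-.}
    if [ -x "$cam_dir/cime/scripts/create_newcase" ]; then
        echo "OK: found $cam_dir/cime/scripts/create_newcase"
    else
        echo "Missing: $cam_dir/cime/scripts/create_newcase" >&2
        echo "Re-run ./manage_externals/checkout_externals in $cam_dir" >&2
        return 1
    fi
}

# Usage: check the current directory (run this from inside CAM).
check_cam_checkout . || true
```

If the helper reports a missing script, re-running the checkout_externals step is the first thing to try.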

Run CAM

Once CAM and its modules have been obtained, we can run it.

  1. To do this, change to the cime/scripts directory:

cd cime/scripts

  1. Create a new case. This example creates a test case, as in the tutorial:

./create_newcase --case test_FHIST --res f09_f09_mg17 --compset FHIST

  1. This will create the test_FHIST directory. Change to that directory:

cd test_FHIST

  1. You can now set up the case. This will download any datasets needed and set up directories for building the code, etc.

./case.setup

  1. Once the case has been set up, you can build it in the same way:

./case.build

  1. To be able to submit and run the case, you need to modify one of the input files manually.
    1. To do so, edit the file env_batch.xml.
    2. In the XML file, there is a section called <batch_system type="slurm">. In that section, add a directive specifying the account to the <directives> section. For example, the section originally contained:
<batch_system type="slurm">
   <batch_query per_job_arg="-j">squeue</batch_query>
   <batch_cancel>scancel</batch_cancel>
   <batch_directive>#SBATCH</batch_directive>
   <jobid_pattern>(\d+)$</jobid_pattern>
   <depend_string> --dependency=afterok:jobid</depend_string>
   <depend_allow_string> --dependency=afterany:jobid</depend_allow_string>
   <depend_separator>,</depend_separator>
   <walltime_format>%H:%M:%S</walltime_format>
   <batch_mail_flag>--mail-user</batch_mail_flag>
   <batch_mail_type_flag>--mail-type</batch_mail_type_flag>
   <batch_mail_type>none, all, begin, end, fail</batch_mail_type>
   <directives>
     <directive> --job-name={{ job_id }}</directive>
     <directive> --nodes={{ num_nodes }}</directive>
     <directive> --ntasks-per-node={{ tasks_per_node }}</directive>
     <directive> --output={{ job_id }}   </directive>
     <directive> --exclusive                        </directive>
   </directives>
</batch_system>
 

This needs to be changed to:

 <batch_system type="slurm">
   <batch_query per_job_arg="-j">squeue</batch_query>
   <batch_cancel>scancel</batch_cancel>
   <batch_directive>#SBATCH</batch_directive>
   <jobid_pattern>(\d+)$</jobid_pattern>
   <depend_string> --dependency=afterok:jobid</depend_string>
   <depend_allow_string> --dependency=afterany:jobid</depend_allow_string>
   <depend_separator>,</depend_separator>
   <walltime_format>%H:%M:%S</walltime_format>
   <batch_mail_flag>--mail-user</batch_mail_flag>
   <batch_mail_type_flag>--mail-type</batch_mail_type_flag>
   <batch_mail_type>none, all, begin, end, fail</batch_mail_type>
   <directives>
     <directive> --job-name={{ job_id }}</directive>
     <directive> --nodes={{ num_nodes }}</directive>
     <directive> --ntasks-per-node={{ tasks_per_node }}</directive>
     <directive> --output={{ job_id }}   </directive>
     <directive> --exclusive                        </directive>
     <directive> --account=rrg-itan</directive>
   </directives>
 </batch_system>
 

The only change is the addition of the line <directive> --account=rrg-itan</directive> to the directives section. If you are using a different Compute Canada account, use that account name instead.
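
The manual edit described above can also be scripted. The following is a minimal sketch under stated assumptions: add_slurm_account is a hypothetical helper (not part of CIME), it assumes the file contains a <batch_system type="slurm"> section with a <directives> section laid out one tag per line as shown above, and it is demonstrated here on a cut-down temporary copy rather than on a real env_batch.xml.

```sh
#!/bin/sh
# Sketch: add an --account directive to the slurm block of env_batch.xml.
# add_slurm_account is a hypothetical helper, not part of CIME.
add_slurm_account() {
    file=$1
    account=$2
    # Do nothing if an account directive is already present.
    if grep -q -- '--account=' "$file"; then
        return 0
    fi
    cp "$file" "$file.bak"          # keep a backup of the original
    tmp=$(mktemp)
    awk -v acct="$account" '
        /<batch_system type="slurm">/ { in_slurm = 1 }
        # Insert the new directive just before the closing tag of the
        # first <directives> section inside the slurm block.
        in_slurm && /<\/directives>/ {
            print "     <directive> --account=" acct "</directive>"
            in_slurm = 0
        }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a cut-down copy of the section shown above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
<batch_system type="slurm">
   <directives>
     <directive> --exclusive                        </directive>
   </directives>
</batch_system>
EOF
add_slurm_account "$demo" rrg-itan
cat "$demo"
```

On a real case this would be called as add_slurm_account env_batch.xml followed by your own account name, from the case directory. It is worth comparing the result against the env_batch.xml.bak backup before submitting.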

  1. Once the change has been made, the case can be submitted to Beluga. Run:

./case.submit

  1. This will schedule jobs to run on Beluga. The output files will be in your /scratch directory. For example, my output was in:

/scratch/ambrish/out/initial_port_test/test_FHIST/run
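
Because the run directory differs for each user and case, it can be easier to ask the case for it than to guess the path. The following is a minimal sketch: show_run_dir is a hypothetical helper, and it assumes that the case directory contains CIME's xmlquery script and that the RUNDIR variable holds the run directory.

```sh
#!/bin/sh
# Sketch: print the run directory of a case.
# Assumes the case directory contains CIME's xmlquery script and that
# the RUNDIR variable holds the run directory; show_run_dir itself is
# a hypothetical helper.
show_run_dir() {
    case_dir=${1:-.}
    if [ ! -x "$case_dir/xmlquery" ]; then
        echo "No xmlquery script in $case_dir; is this a case directory?" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$case_dir" && ./xmlquery RUNDIR --value)
}

# Usage, from inside the case directory (for example test_FHIST):
show_run_dir . || true
```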