Cori for JGI
A subset of nodes on Cori, the flagship supercomputer at NERSC, is reserved for exclusive use by JGI users. All the features available on Cori Haswell nodes are also available through the JGI-specific "quality of service" (QOS).
All JGI staff and collaborators can submit a request to JGI management to be given access to Cori Genepool, the JGI-reserved fraction of Cori compute capacity. This service first became available in January 2018.
JGI staff and affiliates use their access to Cori Genepool by passing QOS arguments when submitting Slurm jobs.
- All JGI users must specify the Slurm account under which the job will run (with -A <youraccount>). Unlike other NERSC users, JGI users accessing the Genepool QOS do not have a default account.
- For jobs requiring one or more whole nodes, use --qos=genepool.
- For jobs which can share a node with other jobs, use --qos=genepool_shared.
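As a minimal sketch, the account and QOS options can also be placed in the batch script itself as #SBATCH directives. The example assumes a shared-node job; the task count, time limit, job name, and the final echo line are illustrative assumptions, and <youraccount> remains a placeholder for your own Slurm account.

```shell
#!/bin/bash
# Sketch of a batch script for the shared Genepool QOS.
# <youraccount> is a placeholder: replace it with your Slurm account.
#SBATCH --qos=genepool_shared
#SBATCH -A <youraccount>
# Assumption: a single task that finishes within one hour.
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
# Hypothetical job name, chosen only for this example.
#SBATCH --job-name=example

# Slurm sets SLURM_JOB_ID inside a job; fall back to a label when the
# script is run outside Slurm (for example, while testing it locally).
job_label="${SLURM_JOB_ID:-not-in-slurm}"
echo "Running as job ${job_label} on $(uname -n)"
```

Options given on the sbatch command line take precedence over the directives in the script, so the same script can be resubmitted under a different QOS without editing it.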
Each of the following items first requires module load esslurm:
- For large memory batch jobs use
- For large memory shared batch jobs use
- For large memory interactive jobs use
- For transfer jobs which write to the Data and Archive file system use
Jobs run under the Cori
xfer_dna QOSes are not charged. Resources are scheduled to the best of our ability, but interference with other users' workloads can still occur. Please be a good citizen to your fellow researchers. Users violating the spirit of this policy will find themselves less able to do so.
The JGI's Cori capacity is housed entirely on standard Haswell nodes: 32 physical cores, each with 2 hyperthreads, no local hard drives, and 128GB memory. It is not necessary to request -C haswell via Slurm if using a JGI QOS. KNL nodes are NOT available via a JGI QOS. To use KNL nodes, submit to one of Cori's standard QOSes (such as regular) and use the "m342" account. Be aware that jobs run with "m342" will charge NERSC allocation hours to JGI.
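The KNL case can be sketched as a small helper that only builds the command line, so it can be inspected before anything is submitted. The function name knl_sbatch_cmd is hypothetical, and the sketch assumes that "-C knl" is the constraint that selects KNL nodes and that "regular" is the standard QOS you want.

```shell
# Build (but do not run) an sbatch command line for a KNL job.
# Assumptions: "-C knl" selects KNL nodes and "regular" is the chosen
# standard QOS; m342 is the account named in the text above.
knl_sbatch_cmd() {
    local script="$1"
    echo "sbatch --qos=regular -C knl -A m342 ${script}"
}

# Print the command for review; jobs run this way are charged to JGI.
knl_sbatch_cmd yourscript.sh
```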
For a single-core shared job, you would minimally need:
sbatch --qos=genepool_shared -A <youraccount> yourscript.sh
To request an interactive session on a single node with all CPUs and memory:
salloc --qos=genepool -A <youraccount>
Don't forget that if the Cori Genepool QOS is full, the previous command can take a long time to give you a node.
In the earlier examples, youraccount is the project name you submit to, not your username or file group name. If you don't know what accounts you belong to, you can check with:
sacctmgr show associations where user=$USER
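The full association listing can be wide. A minimal sketch for reducing it to a plain list of account names follows. The helper name unique_accounts is hypothetical, and the sketch assumes that restricting the output with format=account and the --noheader and --parsable2 options gives one account name per line.

```shell
# Print the unique account names found in sacctmgr output read from stdin.
# Expects one account per line, as produced by the command below.
unique_accounts() {
    # Trim surrounding whitespace, drop empty lines, then de-duplicate.
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' | grep -v '^$' | sort -u
}

# Only query Slurm when sacctmgr is actually available on this machine.
if command -v sacctmgr >/dev/null 2>&1; then
    sacctmgr --noheader --parsable2 show associations \
        where user="$USER" format=account | unique_accounts
fi
```

Any name printed by this sketch can be passed to -A in place of <youraccount>.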
Cori Features and Other Things to Know
Cori offers additional features and capabilities that can be of use to JGI researchers:
Cori uses the Slurm job scheduler. Documentation and examples for using Slurm at NERSC can be found here.
Cori scratch is storage space for each user located on a Lustre file system accessible from Cori and Cori ExVivo. This directory can be found at /global/cscratch1/sd/$USER or by using the $CSCRATCH environment variable. Cori scratch is purged periodically; backing up data stored there is your responsibility. The HPSS Tape Data Archive or JGI JAMO system can be used for this purpose. See the NERSC Data Management Policy for more information on topics such as automatic file backups and scratch directory purge frequency.
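A minimal sketch of using the scratch location from a job script is shown here. The function name scratch_run_dir and the run name my_run are hypothetical; the fallback path is the one given above, used only when $CSCRATCH is not set.

```shell
# Return a per-run working directory under Cori scratch.
# Falls back to the documented path when CSCRATCH is not set.
scratch_run_dir() {
    local run_name="$1"
    local base="${CSCRATCH:-/global/cscratch1/sd/${USER:-}}"
    echo "${base}/${run_name}"
}

# Example: create the directory only if the scratch area is present,
# so the sketch is harmless on machines without Cori scratch.
run_dir="$(scratch_run_dir my_run)"
if [ -d "$(dirname "$run_dir")" ]; then
    mkdir -p "$run_dir"
fi
echo "$run_dir"
```

Because scratch is purged, results written to such a directory still need to be copied to archival storage once the job finishes.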
The performance of the different file systems varies significantly depending on what your application is doing. It's worth experimenting with your data in different locations to see what gives the best results.
JGI Partition Configuration
| Setting | Value |
|---|---|
| Job limits | 5000 exclusive jobs, or 10000 shared jobs |
| Run time limits | 72 h |
| Partition size | 192 nodes |
| Node configuration | 32-core Haswell CPUs (64 hyperthreads), 128GB memory |
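As a small sketch of working within the 72 h run time limit listed above, the helper below formats a whole number of hours as a Slurm --time value and refuses requests above the limit. The function name time_arg is hypothetical, and the check only mirrors the figure in the table; the scheduler remains the authority on what is accepted.

```shell
# Format a whole number of hours as a Slurm --time value, refusing
# anything above the 72 h run time limit listed in the table.
time_arg() {
    local hours="$1"
    if [ "$hours" -gt 72 ]; then
        echo "requested ${hours} h exceeds the 72 h limit" >&2
        return 1
    fi
    printf -- '--time=%d:00:00\n' "$hours"
}

# Example: a two-day request stays within the limit.
time_arg 48
```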