# Cori
Cori is scheduled to retire at the end of April 2023.
Please refer to the Migrating from Cori to Perlmutter page for the detailed Cori retirement plan and information about migrating your applications to Perlmutter.
Cori is a Cray XC40 with a peak performance of about 30 petaflops. The system is named in honor of American biochemist Gerty Cori, the first American woman to win a Nobel Prize in science and the first woman to be awarded the prize in Physiology or Medicine. Cori comprises 2,388 Intel Xeon "Haswell" processor nodes and 9,688 Intel Xeon Phi "Knights Landing" (KNL) nodes. The system also has a large Lustre scratch file system and a first-of-its-kind NVRAM "burst buffer" storage device.
## Cori Retirement
Cori had its first users in 2015, and since then NERSC's longest-running system has been a valuable resource for thousands of users and projects. With the complete Perlmutter system scheduled to be operational for the 2023 allocation year, NERSC plans to decommission Cori at the end of April 2023.
## Usage
- Accounts
- Getting Started
- Connecting: (SSH) (MFA)
- Example jobs
- Precompiled Applications
- Queue Policies
## System Overview
| System Partition | # of cabinets | # of nodes | Aggregate Theoretical Peak | Aggregate Memory |
|---|---|---|---|---|
| Login | - | 20 | - | - |
| Haswell | 14 | 2,388 | 2.81 PFlops | 298.5 TB |
| KNL | 54 | 9,688 | 29.5 PFlops | 1.09 PB |
| Large Memory | - | 20 | 30.7 TFlops | 40 TB |
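As a quick consistency check (a minimal sketch, not an official NERSC tool), the aggregate peaks in this table can be reproduced from the per-core figures quoted under Node Specifications below; the large memory per-node figure is back-computed from the table itself:

```python
# Minimal sketch (illustration only): reproduce the aggregate peak
# column from per-node figures quoted elsewhere on this page.
partitions = {
    # name: (number of nodes, per-node theoretical peak in TFlops)
    "Haswell":      (2388, 36.8e-3 * 32),  # 36.8 GFlops/core x 32 cores
    "KNL":          (9688, 44.8e-3 * 68),  # 44.8 GFlops/core x 68 cores
    "Large Memory": (20, 30.7 / 20),       # back-computed from the table
}

for name, (nodes, per_node_tflops) in partitions.items():
    total_pflops = nodes * per_node_tflops / 1000
    print(f"{name}: {nodes} nodes x {per_node_tflops:.2f} TFlops/node "
          f"= {total_pflops:.2f} PFlops")
```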
## File Systems
- Cori scratch
- Burst Buffer (usage is discouraged)
- File systems at NERSC
## System Specification
| System Partition | Processor | Clock Rate | Physical Cores Per Node | Threads/Core | Sockets Per Node | Memory Per Node |
|---|---|---|---|---|---|---|
| Login | Intel Xeon Processor E5-2698 v3 | 2.3 GHz | 32 | 2 | 2 | 515 GB |
| Haswell | Intel Xeon Processor E5-2698 v3 | 2.3 GHz | 32 | 2 | 2 | 128 GB |
| KNL | Intel Xeon Phi Processor 7250 | 1.4 GHz | 68 | 4 | 1 | 96 GB (DDR4), 16 GB (MCDRAM) |
| Large Memory | AMD EPYC 7302 | 3.0 GHz | 32 | 2 | 2 | 2 TB |
Each XC40 cabinet housing Haswell and KNL nodes has 3 chassis; each chassis has 16 compute blades with 4 nodes per blade, i.e. up to 3 × 16 × 4 = 192 compute nodes per cabinet. Login nodes and the Large Memory nodes are in separate cabinets.
## Interconnect
Cray Aries interconnect with a Dragonfly topology and >45 TB/s global peak bisection bandwidth.
Details about the interconnect
## Node Specifications
Please note that the amount of memory reported on this page for each node type is the amount of physical memory installed; the memory available to users may be around 5-10 GB less, due to OS processes, file system caches, etc.
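For illustration, here is a minimal sketch (assuming a standard Linux `/proc/meminfo`, not an official NERSC tool) that reports the memory the kernel actually manages on a node, which already reads below the installed figures in the tables above:

```python
# Minimal sketch: report MemTotal from /proc/meminfo on a Linux node.
# MemTotal is what the kernel manages, which is below the installed
# physical memory (and memory usable by applications is lower still).
def mem_total_gib(path="/proc/meminfo"):
    with open(path) as f:
        for line in f:
            if line.startswith("MemTotal:"):
                kib = int(line.split()[1])  # value is reported in kB (KiB)
                return kib / 1024**2        # KiB -> GiB
    raise RuntimeError("MemTotal not found in " + path)

print(f"MemTotal: {mem_total_gib():.1f} GiB")  # e.g. ~125 GiB on a 128 GB Haswell node
```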
### Login Nodes
- Cori has 12 login nodes (`cori[01-12]`) open to the public.
- 2 large memory login nodes (`cori[22,23]`) for submitting to the `bigmem` qos. These nodes have 750 GB of memory.
- 4 Jupyter nodes (`cori[13,14,16,19]`), accessed via Jupyter.
- 2 workflow nodes (`cori[20,21]`); approval is required before access.
- 1 compile node (`cori17`); approval is required before access.
- Each node has two sockets; each socket is populated with a 2.3 GHz 16-core Haswell processor.
The "connect" page has more details on connecting to login nodes.
### Haswell Compute Nodes
- Each node has two sockets; each socket is populated with a 2.3 GHz 16-core Intel Xeon Processor E5-2698 v3 ("Haswell").
- Each core supports 2 hyper-threads and has two 256-bit-wide vector units.
- 36.8 GFlops/core (theoretical peak; see the worked arithmetic after this list)
- 1.2 TFlops/node (theoretical peak)
- 2.81 PFlops total (theoretical peak)
- Each node has 128 GB DDR4 2133 MHz memory (four 16 GB DIMMs per socket).
- 298.5 TB total aggregate memory
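These peak figures follow from the clock rate and vector hardware, assuming both 256-bit vector units (four double-precision lanes each) retire a fused multiply-add (2 flops per lane) every cycle:

$$
2.3\,\text{GHz} \times 2\,\text{units} \times 4\,\tfrac{\text{lanes}}{\text{unit}} \times 2\,\tfrac{\text{flops}}{\text{lane}} = 36.8\,\tfrac{\text{GFlops}}{\text{core}}
$$

$$
36.8\,\tfrac{\text{GFlops}}{\text{core}} \times 32\,\text{cores} \approx 1.2\,\tfrac{\text{TFlops}}{\text{node}},
\qquad
1.18\,\text{TFlops} \times 2{,}388\,\text{nodes} \approx 2.81\,\text{PFlops}
$$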
### KNL Compute Nodes
- Each node is a single-socket Intel Xeon Phi Processor 7250 ("Knights Landing") with 68 cores @ 1.4 GHz.
- Each core has two 512-bit-wide AVX-512 vector processing units, with a hardware vector length of 512 bits (eight double-precision elements).
- Each core supports 4 hardware threads (272 threads per node total).
- 44.8 GFlops/core (theoretical peak; see the worked arithmetic after this list)
- 3 TFlops/node (theoretical peak)
- 29.5 PFlops total (theoretical peak)
- Each node has 96 GB DDR4 2400 MHz memory, six 16 GB DIMMs (102 GiB/s peak bandwidth).
- Each node has 16 GB MCDRAM (multi-channel DRAM) with > 460 GB/s peak bandwidth.
- Total aggregate memory (DDR4 plus MCDRAM) is 1.09 PB.
- Each core has 64 KB of L1 cache (32 KB instruction, 32 KB data).
- Processor cores are connected in a 2D mesh network of tiles; each tile contains 2 cores sharing a cache-coherent 1 MB L2 cache.
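As with Haswell, the KNL peaks follow from the clock rate and vector hardware, assuming both AVX-512 units (eight double-precision lanes each) retire a fused multiply-add every cycle:

$$
1.4\,\text{GHz} \times 2\,\text{units} \times 8\,\tfrac{\text{lanes}}{\text{unit}} \times 2\,\tfrac{\text{flops}}{\text{lane}} = 44.8\,\tfrac{\text{GFlops}}{\text{core}}
$$

$$
44.8\,\tfrac{\text{GFlops}}{\text{core}} \times 68\,\text{cores} \approx 3\,\tfrac{\text{TFlops}}{\text{node}},
\qquad
3.05\,\text{TFlops} \times 9{,}688\,\text{nodes} \approx 29.5\,\text{PFlops}
$$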
#### CPU topology
Alternative node topologies are available via reservation only.
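To see the topology a job actually gets, a minimal sketch (Linux only, not an official NERSC tool) is to count the logical CPUs the kernel exposes to your process; on an unrestricted KNL node this is up to 272 (68 cores × 4 hardware threads):

```python
# Minimal sketch, Linux only: count logical CPUs visible to this process.
# Slurm CPU binding shrinks this set to the job's allocated CPUs.
import os

visible = os.sched_getaffinity(0)  # CPU ids in this process's affinity mask
print(f"{len(visible)} logical CPUs visible to this process")
```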