Community File System (CFS)
The Community File System (CFS) is a global file system available on all NERSC computational systems. It allows sharing of data between users, systems, and the "outside world".
Every MPP repository has an associated Community directory and Unix group. Community directories are created in /global/cfs/cdirs. All members of the project have access through their membership in the Unix group. The environment variable $CFS (which expands to /global/cfs/cdirs/) can be used to access your CFS directory:
nersc$ cd $CFS/<your_project_name>
Occasionally there are cases where the above model is too limiting. For example, large projects with multiple working groups may wish to have separate Community directories with separate quotas for each working group.
In these cases, a PI or PI Proxy for a repository may request an additional Community directory with a specific name. This will result in the creation of a new Unix group with that name, consisting solely of the requestor, followed by the creation of the Community directory itself. The PI or PI Proxy must then use NIM to add users to the newly-created Unix group to give them access to the new Community directory.
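Because access depends on Unix group membership, a user can check whether they are already in the group that owns a given Community directory. The following is a minimal sketch, not a NERSC-provided tool: the function name in_cdir_group is hypothetical, and it assumes GNU coreutils stat (the -c option).

```shell
# Hypothetical helper: succeed if the current user belongs to the
# Unix group that owns the given directory.
# Assumes GNU coreutils `stat` (the -c format option).
in_cdir_group() {
    local dir="$1"
    local gid
    # %g prints the numeric ID of the directory's owning group
    gid=$(stat -c %g "$dir") || return 2
    # `id -G` prints every numeric group ID of the current user
    id -G | tr ' ' '\n' | grep -qx -- "$gid"
}
```

For example, in_cdir_group "$CFS/<your_project_name>" && echo "group member" prints the message only when the calling user is in the owning group. A return status of 2 means the directory could not be inspected at all.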
All of a repository's Community directories share one quota. The PI or PI Proxy can adjust the relative quota amounts by opening a ticket with NERSC.
Quotas on Community are determined by DOE Program Managers based on information PIs supply in their yearly ERCAP requests. If you need a mid-year quota increase on the Community File System, please contact your DOE Program Manager. You can find DOE Program Manager contact information here
See quotas for detailed information about inode and space quotas and file system purge policies.
The system has a peak aggregate bandwidth of at least 100 GB/sec for streaming I/O. User applications that depend on high bandwidth for streaming large files can use the Community File System, but Cori scratch or the Burst Buffer is recommended instead.
All NERSC users should back up important files on a regular basis. Ultimately, it is the user's responsibility to prevent data loss. However, NERSC provides some mechanisms to protect against data loss.
Community directories use a snapshot capability to provide users a seven-day history of their contents. Every directory and sub-directory in a Community directory contains a ".snapshots" entry.
- .snapshots is invisible to find and similar commands
- Contents are visible through ls -F .snapshots
- Can be browsed normally after changing into the .snapshots directory
- Files cannot be created, deleted or edited in snapshots
- Files can only be copied out of a snapshot
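Since files can only be copied out of a snapshot, recovering an earlier version amounts to a copy. The sketch below is illustrative only: the function name restore_from_snapshot is hypothetical, and it assumes each snapshot appears as a subdirectory of .snapshots that mirrors the live directory. List the actual snapshot names with ls -F .snapshots first.

```shell
# Hypothetical helper: restore_from_snapshot DIR SNAPSHOT FILE
# Copies DIR/.snapshots/SNAPSHOT/FILE to DIR/FILE.restored.
# Assumes snapshots appear as subdirectories of .snapshots.
restore_from_snapshot() {
    local dir="$1" snap="$2" file="$3"
    local src="$dir/.snapshots/$snap/$file"
    if [ ! -f "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # -p keeps mode and timestamps; the copy is written beside the
    # live file so the current version is not overwritten.
    cp -p -- "$src" "$dir/$file.restored"
}
```

A call such as restore_from_snapshot "$CFS/<your_project_name>" <snapshot_name> results.dat would leave the recovered copy in results.dat.restored, where results.dat is only an example file name. Writing to a separate name lets the snapshot version be compared with the current file before anything is replaced.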
Community directories will remain in existence as long as the owning project is active. Projects typically "end" at the end of a NERSC Allocation Year. This happens when the PI chooses not to renew the project, or DOE chooses not to provide an allocation for a renewal request. In either case, the following steps occur after the project ends:
-365 days - The start of the new Allocation Year and no Project renewal
The data in the Community directory will remain available on the Community File System until the start of the next Allocation Year.
+0 days - The start of the following Allocation Year
PIs are notified that the affected Community directory will be archived and then removed from the file system in 90 days.
The Community directory will become read-only.
The full pathname to the Community directory will be modified. Automated scripts will likely fail.
User access to the directory will be terminated. The directory will then be archived in HPSS, under ownership of the PI, and subsequently removed from the file system.
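Automated scripts that write to a Community directory can check its state before doing any work, so that a read-only or renamed directory produces a clear message instead of a failure partway through. This is a minimal sketch with a hypothetical function name, cdir_status. It only reports what the calling user is currently permitted to do and makes no assumption about how NERSC enforces each stage.

```shell
# Hypothetical guard for automated scripts: print the state of a
# directory as seen by the calling user.
cdir_status() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        # Path is gone, renamed, or no longer accessible
        echo "missing"
    elif [ ! -w "$dir" ]; then
        # Directory exists but the user cannot write to it
        echo "read-only"
    else
        echo "writable"
    fi
}
```

A script could begin with a check such as [ "$(cdir_status "$CFS/<your_project_name>")" = "writable" ] || exit 1 and stop early when the directory is no longer usable.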