Sharing Data
Sharing data with other users must be done carefully. Set permissions to the minimum necessary to achieve the desired access. For instance, consider whether write permission is really necessary before granting it; often read permission is enough. Be sure to keep archived backups of any critical shared data. It is also important to ensure that private login secrets (like SSH private keys or Apache htaccess files) are not shared with other users, either intentionally or accidentally. Good practice is to keep such files in a separate directory that is as locked down as possible, e.g. by removing group and other permissions with chmod g-rwx,o-rwx <directory>. Please see our permissions page for a longer discussion of file permissions.
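As a minimal sketch of the lock-down step described above, the commands below create a deliberately too-open directory and then strip its group and other permissions. The directory name private_keys and the temporary location are hypothetical stand-ins chosen for illustration, and stat -c assumes GNU coreutils.

```shell
# Hypothetical location: a temporary directory stands in for your home area.
workdir="$(mktemp -d)"
secrets_dir="$workdir/private_keys"   # hypothetical directory name

mkdir "$secrets_dir"
chmod 755 "$secrets_dir"              # simulate a directory that is too open

# Remove all group and other permissions, leaving only the owner's.
chmod g-rwx,o-rwx "$secrets_dir"

# Show the resulting mode; it should read drwx------ (octal 700).
stat -c '%A %n' "$secrets_dir"
```

The symbolic form only removes bits, so the owner's existing permissions are left as they were.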
Also take a look at the NERSC Data Management policy.
Sharing Data Inside NERSC
Sharing Data Within Your Project
The easiest way to share data within your project at NERSC is to use the Community File System (CFS). Permissions on CFS directories are set up to be group readable and writable by default, and any permissions drift can be corrected by the PIs using the PI toolbox.
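Before asking a PI to correct permissions drift, it can help to see which entries have lost group access. The function below is a rough sketch, not a NERSC tool: the name find_group_drift is made up for this example, and it simply lists anything under a directory that lacks group read or group write permission.

```shell
# Hypothetical helper: list entries under a directory that are missing
# group read or group write permission (either bit missing counts as drift).
find_group_drift() {
    # -perm -g=rw matches entries with BOTH group bits set; "!" inverts that.
    find "$1" ! -perm -g=rw -print
}
```

It would be called with the path to your project's directory, for example find_group_drift /path/to/your/project/directory, where the path is a placeholder.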
PIs can also request an HPSS Project Directory to share HPSS data within their project.
Sharing Data Outside Your Project
Sharing One Time
If you want to share just a few files a single time, you can use NERSC's give/take utility.
Sharing Indefinitely
If you have a large volume of data you'd like to share with several NERSC users outside your project, you may want to consider creating a dedicated top-level CFS directory that's shared between projects. Project PIs can request a new CFS directory and can also request that the directory be owned by a Linux group made up of users from different projects.
If you only want to share with one or two users for an indefinite period, you might want to consider setting the Linux permissions on your files and directories so that those users can access them. Generally it's better to use ACLs to grant access rather than to make your directory world-readable. The example below shows how user elvis could grant user adele access to their scratch directory:
nersc$ setfacl -m u:adele:rx /pscratch/sd/e/elvis
nersc$ setfacl -m u:adele:rx /pscratch/sd/e/elvis/shared_directory
Note that anyone reading lower directories must have execute (aka x) permissions on the higher directories so they can traverse them, which is why adele must have x permissions on elvis's top level directory.
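To check the traversal requirement described above, one option is to print the permissions of a directory and each of its ancestors. The function below is a small sketch under stated assumptions: show_path_modes is a hypothetical name, and stat -c assumes GNU coreutils. A trailing + in the mode string indicates that extra access rules such as ACLs are set on that entry.

```shell
# Hypothetical helper: print mode, owner, and name for a directory and
# every parent directory above it, stopping before the filesystem root.
show_path_modes() {
    dir="$1"
    # dirname eventually yields "/" (absolute paths) or "." (relative paths).
    while [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        stat -c '%A %U %n' "$dir"
        dir="$(dirname "$dir")"
    done
}
```

For the example above, elvis could run show_path_modes /pscratch/sd/e/elvis/shared_directory to see each level that adele needs to traverse. Running getfacl on a directory lists the individual ACL entries.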
Don't Use ACLs If You Want to Use These Directories in Batch Jobs
The Community File System is served by DVS on NERSC compute nodes. Adding an ACL will slow down reading from this directory during batch jobs. Please see our DVS page for more information.
Sharing Data Outside of NERSC
Data on the Community File System can be shared with users outside of NERSC through Globus Guest Collections.
Data can also be shared via Science Gateways.
If you have a collaborator who wishes to share data with you from outside, and that collaborator is not a NERSC user, you will need to use a non-NERSC service. See Files From Non-Users for some options.