This page outlines tips on managing the user shell environment and startup scripts for NERSC systems. Please see the shell startup page for a detailed explanation of shells, startup shell files, and different shell modes.
NERSC User Environment¶
Shells and dotfiles¶
NERSC supports the bash, csh, and tcsh login shells. Other shells (such as zsh) are also available. The default login shell at NERSC is bash. NERSC does not populate shell initialization files (also known as "dotfiles") in users' home directories. The same home file system is mounted on all NERSC resources, meaning that the same dotfiles are used across all compute systems.
NERSC provides template dotfiles at https://software.nersc.gov/NERSC/dotfiles. Before copying the NERSC dotfiles, we recommend backing up your existing dotfiles. Then, copy the contents of the template dotfiles into your home directory.
Home directories are shared between Cori and Perlmutter, which means your dotfiles must be compatible with both systems; otherwise, you will run into errors.
You can create your own dotfiles instead of using our template files. We recommend testing your changes by starting a new shell and checking that the configuration matches your expectations.
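As a minimal sketch of the backup step recommended above, the following bash function copies any existing dotfiles into a timestamped directory. The function name backup_dotfiles, the backup directory name, and the list of file names are illustrative assumptions, not part of the NERSC templates.

```shell
#!/bin/bash
# Sketch: copy existing dotfiles into a timestamped backup directory.
# backup_dotfiles, the directory name, and the file list are hypothetical.
backup_dotfiles() {
    local home_dir="$1"
    local backup_dir="$home_dir/dotfiles-backup-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$backup_dir" || return 1
    local f
    for f in .bashrc .bash_profile .profile .cshrc .tcshrc .login .zshrc; do
        # Only copy files that actually exist; -p keeps mode and timestamps.
        if [ -f "$home_dir/$f" ]; then
            cp -p "$home_dir/$f" "$backup_dir/"
        fi
    done
    # Print the backup location so the caller can record it.
    echo "$backup_dir"
}
```

Calling backup_dotfiles "$HOME" before copying the templates prints the directory that holds the copies.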
No more .ext dotfiles at NERSC since February 21, 2020.
NERSC used to reserve the standard dotfiles (such as ~/.login) for system use, and users put their shell modifications into the corresponding .ext files (e.g., ~/.bash_profile.ext). This is no longer the case! You can now modify the standard dotfiles for your personal use.
The actual dotfile transition occurred during the center maintenance on February 21-25, 2020. To mitigate any interruptions to existing workloads, we preserved shell environments by replacing dotfiles with template dotfiles that source .ext files. For example, we changed the ~/.bashrc file to look like this:

    # begin .bashrc
    if [ -z "$SHIFTER_RUNTIME" ]
    then
        . $HOME/.bashrc.ext
    fi
    # end .bashrc

We recommend that users whose accounts were created before February 2020 move the contents of their
~/.bashrc.ext file into their
~/.bashrc file (and remove the .ext files afterwards).
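Before moving the contents over, it can help to see which .ext files are still present in your home directory. The following is a minimal sketch; the function name list_ext_dotfiles is hypothetical.

```shell
#!/bin/bash
# Sketch: list leftover .ext dotfiles (e.g. .bashrc.ext) in a directory.
# list_ext_dotfiles is a hypothetical helper, not a NERSC-provided command.
list_ext_dotfiles() {
    local dir="${1:-$HOME}"
    local f
    for f in "$dir"/.*.ext; do
        # When nothing matches, the pattern stays literal; -f filters it out.
        if [ -f "$f" ]; then
            echo "$f"
        fi
    done
}
```

Running list_ext_dotfiles with no argument checks $HOME; an empty result suggests there is nothing left to migrate.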
Changing your default login shell¶
Use Iris to change your default login shell. Log in, then under the "Details" tab look for the "Server Logins" section. Click on "Edit" under the "Actions" column.
Customizing your shell environment¶
bash users can add startup configurations, such as environment variables, aliases, and functions, to the ~/.bashrc file to make them accessible in subshells. By default, bash reads ~/.bashrc when it starts as an interactive non-login shell; a non-interactive shell (for example, one running a shell script) reads it only if the BASH_ENV variable points to it.
csh users should specify their configuration in
~/.tcshrc, which will be available in interactive login and interactive non-login mode.
All NERSC systems share the same global home file system; a user's $HOME environment variable points to the same directory on every NERSC platform. To make system-specific customizations, use the pre-defined environment variable NERSC_HOST.
Don't set NERSC_HOST
Some older dotfiles set NERSC_HOST without checking whether this variable is set first. Generally you should not need to do this, so we advise you not to set NERSC_HOST in your dotfiles. If you must set NERSC_HOST for some reason, it's good practice to check whether the variable is already set before overwriting it. In bash you can do this with if [ -z "$var" ]; then var="mysettings"; fi.

    case $NERSC_HOST in
        "perlmutter")
            : # settings for Perlmutter
            export MYVARIABLE="value-for-perlmutter"
            ;;
        "cori")
            : # settings for Cori
            export MYVARIABLE="value-for-cori"
            ;;
        "datatran")
            : # settings for DTN nodes
            export MYVARIABLE="value-for-dtn"
            ;;
        *)
            : # default value for other nodes
            export MYVARIABLE="default-value"
            ;;
    esac

darshan and altd¶
NERSC loads darshan, a light I/O profiling tool, and altd, a library tracking tool, on Cori by default. If you encounter any problems with them, you can unload them in your dotfiles:

    module unload darshan
    module unload altd

If you run shifter applications, you may want to skip the dotfiles. You can use the following if block in your dotfiles:

    if [ -z "$SHIFTER_RUNTIME" ]; then
        : # Settings for when *not* in shifter
    fi

Missing NERSC variables¶
If any NERSC-defined environment variables, such as $SCRATCH, are missing in your shell invocations, you can add them in your ~/.bashrc file as follows:

    if [ -z "$SCRATCH" ]; then
        export SCRATCH=/global/cscratch1/sd/$USER
    fi

If you run bash scripts in crontabs, you may want to invoke a login shell (
#!/bin/bash -l) in order to get the NERSC-defined environment variables, such as
CSCRATCH, and to get the module command defined.
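A cron script can fail in confusing ways when a variable it relies on is empty. One hedged way to catch this early is to check the required variables at the top of the script. The helper name require_env is hypothetical, and the sketch relies on bash indirect expansion (${!name}).

```shell
#!/bin/bash -l
# Sketch: report which required environment variables are unset or empty.
# require_env is a hypothetical helper, not a NERSC-provided command.
require_env() {
    local name missing=0
    for name in "$@"; do
        # ${!name} expands to the value of the variable whose name is in $name.
        if [ -z "${!name}" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

At the top of a cron script, a line such as require_env SCRATCH || exit 1 stops the job before it can write to an unintended location.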
Troubleshooting user environment issues¶
If you are facing issues with your user environment, we have some recommendations to help you diagnose the problem.
First, we recommend you check the shell startup files used by your shell type (for example, bash, zsh, or tcsh). Most user environment issues can be resolved by reviewing the content of your startup files. For bash users, check your $HOME/.bashrc file to see if an environment issue is caused by this file. For csh/tcsh, check $HOME/.cshrc, and for zsh, check $HOME/.zshrc. If you update your startup files, you can source them to apply the changes to the current shell (source $HOME/.bashrc) or log out and log back in.
If you want to know where environment variables are set, you will need to understand the shell startup files. When you ssh into Cori/Perlmutter you are in an interactive login shell. bash users will want to look at the table outlined in bash startup files. The /etc/profile script, which is typically sourced during shell login, is available on any Linux distribution, but its contents may vary by distribution. During shell initialization, the shell will source files in /etc/profile.d/*, which are startup files added by the site administrator to provide system-wide defaults to all users. We encourage you to review the content of each file if you need to troubleshoot your environment. Note that /etc/profile and the files in /etc/profile.d/* are owned by the root user, so you cannot edit them, but it's good to check these files when tracing issues related to the startup environment.
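To trace where a variable is set, one option is to search the startup files for its name. The sketch below is an assumption-laden helper, not a NERSC tool: the function name find_var_in_startup_files is hypothetical, and the default file list covers only common bash locations and may not match every system.

```shell
#!/bin/bash
# Sketch: print file:line:text for every mention of a variable name
# in a set of startup files. find_var_in_startup_files is hypothetical.
find_var_in_startup_files() {
    local var="$1"
    shift
    local files=("$@")
    # With no explicit files, fall back to common bash startup locations.
    if [ "${#files[@]}" -eq 0 ]; then
        files=(/etc/profile /etc/profile.d/* "$HOME/.bash_profile" "$HOME/.bashrc")
    fi
    local f
    for f in "${files[@]}"; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        # grep exits non-zero when nothing matches; that is not an error here.
        grep -Hn -- "$var" "$f" || true
    done
    return 0
}
```

For example, find_var_in_startup_files LD_LIBRARY_PATH lists each line in the default files that mentions that variable.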
Second, you can review the modules loaded at startup. All user environments are initially loaded with a pre-determined set of modulefiles selected by the site administrators. You should review the content of your active modules by running
module list, then analyze the content of each modulefile by running
module show <modulefile>. Many users include
module load statements in their
~/.bashrc to customize their startup modules, but this can cause unexpected side-effects when loading other modules.
Here are some additional tips to help you troubleshoot environment issues:
- Check for environment variables like LD_LIBRARY_PATH in startup scripts such as ~/.bashrc that may cause issues. A common mistake is to reset one of these environment variables instead of prepending or appending additional paths. Setting export PATH=/path/to/dir replaces your entire search path, so standard commands will no longer be found; instead set export PATH=/path/to/dir:$PATH, which prepends the directory to $PATH.
- Check all environment variables set in your terminal via printenv. If you are looking for a particular pattern, you can grep for it within the long output, e.g., printenv | grep -i petsc (the -i option makes the search case-insensitive).
- Always check the path to the binary that is being run. For instance, if you want to run a python script, double check the path to the python wrapper by invoking which python and see if the path makes sense.
- Make sure you are on the right machine! The environment variable NERSC_HOST will show you which machine you are logged into. The expected values for Cori and Perlmutter are:

      elvis@cori> echo $NERSC_HOST
      cori
      elvis@perlmutter> echo $NERSC_HOST
      perlmutter

- Check whether you are on a login node or a compute node by invoking hostname. If the output starts with nid*, chances are you are on a compute node.
- If your shell prompt gets clobbered, try running
reset, which will reset your terminal settings.
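The first tip above warns against overwriting PATH. A small helper can make prepending the habit and also avoid duplicate entries when a dotfile is sourced more than once. This is a minimal sketch; path_prepend is a hypothetical name, and /path/to/dir is the same placeholder used in the tip.

```shell
#!/bin/bash
# Sketch: prepend a directory to PATH unless it is already present.
# path_prepend is a hypothetical helper name.
path_prepend() {
    local dir="$1"
    case ":$PATH:" in
        *":$dir:"*)
            ;;  # already on PATH, nothing to do
        *)
            # ${PATH:+:$PATH} adds the separator only when PATH is non-empty.
            PATH="$dir${PATH:+:$PATH}"
            ;;
    esac
    export PATH
}
```

In a dotfile, path_prepend /path/to/dir would replace a line such as export PATH=/path/to/dir:$PATH.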
Troubleshooting shell scripts¶
Running shell scripts¶
You can run a shell script with your preferred shell (e.g., sh script.sh), or you can run it directly by specifying a full or relative path to the script. To run a script directly by its path, the file must be executable. In the example below there is a permission error because the file doesn't have execute permission (x). You can fix this by running chmod +x script.sh.

    elvis@cori> ./script.sh
    bash: permission denied: ./script.sh
    elvis@cori> ls -l script.sh
    -rw-rw---- 1 elvis elvis 126 Apr 1 08:43 script.sh

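The permission check and fix described above can also be scripted. The following sketch tests for the execute bit before changing it; the function name ensure_executable is hypothetical, and it grants execute permission to the file owner only (u+x), which is a narrower choice than the chmod +x shown in the text.

```shell
#!/bin/bash
# Sketch: make sure a script has execute permission before running it.
# ensure_executable is a hypothetical helper name.
ensure_executable() {
    local script="$1"
    if [ ! -f "$script" ]; then
        echo "no such file: $script" >&2
        return 1
    fi
    # -x tests whether the current user may execute the file.
    if [ ! -x "$script" ]; then
        chmod u+x "$script"
    fi
}
```

A typical use would be ensure_executable script.sh && ./script.sh.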
Using strict running modes¶
Running a script in a stricter mode can help in the debugging process. For example, the default behavior of the bash shell is to run a script to completion regardless of the success of any commands within the script. Using
set -e makes the script terminate immediately when a simple command exits with a non-zero exit status (effectively, upon encountering an error).
The set command is a shell built-in that changes shell behavior in bash and sh. In csh and tcsh, the set command is used for setting variables (set FOO=BAR). This is very different from how set works in bash and sh: in these shells' syntax, set changes the behavior of the current shell.
In the following example, bash stops execution after running XYZ (which is an invalid command). The command whoami is not run because the script terminates immediately after the invalid command. Note the non-zero script exit code, retrieved by echo $?.

    elvis@cori> cat script.sh
    #!/bin/bash
    set -e
    hostname

    # invalid command. Bash will terminate immediately
    XYZ
    # This command won't be executed
    whoami

    elvis@cori> bash script.sh
    cori10
    script.sh: line 6: XYZ: command not found
    elvis@cori> echo $?
    127

The shebang is a character sequence (#!) at the beginning of a script that indicates which interpreter should process the script. You can also pass shell options on the shebang line. In the previous example, we specified set -e within the script to modify the behavior of the bash shell. This option can instead be passed on the shebang line as #!/bin/bash -e, which is equivalent to invoking the script with /bin/bash -e <script>.sh. Likewise, to enable strict mode for the csh/tcsh shells, you can use #!/bin/csh -e and #!/bin/tcsh -e.
If we were to source this script, the setting would be applied to the current shell. When set -e is enabled in the current shell, either directly or as a result of sourcing a script, an invalid command (even a typo!) will terminate your shell. Watch out for this behavior if you source any script that enables set -e.

    elvis@cori> source script.sh
    cori10
    XYZ: command not found
    Connection to cori.nersc.gov closed.

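One cautious way to find out whether a script is safe to source is to source it in a separate bash process first, so that set -e or an exit inside it ends only that child process. This is a sketch under that assumption; the function name source_in_child is hypothetical, and any variables the script sets are discarded along with the child process.

```shell
#!/bin/bash
# Sketch: trial-source a script in a child bash process.
# If the script enables `set -e` and a command fails, only the child exits;
# the calling shell (and your login session) stays alive.
# source_in_child is a hypothetical helper name.
source_in_child() {
    # "_" fills $0 of the child; the script path arrives as $1.
    bash -c 'source "$1"' _ "$1"
}
```

If source_in_child script.sh returns a non-zero status, sourcing the same script in your login shell would likely have closed the session.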
Terminating a script as soon as a command returns a non-zero exit status can help you determine what went wrong in your script. You can check the exit code of your last command as follows:

    # bash, sh, zsh
    echo $?

    # csh, tcsh
    echo $status

For complicated commands, set -e may not be sufficient to determine whether there was an error. For example, in bash the exit code of a pipeline (|) is the exit code of the last command in the pipe. Below we show two examples of a failing command within a pipeline. The command grep123 is a typo; we meant grep. In the first example we see a non-zero exit code; in the second example, however, we see an exit code of 0 because wc -l, the last command in the pipe, succeeded:

    elvis@cori> ls -ld | grep123 $user
    grep123: command not found
    elvis@cori> echo $?
    127
    elvis@cori> ls -ld | grep123 $user | wc -l
    grep123: command not found
    0
    elvis@cori> echo $?
    0

If you want bash to report the piped command as a failure, consider also running
set -o pipefail. If we add this setting and rerun the same example, we now see the exit code is 127 instead of 0.

    elvis@cori> set -o pipefail
    elvis@cori> ls -ld | grep123 $user | wc -l
    grep123: command not found
    0
    elvis@cori> echo $?
    127

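With pipefail the pipeline is reported as failed, but the exit code alone does not say which stage failed. In bash, the PIPESTATUS array holds the exit status of every command in the most recent pipeline. The sketch below uses true and false as stand-ins for real commands; the variable name statuses is arbitrary.

```shell
#!/bin/bash
# Sketch: inspect the exit status of each stage of a pipeline in bash.
# `false` stands in for a failing middle command such as the grep123 typo.
true | false | true

# Copy PIPESTATUS immediately: the next command overwrites it.
statuses=("${PIPESTATUS[@]}")

echo "per-stage exit codes: ${statuses[*]}"
# prints: per-stage exit codes: 0 1 0
```

Here the second entry is non-zero, which points at the middle command even though the pipeline as a whole returned 0.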