How to use Python on NERSC systems

There are four options for using and configuring your Python environment at NERSC. We give a brief overview here and explain each option in greater detail below.

  1. Module only
  2. Module + source activate
  3. Conda init + conda activate
  4. Install your own Python

Our data show that about 80 percent of NERSC Python users use custom conda environments (Options 2 and 3). You might find that these are a good solution for you, too.

Option 1: Module only

In this mode, you simply run module load python and use it however you like. This is the simplest option but also the least flexible. If you require a package that is not in our default modules, this option will not work for you.
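One way to judge whether Option 1 is enough is to check, after loading the module, which of the packages you need can actually be imported. The following is a minimal sketch using only the standard library; the package names in `required` are hypothetical placeholders, not a list of what the NERSC module provides.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list of modules a project needs; replace with your own.
required = ["numpy", "scipy", "astropy"]

print("Interpreter:", sys.executable)
print("Python version:", sys.version.split()[0])
print("Not available in this environment:", missing_packages(required))
```

If the last line prints an empty list, the loaded module already covers those packages; otherwise one of the conda-based options below is probably a better fit.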

Who should use Option 1?

Option 1 is best for users who want to get started quickly and who do not require special libraries or custom packages.

Option 2: Module + source activate

In this mode, you first module load python and then build and use a conda environment on top of our module. To use this method:

module load python
source activate myenv

To leave your environment:

conda deactivate

and you will return to the base Python environment.

Who should use Option 2?

Option 2 is a good choice for any user who doesn't want a specific version of Python loaded automatically when they log on to Cori. It is also good for users who prefer to use the most recent Python module.

Option 3: Conda init + conda activate

In this mode, you will configure your environment one time via:

module load python
conda init

This will add the following to your .bashrc file:

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/etc/profile.d/conda.sh" ]; then
        . "/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/etc/profile.d/conda.sh"
    else
        export PATH="/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

After you have configured your environment, the only command you need to run when you log on to Cori is:

conda activate myenv

To leave your environment:

conda deactivate

and you will return to the base Python environment.
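To confirm which environment is active after activating or deactivating, you can inspect the variables that conda sets in the shell. The sketch below assumes conda exports `CONDA_DEFAULT_ENV` and `CONDA_PREFIX` on activation (its usual behavior); the helper name `describe_environment` is made up for this example.

```python
import os
import sys


def describe_environment(environ=None):
    """Summarize the active conda environment as seen from this interpreter."""
    environ = os.environ if environ is None else environ
    prefix = environ.get("CONDA_PREFIX")
    return {
        # Name of the active environment, e.g. "myenv" or "base"; None if unset.
        "conda_env_name": environ.get("CONDA_DEFAULT_ENV"),
        # Directory of the active environment; None if unset.
        "conda_prefix": prefix,
        # Installation prefix of the interpreter that is running this script.
        "interpreter_prefix": sys.prefix,
        # True only when the running interpreter lives in the active environment.
        "interpreter_matches_env": prefix is not None
        and os.path.realpath(prefix) == os.path.realpath(sys.prefix),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If `interpreter_matches_env` is False while an environment is active, the `python` on your path is probably not the one from that environment.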

What should you do if you decide you don't like Option 3? You can simply delete the lines that conda init added to your .bashrc file and choose another Python option.

Who should use Option 3?

Option 3 is suitable for any user who would like a particular Python environment loaded by default whenever they access Cori. However, the user must be willing to manually monitor and update their configuration. Users who choose Option 3 should not combine their conda-init configured Python environment with our NERSC Python modules.

Option 4: Install your own Python

You don't have to use any of the Python options described above. You are free to install your own Python via Miniconda, Anaconda, Intel Python, or a custom collaboration install to have complete control over your stack. Furthermore, you are free to build this installation with or without containers.

Option 4a: Install your own Python without containers

Individuals may prefer to install and maintain their own Python stack. Collaborations, projects, or experiments may wish to install a shareable, managed Python stack to /global/common/software, independent of the NERSC modules. You are welcome to use the Anaconda installer script for this purpose. You may also want to consider the more "stripped-down" Miniconda installer as a starting point, which lets you start with only the bare essentials and build up. Be sure to select the Linux version in either case. For instance:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b \
    -p /global/common/software/myproject/env
[installation messages]
source /global/common/software/myproject/env/bin/activate
conda install <only-what-my-project-needs>

You can customize the installation path with the -p argument; without it, the installation above would go to $HOME/miniconda3. You should also consider the PYTHONSTARTUP environment variable, which you may wish to unset altogether. It is mainly relevant to the system Python, which we advise against using.

Who should use Option 4a?

Option 4a is suitable for individuals or collaborations who would like to install, maintain, and control their own Python stack. Users who choose Option 4a should not combine their custom Python installations with our NERSC Python modules.

Option 4b: Install your own Python in a container

Users may prefer to build their own software stack inside a container for improved portability, performance, and configurability. As in Option 4a, users may choose to install Miniconda, Anaconda, or Intel Python, or start with a pre-built container from NVIDIA, for example.

To get started using Docker containers, see here.

To use Docker containers at NERSC via Shifter, see here.

Who should use Option 4b?

Option 4b is suitable for users willing to build their own software stack inside of a container. Anyone who plans to run mpi4py jobs at scale is strongly encouraged to use Option 4. Please see here for more information.

Creating conda environments

Creating custom conda environments is usually quick and easy. If you require a package that is not available in our default module, this is the option you must use.

If you are using Option 2 (source activate):

module load python
conda create --name myenv python=3.8
source activate myenv
conda install numpy scipy astropy

If you are using Option 3 (conda activate):

conda create --name myenv python=3.8
conda activate myenv
conda install numpy scipy astropy

Installing libraries via conda channels

Conda has several default channels that are searched first for package installation. You can use a channel beyond the defaults, but we suggest that you select your channels carefully.

Here is an example that demonstrates why your channels matter. If you run

conda install numpy

conda will search the default channels first. This is good because it means that MKL-enabled NumPy will be installed, which generally performs well on Cori's Intel hardware.

If, however, you have added other channels to your search path, for example conda-forge, the packages that conda installs from them may not be optimal for NERSC. In this example, you will likely get a version of NumPy that uses OpenBLAS instead of MKL, and this can be substantially slower on Cori.
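To see which BLAS library an installed NumPy was built against, you can capture the output of `numpy.show_config()`. This is only a sketch: the layout of that output differs between NumPy versions, so the helper below (a name invented for this example) returns the raw text rather than trying to classify it.

```python
import contextlib
import io


def numpy_build_config():
    """Return the text printed by numpy.show_config(), or None if NumPy is not installed."""
    try:
        import numpy
    except ImportError:
        return None
    buffer = io.StringIO()
    # show_config() prints to stdout, so redirect stdout to capture the text.
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


config = numpy_build_config()
print(config if config is not None else "NumPy is not installed in this environment")
```

In the BLAS and LAPACK sections of the output, look for library names containing mkl or openblas. Older NumPy versions also list backends that are not available, so read the section contents rather than searching for the bare word.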

Don't permanently add other channels to your conda config, i.e.

conda config --add channels conda-forge

Do this instead:

conda install numpy --channel conda-forge

It's better to append the channel you need with --channel conda-forge. This uses conda-forge only when you ask for it and not all the time.
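To audit which channel each package in an environment came from, one option is to parse the JSON form of the package list. The sketch below assumes that `conda list --json` emits a list of records with `name` and `channel` fields; the sample data is made up and abbreviated, and `packages_by_channel` is a hypothetical helper.

```python
import json
from collections import defaultdict


def packages_by_channel(conda_list_json):
    """Group package names by the channel recorded for each package."""
    grouped = defaultdict(list)
    for record in json.loads(conda_list_json):
        # Fall back to "unknown" if a record carries no channel field.
        grouped[record.get("channel", "unknown")].append(record["name"])
    return dict(grouped)


# Made-up sample in the shape `conda list --json` is assumed to produce.
sample = """[
  {"name": "numpy", "version": "1.19.2", "channel": "pkgs/main"},
  {"name": "astropy", "version": "4.2", "channel": "conda-forge"}
]"""

for channel, names in packages_by_channel(sample).items():
    print(f"{channel}: {', '.join(names)}")
```

In practice you could save the real output with `conda list --json > packages.json` and pass the file contents to the function instead of the sample string.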

Installing libraries via pip

Pip is available under Anaconda Python. If you create a conda environment but you are unable to find a conda build of whatever package (or version of that package) you want to install, then pip is one viable alternative. However, pip users at NERSC should be aware of the following:

  • Users of the pip command may want to use the "--user" flag for per-user site-package installation following the PEP 370 standard. On Linux systems the install location defaults to $HOME/.local, and packages can be installed to this path with "pip install --user package_name". The location can be overridden by defining the PYTHONUSERBASE environment variable.

  • To prevent per-user site-package installations from conflicting across machines and module versions, at NERSC we have configured our Python modules so that PYTHONUSERBASE is set to $HOME/.local/$NERSC_HOST/version where "version" corresponds to the version of the Python module loaded.
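To check where "pip install --user" will place packages for the interpreter you are running, you can ask the standard library `site` module, which implements the PEP 370 behavior. This is a minimal sketch; the paths it prints depend on the loaded module and on `PYTHONUSERBASE`.

```python
import os
import site

# Value set by the environment (for example by a module file), if any.
print("PYTHONUSERBASE:", os.environ.get("PYTHONUSERBASE", "(not set)"))

# Base directory used for per-user installs.
user_base = site.getuserbase()
# Directory where `pip install --user` places importable packages.
user_site = site.getusersitepackages()

print("User base:", user_base)
print("User site-packages:", user_site)
# True if the user site directory is added to sys.path; False or None if disabled.
print("User site enabled:", site.ENABLE_USER_SITE)
```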

Mixing pip and conda: an example

We have observed that users often don't realize that the per-user site-package directories are included in the search path from all their conda environments created with the same module. What does this mean? We'll demonstrate with an example. If you have done the following:

module load python
pip install numpy --user

then any conda environment you have created based on this Python module will have this pip-installed NumPy in its search path.

It can be easy to forget you've done "pip install --user" and then create a new conda environment and be confused by how it works (or doesn't).
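To find out whether a package in your current environment is being picked up from the per-user site-packages directory rather than from the environment itself, you can compare its location against the user site path. The helper name `loaded_from_user_site` and the package names below are illustrative assumptions.

```python
import importlib.util
import os
import site


def loaded_from_user_site(module_name, user_site=None):
    """Return True if module_name would be imported from the per-user site-packages."""
    user_site = os.path.realpath(user_site or site.getusersitepackages())
    spec = importlib.util.find_spec(module_name)
    # Missing modules, namespace packages, and built-in or frozen modules have no file path.
    if spec is None or not spec.origin or spec.origin in ("built-in", "frozen"):
        return False
    origin = os.path.realpath(spec.origin)
    return origin.startswith(user_site + os.sep)


# Hypothetical packages to check; replace with the ones you care about.
for name in ["numpy", "scipy"]:
    print(f"{name}: from user site = {loaded_from_user_site(name)}")
```

A result of True inside a conda environment suggests that an earlier "pip install --user" is shadowing or supplementing what the environment provides.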

If you're using a conda environment anyway, think about whether you really want a pip-installed package to be accessible to multiple conda environments. If you don't, just drop the "--user" part and install it into your conda environment:

module load python
source activate myenv
pip install numpy