Python

Python is an interpreted, general-purpose, high-level programming language. You can use Anaconda Python on Cori through software environment modules. Do not use the system-provided Python at /usr/bin/python.

Anaconda Python

Anaconda Python is a platform for large-scale data processing, predictive analytics, and scientific computing. It includes hundreds of open source packages and Intel MKL optimizations throughout the scientific Python stack. Anaconda provides the conda command-line tool for managing packages, but also works well with pip. The Anaconda distribution also provides access to the Intel Distribution for Python.

Python 3 is the default Python module. To load it, type:

module load python

The default is python/3.7-anaconda-2019.10 so only module load python is necessary to use it.
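After loading the module, a quick way to confirm that you are not accidentally running the system Python is to ask the interpreter where it lives. The following is a minimal sketch; the function name describe_interpreter is hypothetical, and the check simply treats anything under /usr/bin/ as the system interpreter.

```python
import sys


def describe_interpreter(executable=None, version_info=None):
    """Summarize which Python is running and flag the system interpreter.

    Both arguments default to the running interpreter; they are parameters
    only so the check can be exercised with example values.
    """
    executable = executable or sys.executable or ""
    version_info = version_info or sys.version_info
    return {
        "executable": executable,
        "version": "{}.{}.{}".format(*version_info[:3]),
        # Assumption: anything under /usr/bin/ is the system Python
        # that this page says to avoid.
        "is_system_python": executable.startswith("/usr/bin/"),
    }


info = describe_interpreter()
print("Running", info["executable"], "version", info["version"])
if info["is_system_python"]:
    print("This looks like the system Python; try 'module load python' first.")
```

Run it with the same python command you intend to use for your work, so that the report reflects that interpreter.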

When you load a Python module you are placed into its default Anaconda "root" environment. This may be sufficient for your needs. NERSC can install Anaconda packages into the root environment upon request subject to review of broad utility and maintainability. Contact us to find out about specific packages. In general we recommend users manage their own Python installations with "conda environments."

Creating conda environments

The conda tool lets you build your own custom Python installation through "environments." Conda environments replace and surpass virtualenv virtual environments in many ways. To create and start using a conda environment you can use conda create. Specify a name for the environment and at least one Python package to install. In particular you should specify which Python interpreter you want installed. Otherwise conda may make a decision that surprises you.

module load python
conda create -n myenv python=3 numpy

Before it installs anything conda create will show you what package management steps it will take and where the installation will go. You will be asked for confirmation before installation proceeds.

The Life You Save May Be Your Own

Make it a habit to actually review conda tool reports and not just blithely punch the "y" key to approve create/install/update actions. Verify the installation is going where you think it should. Make sure any proposed package downgrades are acceptable.

Activating conda environments

conda activate is now an option

After the shell resource file change in the February 2020 maintenance, conda activate is now possible on NERSC systems. For more information about NERSC shell resource files (also known as dotfiles), see the NERSC dotfiles documentation.

Once you have created a conda environment, you have two options for activating it: source activate and conda activate.

Using source activate

source activate is the older, less invasive way to activate a conda environment. It will not make any changes to your shell resource files/dotfiles. source activate is a good option for any user who doesn't want a specific version of Python loaded automatically when they log on to Cori.

You will first need to load a Python module (i.e. our base Anaconda Python environment) via:

module load python

and then you can source your custom environment

source activate myenv

The name of your environment should now be displayed in your prompt.

To leave your environment

source deactivate

and you will return to the base Python environment.
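Besides the prompt, you can check from inside Python which environment is active. The sketch below assumes that conda's activation scripts export the CONDA_DEFAULT_ENV and CONDA_PREFIX environment variables, which is the usual behavior; the function name active_conda_env is hypothetical.

```python
import os


def active_conda_env(environ=None):
    """Return (environment name, environment prefix), or None for unset values.

    Assumes conda's activation scripts export CONDA_DEFAULT_ENV and
    CONDA_PREFIX, which is their usual behavior.
    """
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV"), environ.get("CONDA_PREFIX")


name, prefix = active_conda_env()
if name is None:
    print("No conda environment appears to be active.")
else:
    print("Active environment:", name, "at", prefix)
```

After source activate myenv, this should report myenv; after source deactivate, it should report the base environment or nothing, depending on the conda version.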

Bad News for csh/tcsh Users

If you use csh or tcsh you will not be able to use the source activate syntax. This is a shortcoming of the conda tool. There are workarounds available on the web that work to varying degrees. (We often find that users are able to switch to /bin/bash without much difficulty, which is one solution.)

If you are a csh user and you do not need to install or manage packages once a conda environment has been provisioned, you can simply prepend the environment's bin directory, which contains its Python interpreter, to PATH.

Using conda activate

conda activate is the newer, more complex way of activating a conda environment. This is the method recommended by the conda developers but at NERSC we support both source activate and conda activate depending on your preferences. conda activate is appropriate for users who don't mind a semi-permanent change to their default Cori shell that will always load a specific version of Python by default. However, this means that the user is responsible for manually reconfiguring their conda setup if they would like to use a newer version of Python. See below for more information.

conda activate first requires that you run a setup command called conda init. This command only needs to be run one time. This command will take the currently loaded Python environment and set it as the default, so double check that the Python module you have loaded is suitable for your needs.

module load python
conda init

Running conda init will add several lines to your .bashrc file. For example:

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/etc/profile.d/conda.sh" ]; then
        . "/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/etc/profile.d/conda.sh"
    else
        export PATH="/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

This means that whenever you log on to Cori, the Python you had loaded when you ran conda init will always be loaded by default (and you do not need to type module load python). You can then load your custom environment right away:

conda activate myenv

The name of your environment should now be displayed in your prompt.

To leave your environment

conda deactivate

and you will return to the base Python environment specified in your .bashrc file.
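Because conda init pins a specific installation in your .bashrc, it can be useful to check which one is pinned before deciding whether to update. The sketch below scans text for the conda initialize block and extracts any quoted path ending in /bin/conda. The function name conda_init_paths and the sample path are hypothetical, and the marker lines are taken from the example block on this page.

```python
import re

# Marker lines copied from the example conda init block on this page.
START = "# >>> conda initialize >>>"
END = "# <<< conda initialize <<<"


def conda_init_paths(bashrc_text):
    """Return conda executables referenced inside the conda init block."""
    start = bashrc_text.find(START)
    if start == -1:
        return []
    end = bashrc_text.find(END, start + len(START))
    if end == -1:
        return []
    block = bashrc_text[start:end]
    # Single-quoted paths ending in /bin/conda, as in the __conda_setup line.
    return re.findall(r"'([^']*/bin/conda)'", block)


# Hypothetical sample; in practice you would read your own ~/.bashrc.
sample = "\n".join([
    "export EDITOR=vim",
    START,
    "__conda_setup=\"$('/example/python/bin/conda' 'shell.bash' 'hook')\"",
    END,
])
print(conda_init_paths(sample))
```

An empty list means no conda init block was found, so conda activate would not be configured for that shell.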

Should you decide that you would like to update this setup (or would like to remove this functionality completely), the safest way is to run the command

/usr/common/software/bin/fixdots

which will reset your shell resource files (also known as dotfiles) to the NERSC defaults.

After this, if you would like to upgrade your setup, simply load the newest Python module via

module load python

and rerun

conda init

For more information about NERSC shell resource files, see the NERSC dotfiles documentation.

Installing Packages

You can find packages and install them into your own environments easily.

conda search scipy
[list of available versions of scipy]
conda install scipy

If conda search fails to identify your desired package it may still be installed via pip. Both conda and pip can be used in conda environments.

Use conda to install pip into your environment

To use pip in your own environment you may need to conda install pip. Verify whether you need to by typing "which pip" at the command line. If the path returned looks like /usr/common/software/python/.../bin/pip then do conda install pip.
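The same check can be done from Python, which avoids having to eyeball paths. This is a rough sketch: it looks up pip on PATH and tests whether it sits under the prefix of the running interpreter. The function name pip_is_inside and the example paths in the comments are hypothetical.

```python
import os
import shutil
import sys


def pip_is_inside(pip_path, env_prefix):
    """Return True if pip_path is located under env_prefix."""
    if pip_path is None:
        return False
    pip_abs = os.path.abspath(pip_path)
    env_abs = os.path.abspath(env_prefix)
    return os.path.commonpath([pip_abs, env_abs]) == env_abs


# shutil.which mirrors the shell's "which pip"; sys.prefix is the root of
# the environment this interpreter belongs to.
found = shutil.which("pip")
print("pip found at:", found)
if pip_is_inside(found, sys.prefix):
    print("pip belongs to this environment.")
else:
    print("pip is missing or comes from elsewhere; consider 'conda install pip'.")
```

Run it with the Python from your activated environment so that sys.prefix refers to that environment.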

If you consider pip a last resort you may want to search non-default channels for builds of the package you want. The syntax for that is a little different:

anaconda search -t conda <package-name>
[list of channels providing the package]
conda install -c <channel-name> <package-name>

Finally you can install packages "from source" and in some cases this is recommended. In particular any package that depends on the Cray programming environment should be installed this way. For Python this usually boils down to mpi4py and h5py with MPI-IO support.

Tips for using conda

Conda environments are disposable. If something goes wrong, it is often faster and easier to delete it and build a new environment.

Update and install only what you need. If you need to update numpy, don't try to update your entire environment; just run conda update numpy. This will update numpy and all of its dependencies. Avoid conda update --all as this will force conda to try to update all your packages. In a large environment this may be difficult or impossible. If you find yourself in a package dependency nightmare it is probably easiest to just build a new environment.

Problems with quota? Try conda clean. If you are a conda enthusiast, it can be easy to hit your $HOME quota limit. An easy solution is conda clean --all, which removes unused packages and related cached files. For more information, see the conda clean documentation.
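To see how much space conda is using before and after a clean, you can total the file sizes under its directory. The sketch below assumes your environments and package cache live under ~/.conda, which is conda's usual fallback when the base installation is not writable; the function name directory_size_bytes is hypothetical. Files hard-linked between the package cache and environments are counted more than once, so treat the result as a rough upper bound.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; ignore it.
                pass
    return total


if __name__ == "__main__":
    # Assumption: user-level conda data lives in ~/.conda.
    conda_dir = os.path.expanduser("~/.conda")
    size_gib = directory_size_bytes(conda_dir) / 1024 ** 3
    print("{} uses about {:.2f} GiB".format(conda_dir, size_gib))
```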

Your conda environment can easily become a Jupyter kernel. If you would like to use your custom environment myenv in Jupyter:

source activate myenv
conda install ipykernel
python -m ipykernel install --user --name myenv --display-name MyEnv

Then when you log into jupyter.nersc.gov you should see MyEnv listed as a kernel option.

For more information about using your kernel at NERSC please see our Jupyter docs.
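If MyEnv does not show up, or starts the wrong Python, inspecting the installed kernel spec can help. The sketch below assumes that the --user install wrote a kernel.json file under ~/.local/share/jupyter/kernels/myenv, which is the usual per-user location on Linux; the function name read_kernel_spec is hypothetical.

```python
import json
from pathlib import Path


def read_kernel_spec(kernel_dir):
    """Return (display name, interpreter path) from a kernel.json file."""
    with (Path(kernel_dir) / "kernel.json").open() as handle:
        spec = json.load(handle)
    # The first element of "argv" is the interpreter the kernel launches.
    argv = spec.get("argv") or [None]
    return spec.get("display_name"), argv[0]


if __name__ == "__main__":
    # Assumption: default per-user kernel location on Linux.
    kernel_dir = Path.home() / ".local" / "share" / "jupyter" / "kernels" / "myenv"
    if (kernel_dir / "kernel.json").is_file():
        print(read_kernel_spec(kernel_dir))
    else:
        print("No kernel spec found in", kernel_dir)
```

The interpreter path reported should point into your myenv environment rather than the base Anaconda installation.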

Running Scripts

Run serial Python scripts on a login node, or on a compute node in an interactive session (started via salloc) or batch job (submitted via sbatch) as you normally would in any Unix-like environment. On login nodes, please be mindful of resource consumption since those nodes are shared by many users at the same time.

Parallel Python scripts launched in an interactive (salloc) session or batch job (sbatch), such as those using MPI via the mpi4py module, must use srun to launch:

srun -n 64 python ./hello-world.py

See the NERSC mpi4py documentation for more information about using mpi4py.

See the NERSC h5py documentation for more information about using h5py with MPI-IO.
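For reference, a script like the hello-world.py used in the srun example might look like the sketch below. It is not the actual NERSC script. It uses the standard mpi4py calls MPI.COMM_WORLD, Get_rank and Get_size, and falls back to a single rank when mpi4py is not installed, so that it also runs as a serial script. The helper names get_rank_and_size and greeting are hypothetical.

```python
def get_rank_and_size():
    """Return (rank, size) from MPI, or (0, 1) if mpi4py is unavailable."""
    try:
        from mpi4py import MPI
    except ImportError:
        # Serial fallback so the script still runs without mpi4py.
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


def greeting(rank, size):
    """Format the message printed by each rank."""
    return "Hello from rank {} of {}".format(rank, size)


if __name__ == "__main__":
    rank, size = get_rank_and_size()
    print(greeting(rank, size))
```

Launched with srun -n 64, each of the 64 tasks would print one line with its own rank, in no particular order.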

End-of-Life for Python 2

If you are still using Python 2 at NERSC, you may have noticed our warning:

ATTENTION: Python 2 reached end-of-life Jan 1, 2020.
We urge you to transition to Python 3.

Why are you seeing this message?

Python 2 reached end-of-life on January 1, 2020. After its final release, Python 2 ceases to exist as an active project: no development, no bug fixes, no patches. This matters because Python 3 is not backward-compatible with Python 2, so users must actively port their code.

Developers of many packages including NumPy, SciPy, Matplotlib, pandas, and scikit-learn pledged to drop support for Python 2 "no later than 2020." You can expect support for all Python 2 libraries to continue to wither away. Using Python 2 past end of life is a risk as new issues will likely go unaddressed by developers. You may already have noticed deprecation warnings from your Python applications' outputs; do not ignore these warnings!
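To make the incompatibility concrete, the sketch below collects a few Python 3 behaviors that differ from Python 2 and commonly surface when porting scientific code. In Python 2, 7 / 2 evaluates to 3, str is a byte string, and dict.keys() returns a list. The function name python3_behaviors is hypothetical, and the list is illustrative rather than exhaustive.

```python
def python3_behaviors():
    """Collect a few results that differ between Python 2 and Python 3."""
    text = "caf\u00e9"            # str is Unicode text in Python 3
    data = text.encode("utf-8")   # bytes is a separate type
    return {
        "true_division": 7 / 2,       # 3.5 here; Python 2 gives 3
        "floor_division": 7 // 2,     # 3 in both versions
        "text_length": len(text),     # counts characters
        "byte_length": len(data),     # counts encoded bytes
        # keys() returns a view in Python 3, a list in Python 2.
        "keys_is_list": isinstance({"a": 1}.keys(), list),
    }


# print is a function in Python 3; it was a statement in Python 2.
for key, value in python3_behaviors().items():
    print(key, value, sep=" = ")
```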

At the Python 3 Statement website, there are a few links under the “Why?” section that may be helpful to you in preparing your migration plan and easing the transition from Python 2 to Python 3. These seem especially helpful:

https://docs.python.org/3/howto/pyporting.html

https://python-3-for-scientists.readthedocs.io/en/latest/

What does this mean for you?

NOW is the time to transition your Python 2 code to Python 3. We will continue to provide Python 2 on Cori barring any serious security issues or other problems, but these may arise quickly and without warning.

Furthermore, Python 2 will not be available on Perlmutter. If you plan to run Python on Perlmutter, you'll need to transition your code to Python 3.

If you have any questions, please let us know via a ticket at help.nersc.gov.