Brief introduction to Python at NERSC

The Python we provide at NERSC is Anaconda Python. We believe that Anaconda provides a good compromise between productivity and performance. What does this mean for you?

You have 4 options for using Python at NERSC:

  1. Module only
  2. Module + source activate
  3. Conda init + conda activate
  4. Install your own Python

For more details about these 4 options, please see this page. Our data show that about 80 percent of NERSC Python users use custom conda environments (Options 2 and 3); you might find that these are a good solution for you, too.

If you have Python questions or problems, you can always submit a ticket to help.nersc.gov. We also encourage you to take a look at our FAQ and troubleshooting page. If you would like to make any edits or contributions to our docs, please see here.

Python on your laptop vs. Python at NERSC

There are a few key differences between using Python on your laptop/desktop and using it on our large supercomputing systems.

  1. To take advantage of our large systems, you'll want to parallelize your code in some way. Please see our parallel-python page for more information.
  2. To improve performance within Anaconda, you should use conda channels and libraries that can take advantage of our architecture (for example, by using the Intel MKL library). For more information about conda channels at NERSC, please see this page.
  3. You should consider the location of your software stack and data. The best and fastest place for your code and conda environment is /global/common/software; the best and fastest place for your data is $SCRATCH. A short example of setting up an environment there follows this list.
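
For example, a minimal sketch of building a conda environment under /global/common/software (the directory name myproject and the environment name myenv are placeholders; substitute your own):

module load python
conda create --prefix /global/common/software/myproject/myenv python numpy
source activate /global/common/software/myproject/myenv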

How to run Python jobs at NERSC

You have many options for running Python at NERSC:

  1. Our login nodes (only for small-scale testing and debugging). Please see our login node policies here.
  2. Jupyter for interactive notebooks, well suited to visualization and machine learning tasks.
  3. Compute nodes for any substantial computation (either interactively or via a batch job).

Running Python on an interactive compute node

To get an interactive Haswell node

salloc -N 1 -t 30 -C haswell -q interactive

You can load Python either via the python module or by activating your conda environment (see here for more info).
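
For example, either approach works on the allocated node (myenv below is a placeholder conda environment name):

# Use the NERSC-provided python module
module load python

# ...or activate a custom conda environment on top of it
module load python
source activate myenv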

To run a serial Python job

python hello-world.py
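
Here hello-world.py stands for any Python script; a minimal placeholder might contain nothing more than:

print("Hello from Python at NERSC")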

To run an MPI-enabled job, you must launch it with srun

srun -n 10 python hello-world-mpi.py
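
A minimal sketch of what hello-world-mpi.py might contain, assuming mpi4py is installed in the Python environment you are using:

from mpi4py import MPI

# Each of the 10 tasks launched by srun reports its rank and the total size
comm = MPI.COMM_WORLD
print(f"Hello from rank {comm.Get_rank()} of {comm.Get_size()}")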

Running Python in a batch job

To run a serial job in a conda environment via a batch script submit-python.sh

#!/bin/bash
#SBATCH --constraint=haswell
#SBATCH --nodes=1
#SBATCH --time=5

module load python
source activate myenv
python hello-world.py

And then submit by typing sbatch submit-python.sh.

To run an MPI-enabled job on 3 nodes using our Python module, you can create a file called submit-mpi.sh

#!/bin/bash
#SBATCH --constraint=haswell
#SBATCH --nodes=3
#SBATCH --time=5

module load python
srun -n 96 -c 2 python hello-world-mpi.py

And then submit by typing sbatch submit-mpi.sh. Here -n 96 corresponds to 32 MPI tasks on each of the 3 Haswell nodes (32 physical cores per node), and -c 2 gives each task 2 logical CPUs (hyperthreads).

For more information about running jobs at NERSC please see this page.