Jupyter at NERSC: How-To Guides

These how-to guides may require you to edit files at NERSC. To do this, use the editor of your choice from an SSH session, a terminal in ThinLinc, a terminal pane in JupyterLab, or use JupyterLab's built-in text editor.

How to Use a Conda Environment as a Python Kernel

If you already have a Conda environment you want to use as a Jupyter kernel, make sure it includes the IPython kernel package ipykernel. Otherwise, create a new Conda environment with the packages you want plus ipykernel:

nersc$ module load python
nersc$ conda create -n <environment-name> ipykernel <other-packages...>

Activate the environment and use ipykernel install to set up a Jupyter kernelspec. Suppose our environment is called env:

nersc$ conda activate env
nersc$ python -m ipykernel install \
    --user --name env --display-name MyEnvironment
Installed kernelspec env in /global/u1/e/elvis/.local/share/jupyter/kernels/env

The kernelspec is written in JSON to kernel.json in the installation directory:

nersc$ cat $HOME/.local/share/jupyter/kernels/env/kernel.json
{
  "argv": [
    "/global/homes/e/elvis/.conda/envs/env/bin/python",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "MyEnvironment",
  "language": "python",
  "metadata": {
    "debugger": true
  }
}
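To confirm which kernelspecs exist in your per-user kernels directory without opening each file, a short script can read every kernel.json and report its display name. This is a minimal sketch that assumes the $HOME/.local/share/jupyter/kernels layout shown above; the function name list_user_kernelspecs is illustrative and not part of Jupyter.

```python
import json
from pathlib import Path


def list_user_kernelspecs(kernels_dir: Path) -> dict:
    """Map each kernel name (its directory name) to its display name."""
    found = {}
    for kernel_json in sorted(Path(kernels_dir).glob("*/kernel.json")):
        spec = json.loads(kernel_json.read_text())
        name = kernel_json.parent.name
        # Fall back to the directory name if display_name is absent.
        found[name] = spec.get("display_name", name)
    return found


# Per-user kernelspec location used throughout this page.
user_kernels = Path.home() / ".local" / "share" / "jupyter" / "kernels"
if user_kernels.is_dir():
    for name, display_name in list_user_kernelspecs(user_kernels).items():
        print(f"{name}: {display_name}")
```

For the example environment above, the output should include a line such as env: MyEnvironment.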

JupyterLab should pick up the kernelspec definition after it has been written, but if that doesn't seem to be happening, try reloading the JupyterLab tab in your browser. If that fails, click Hub Control Panel from the File menu in JupyterLab to be redirected to the Hub, close your JupyterLab tab, click the button to stop your running server, and then start it again. If a kernel is running and you change its kernelspec, you will need to restart the kernel (but not the notebook server) to pick up the changes.

How to Validate That a Kernelspec is Valid JSON

If you edit a kernelspec for some reason, you may want to ensure that it is valid JSON before you try to use it with JupyterLab. The recommended tool for this is jq:

nersc$ jq . $HOME/.local/share/jupyter/kernels/env/kernel.json

If the kernelspec is valid JSON, jq should render it. Otherwise, an error message will appear that may help you identify the problem. Watch for missing or extraneous commas.
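If jq is not at hand, Python's standard json module can perform a similar check. The sketch below reports where parsing failed and also looks for the keys that every example on this page contains. The function name check_kernelspec and the specific checks are assumptions made for illustration, not an official validator.

```python
import json


def check_kernelspec(text: str) -> list:
    """Return a list of problems found in kernelspec text (empty if none)."""
    try:
        spec = json.loads(text)
    except json.JSONDecodeError as err:
        # lineno/colno point at where parsing failed, often right next to
        # a missing or extraneous comma.
        return [f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"]

    if not isinstance(spec, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    argv = spec.get("argv")
    if not isinstance(argv, list) or not argv:
        problems.append("'argv' is missing or not a non-empty list")
    elif not any("{connection_file}" in str(arg) for arg in argv):
        problems.append("'argv' does not reference '{connection_file}'")
    if "display_name" not in spec:
        problems.append("'display_name' is missing")
    return problems
```

To use it, read the file and pass its contents, for example check_kernelspec(open(path).read()). An empty list means no problems were found by these particular checks.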

How to Set Environment Variables for a Python Kernel

Use the --env option to define environment variables for Python kernels when creating them with the ipykernel install command:

nersc$ python -m ipykernel install \
    --user --name env --display-name MyEnvironment \
    --env HELLO_WORLD 1 \
    --env LD_LIBRARY_PATH $HOME/lib:\${LD_LIBRARY_PATH}

Values may contain $-based templates, which are substituted when the kernel is launched. Escape the $ of any variable you want expanded at kernel launch time rather than by the shell on the ipykernel install command line.

The above command results in the following kernelspec. The value of $HOME is substituted on the command line since it was not escaped, but since ${LD_LIBRARY_PATH} was escaped by \ it is preserved in the kernelspec for substitution at kernel launch time:

nersc$ cat $HOME/.local/share/jupyter/kernels/env/kernel.json
{
  "argv": [
    "/global/homes/e/elvis/.conda/envs/env/bin/python",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "MyEnvironment",
  "language": "python",
  "metadata": {
    "debugger": true
  },
  "env": {
    "HELLO_WORLD": "1",
    "LD_LIBRARY_PATH": "/global/homes/e/elvis/lib:${LD_LIBRARY_PATH}"
  }
}
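To see what launch-time substitution means for the env section above, the following sketch expands ${NAME} references with Python's string.Template. It illustrates the idea under the assumption that the substitution behaves like string.Template; it is not a copy of Jupyter's implementation, and the launch environment shown is hypothetical.

```python
from string import Template


def expand_kernel_env(spec_env: dict, launch_env: dict) -> dict:
    """Expand ${NAME} references in kernelspec env values using launch_env."""
    expanded = {}
    for name, value in spec_env.items():
        # safe_substitute leaves unknown ${NAME} references untouched
        # instead of raising KeyError.
        expanded[name] = Template(value).safe_substitute(launch_env)
    return expanded


spec_env = {
    "HELLO_WORLD": "1",
    "LD_LIBRARY_PATH": "/global/homes/e/elvis/lib:${LD_LIBRARY_PATH}",
}
# Hypothetical environment in effect when the kernel is launched.
launch_env = {"LD_LIBRARY_PATH": "/opt/example/lib"}

print(expand_kernel_env(spec_env, launch_env))
# {'HELLO_WORLD': '1', 'LD_LIBRARY_PATH': '/global/homes/e/elvis/lib:/opt/example/lib'}
```

The $HOME portion is already a literal path in the kernelspec because the shell expanded it at install time; only the escaped ${LD_LIBRARY_PATH} remains to be filled in at launch.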

How to Customize a Kernel with a Helper Shell Script

Using the editor of your choice, create a shell script in the same directory as the kernelspec you want to customize:

nersc$ touch $HOME/.local/share/jupyter/kernels/env/kernel-helper.sh

Edit the contents of this file to load modules, set environment variables, or run set-up commands. Make sure the last line of the script is exec "$@":

#!/bin/bash
export EXAMPLE_VALUE=$CFS/myproject
module load example
module load python
conda activate env
exec "$@"

Make the script executable:

nersc$ chmod u+x $HOME/.local/share/jupyter/kernels/env/kernel-helper.sh

Prepend the path to the kernel-helper script you just defined to argv in the original kernelspec. You may use the resource_dir template variable to represent the directory containing the kernelspec file:

{
  "argv": [
    "{resource_dir}/kernel-helper.sh",
    "python",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "MyEnvironment",
  "language": "python",
  "metadata": {
    "debugger": true
  }
}

In this example, the absolute path to the Python interpreter used to run the kernel is no longer necessary, since the kernel-helper script activates a Conda environment that sets the PATH to the interpreter.
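If you prefer to script the argv change rather than edit kernel.json by hand, the transformation can be sketched in a few lines of Python. This assumes the first element of argv is the interpreter path written by ipykernel install; the function name use_kernel_helper is illustrative only.

```python
import copy


def use_kernel_helper(spec: dict, helper: str = "{resource_dir}/kernel-helper.sh") -> dict:
    """Return a copy of spec whose argv starts with the helper script."""
    new_spec = copy.deepcopy(spec)
    argv = new_spec["argv"]
    if argv and argv[0] == helper:
        return new_spec  # already wrapped; avoid prepending twice
    # Assumption: argv[0] is the absolute interpreter path. The helper
    # activates the environment, so a bare "python" is enough afterwards.
    new_spec["argv"] = [helper, "python"] + argv[1:]
    return new_spec
```

Load the kernelspec with json.load, pass it through use_kernel_helper, and write the result back with json.dump(new_spec, f, indent=2). Keeping a backup copy of the original kernel.json is a sensible precaution.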

Is there a better way?

At NERSC we've documented the kernel-helper pattern for years. Meanwhile, with the release of jupyter_client 7.0, a capability called kernel provisioning has been introduced to address managing the lifecycle of a kernel's runtime environment in a more standard way. We are exploring the use of kernel provisioning, and may develop tooling that users can use to make the management of Jupyter kernel runtime environments easier at NERSC. We do expect that the pattern will continue to work indefinitely, and switching to a kernel-provisioner approach won't become mandatory.

How to Use a Container to Run a Jupyter Kernel

To use a container to run a Jupyter kernel, you will want to start with a basic kernelspec and then prepend argv with the container runtime command (e.g. shifter or podman-hpc) and options.

One way to do this is to use the image to run ipykernel install to set up a starter kernelspec and then inject the container arguments with the editor of your choice. Typically, you can't use the --user flag (which normally puts the kernelspec into your home directory) this way, because ipykernel will try to write the kernelspec into the image, which won't work. Instead, specify the right path with --prefix $HOME/.local.

Another option is to create the kernelspec by hand or copy one of the below examples and adjust the image name, path to the Python interpreter, and the display name. Just be sure to put the kernelspec file at $HOME/.local/share/jupyter/kernels/<name>/kernel.json.

Shifter

For example, to use a Shifter image to run ipykernel install and set up a starter kernelspec:

nersc$ shifter --image=myimage:v1.2.3 \
    </path/to/your/image/python> -m ipykernel install \
    --prefix $HOME/.local --name env --display-name MyEnvironment
[InstallIPythonKernelSpecApp] WARNING | Installing to ...
Installed kernelspec env in /global/u1/e/elvis/.local/share/jupyter/kernels/env

You can ignore the warning about where the kernel is being installed here. Finally, prepend the shifter command and any arguments to the argv section of the generated kernelspec:

{
  "argv": [
    "shifter",
    "--image=myimage:v1.2.3",
    "</path/to/your/image/python>",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "MyEnvironment",
  "language": "python",
  "metadata": {
    "debugger": true
  }
}
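The same argv edit can be expressed as a small Python function, which may help if you maintain several container-based kernels. This is a sketch: containerize_argv is a hypothetical helper, and it assumes the first element of the starter argv is an interpreter path that should be replaced by the path inside the image.

```python
def containerize_argv(argv: list, runtime_prefix: list, image_python: str) -> list:
    """Prefix a kernel argv with a container runtime command and options."""
    # Assumption: argv[0] is the interpreter recorded by ipykernel install.
    # It is replaced by the interpreter path as seen inside the image.
    return list(runtime_prefix) + [image_python] + list(argv[1:])


# Hypothetical starter argv; the image path stays a placeholder to fill in.
starter = ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"]
shifter_argv = containerize_argv(
    starter,
    ["shifter", "--image=myimage:v1.2.3"],
    "</path/to/your/image/python>",
)
print(shifter_argv)
```

The printed list matches the argv section of the Shifter kernelspec above. Other container runtimes can be handled by passing a different runtime_prefix.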

podman-hpc

podman-hpc can also be used in a kernelspec. A generic example of what a podman-hpc kernelspec should look like is:

{
  "argv": [
    "podman-hpc",
    "run",
    "--rm",
    "--jupyter",
    "localhost/myimage:v1.2.3",
    "</path/to/your/image/python>",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "MyEnvironment",
  "language": "python3",
  "metadata": {
    "debugger": true
  }
}

Here, the --jupyter flag mounts the /tmp and $HOME directories to allow the kernel to connect and to let new notebooks be created in $HOME, respectively. podman-hpc is under active development, so please file a ticket with us if you run into any issues using a podman-hpc kernel.

How to Use Matplotlib in Your Notebooks

Getting the Jupyter Matplotlib integration to work in a Jupyter kernel at NERSC currently requires that users install the same version of ipympl in their kernel as is installed in JupyterLab by NERSC. This is a known issue and the ipympl developers are working on a solution.

For now, users need to know what version of ipympl they need to install into their kernels. Starting from the beginning of the 2022 allocation year, the version of ipympl installed into JupyterLab matches the version installed in the default Python module on any system where you can run JupyterLab. To find out what version of ipympl you need installed in your kernel, load the python module and use conda list:

nersc$ module load python
nersc$ conda list ipympl
# packages in environment at ...:
#  
# Name                    Version                   Build  Channel
ipympl                    a.b.c              pyhd8ed1ab_0  conda-forge

Install the matching version from the same channel into your environment:

nersc$ conda activate env
nersc$ conda install -c conda-forge ipympl=a.b.c

You will also need to ensure that the matplotlib version in the kernel is compatible with the ipympl version. See the compatibility table in the ipympl documentation. Using a compatible pair can potentially resolve the "Error displaying widget" error message. For example, if your ipympl version requires matplotlib to be newer than a.b.c but no newer than d.e.f, you can enforce this restriction in your Conda installation with:

nersc$ conda install -n env -c conda-forge "matplotlib >a.b.c,<=d.e.f"
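To check which versions a running kernel actually sees, you can query the installed package metadata from a notebook cell. This sketch uses only the standard library's importlib.metadata; the helper name installed_version is illustrative.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages whose versions need to be compatible.
for package in ("ipympl", "matplotlib"):
    print(f"{package}: {installed_version(package) or 'not installed'}")
```

Compare the reported ipympl version with the one shown by conda list in the default python module, and the matplotlib version against the ipympl compatibility table.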

How to Fix "Spawn Failed: Insufficient Storage"

If your home directory is over quota, JupyterLab will not be able to update utility files stored there, and will not work properly. Before JupyterHub launches JupyterLab, it detects whether your home directory is over quota, and if it is, it halts the launch and presents you with an error message about insufficient storage.

Since you cannot use JupyterLab to fix this, you need to connect via SSH or ThinLinc to manage files in your home directory. Once connected, use showquota to see how far over quota your home directory is. You could be over on space, inodes, or both. To fix this, you need to remove or migrate some files from your home directory. For example:

laptop$ ssh -i ~/.ssh/nersc elvis@perlmutter.nersc.gov
nersc$ showquota
+-------------+------------+-------------+----------------+-   -+
| File system | Space used | Space quota | Space used (%) | ... |
+-------------+------------+-------------+----------------+-   -+
|        home |   44.44GiB |    40.00GiB |         111.1% | ... |
|    pscratch |    7.26TiB |    20.00TiB |          36.3% | ... |
+-------------+------------+-------------+----------------+-   -+
nersc$ du -sh *
...
15G bigdata.dat
...
nersc$ mv bigdata.dat $SCRATCH/.
nersc$ showquota
+-------------+------------+-------------+----------------+-   -+
| File system | Space used | Space quota | Space used (%) | ... |
+-------------+------------+-------------+----------------+-   -+
|        home |   29.44GiB |    40.00GiB |          73.6% | ... |
|    pscratch |    7.28TiB |    20.00TiB |          36.4% | ... |
+-------------+------------+-------------+----------------+-   -+

Consider these tips for handling large Conda environments and package caches. Also, consider moving data to the Community File System or archiving it to HPSS. Once you are back under quota, try starting your notebook again from the hub.
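Because you can be over quota on space, inodes, or both, it can help to see both the size and the number of files under each top-level entry. The sketch below is a rough Python counterpart to du -sh * that also counts files. It sums apparent file sizes rather than allocated disk blocks and does not count directories themselves, so treat its numbers as estimates; usage_by_entry is a hypothetical helper name.

```python
import os
from pathlib import Path


def usage_by_entry(directory: Path) -> list:
    """Return (entry name, total bytes, file count) tuples, largest first."""
    results = []
    for entry in Path(directory).iterdir():
        if entry.is_symlink():
            continue  # do not follow links out of the directory
        if entry.is_file():
            results.append((entry.name, entry.stat().st_size, 1))
        elif entry.is_dir():
            total, count = 0, 0
            for root, _dirs, files in os.walk(entry):
                for name in files:
                    path = Path(root) / name
                    if not path.is_symlink():
                        total += path.stat().st_size
                        count += 1
            results.append((entry.name, total, count))
    # Sort by total size so the biggest candidates for cleanup come first.
    return sorted(results, key=lambda item: item[1], reverse=True)
```

For example, usage_by_entry(Path.home())[:10] lists the ten largest top-level entries in your home directory. Entries with very high file counts are the ones to look at if you are over on inodes.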

How to Fix "Unexpected Error While Saving File"

If you are working on a notebook in your home directory and your home directory goes over quota, or you are working on a notebook in a CFS directory and that directory/project goes over quota, JupyterLab will be unable to save your notebook. When this happens you will see a message like Unexpected error while saving file: <path-to-notebook> disk I/O error.

To fix this, open a JupyterLab terminal pane, or connect via SSH or ThinLinc, and move/delete files on the filesystem in question to create space. Use showquota to see how far over quota you are. You could be over on space, inodes, or both. If your notebook is on CFS, use showquota <project-name> to assess space used by the project. In that case you may need to coordinate space usage with others on your project or talk to your project PI about how space is allocated. See How to Fix "Spawn Failed: Insufficient Storage" above for an example of how to use showquota to get your home directory back under quota.

How to Use Jupyter at NERSC for a Tutorial

Collaborations, projects, experiments, or user facilities that use NERSC often have data, software, or notebook-based workflows deployed at NERSC for their users. Groups like these may get together for internal training, hackathons, or workshops where their members with NERSC access leverage the NERSC Jupyter service in some way. If you and your colleagues are interested in making use of Jupyter at NERSC for this kind of event, here is what you need to do to prepare.

We recommend that you inform NERSC at least one week before your event that you are planning to conduct a tutorial. With more lead time, we can usually help address potential pitfalls, test infrastructure ahead of time, and consult with you about best practices to help you make your event a success.

If your event plans do not include using compute nodes for running Jupyter, you can open a ticket and let us know how many participants you expect, when the tutorial will be held, and ask any questions you may have.
If your event plans do include using compute nodes for running Jupyter, you should request a reservation. When the reservation is active, your users will be able to select the reservation from the Configurable Job menu. Visit the documentation on reservations to learn more. Make sure to include in the notes that you are planning to have your users use Jupyter, so we can attempt to verify the reservation settings will work ahead of time.

Finally, consult the NERSC Outage Calendar and NERSC Center Status page to avoid scheduling an event on the same day as conflicting system maintenance. Check the outage calendar and Center Status page often in the days leading up to your event, as system maintenance may sometimes be scheduled on short notice. You can also subscribe to the nersc-status email list to be informed of all NERSC status updates as they are made.