Spin Reference Guide (Rancher 1)


  • An Application Stack contains one or more services, each performing a distinct function.

  • An instance of a service is called a Container.

  • An Image is a lightweight, stand-alone, read-only template that includes everything needed to run a piece of software.

  • A Container is a runnable instance of an image.

Accessing Spin

The Rancher command line interface (CLI) is used to manage services and stacks in Spin. The Rancher CLI is loaded from the Spin module on all major NERSC systems and activated with the command rancher.

The Rancher CLI must be used from a NERSC system and cannot be used from your laptop, as we maintain a modified version of the Rancher CLI which is optimized to work with NERSC systems. While Rancher also provides a Web Interface, that interface is currently only available to NERSC Administrators.

The Rancher CLI is distinct from the Docker CLI. The Docker CLI cannot be used to manage your Spin services from NERSC systems, as it is not yet possible for Docker to provide a secure, multi-user container infrastructure suitable for the NERSC systems. The Docker CLI is used to manage containers on your laptop.

All Rancher commands communicate with the Spin infrastructure using an API key. You will generate an API key below.

For more information on using the Rancher CLI, see Rancher Command Line Interface (CLI) in the Rancher documentation.


NERSC provides a modified version of the Rancher CLI, and not all commands shown in the Rancher documentation are available to NERSC users.


Don't pipe large amounts of data through the Rancher CLI. As of February 2018, the Rancher CLI cannot stream large amounts of data in a pipeline, and attempting to do so can cause the CLI to hang. This is a known bug that Rancher is investigating (see Rancher issue #12165). Workflows that pipe large amounts of data will drop your connection, as indicated by the following error message:

nersc:test_db $ cat load_dept_emp.dump |  rancher exec dbtest/db mysql
ERRO[0012] Failed to handle connection: websocket: bad handshake
error during connect: Get https://%2Fvar%2Ffolders%2Fg8%2Fydzygkc103x9_xt_r8zs6zyr001d77%2FT%2Fdocker-sock578594745/v1.24/exec/6e644e66b9b123123fdf4459a5b23a29f3b079307a664d8b65b68d8d0268169c/json: EOF
nersc:test_db $

If you already have an API key to access Spin, then simply load the Spin module with module load spin. Running a non-intrusive command like rancher environment will connect to the server using your credentials, test that the connection is good, and print out a result.

nersc$ module load spin
nersc$ rancher environment
1a736936    prod-cattle   cattle          active    2017-02-27T23:59:40Z
1a5         dev-cattle    cattle          active    2016-10-11T01:02:27Z
1a1221788   sandbox       cattle          active    2018-03-12T21:25:22Z

If you do not have an API key, you will need to generate one. First, a NERSC staff person will need to grant your account access to Spin. Request access through our ticketing system.

Next, generate an API key. When prompted for a username and password, use your NERSC username and password.

Password for user elvis?
Success: Spin API Key generated for elvis.

The Rancher CLI stores its configuration file under your home directory, at ~/.rancher/cli.json. If you would like to view your login information at any time, run rancher config --print, like so:

nersc$ rancher config --print

Once you have your new API key, you can validate that your account is working correctly by running the command rancher environment. This command communicates with the Rancher Server API using your API key. If rancher environment returns a list of environments, your account is working correctly. If the command prints an error such as '401 Unauthorized', your account is not functioning; please contact us for help.

Your account is tied to a single key, which has access to all environments: Prod, Dev, and Sandbox.

nersc$ rancher environment
1a736936   prod-cattle  cattle         active  2017-02-27T23:59:40Z
1a5        dev-cattle   cattle         active  2016-10-11T01:02:27Z
1a1221788  sandbox      cattle         active  2018-03-12T21:25:22Z

Working with different environments

A Spin Environment is a set of servers which run the Spin containers. Each environment is isolated and separate from the other environments. Spin has two main environments for NERSC users:

  • 'dev-cattle' is for use with applications which are under development
  • 'prod-cattle' is used for production services.

A third environment named 'sandbox' will be used exclusively if you are taking the SpinUp sessions.


The name 'cattle' refers to the container 'Orchestrator', which is part of Rancher and which we use to manage containers. Rancher gives many of its components 'Ranch'-themed names, such as 'Longhorn' or 'Wagyu'. For more information on Rancher, see the Spin Getting Started Guide overview.

During normal development, you will first deploy your application to the development environment, where you can iterate until you are ready to deploy a completed version to production. This approach, along with the full isolation of the development and production environments, allows you to continue active development without impacting the production deployment.

Most Rancher commands operate on stacks and services within a single environment and need to be told which environment to use. If you run a command without specifying one, Rancher will prompt you to select an environment each time, which quickly becomes tedious:

nersc$ rancher ps
[1] prod-cattle(1a736936)
[2] dev-cattle(1a5)
[3] sandbox(1a1221788)
Select: 2
1s3712  service  elvis-webapp/web  httpd  healthy     1/1    false
1s3713  service  elvis-webapp/db   mysql  healthy     1/1    false

To simplify your workflow, use the RANCHER_ENVIRONMENT variable to specify the environment to be used. The following example shows two services running in the dev-cattle environment.

nersc$ export RANCHER_ENVIRONMENT=dev-cattle
nersc$ rancher ps
1s3712  service  elvis-webapp/web  httpd  healthy     1/1    false
1s3713  service  elvis-webapp/db   mysql  healthy     1/1    false
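If you switch between environments often, a small shell function can guard against typos in the environment name. The sketch below is only an illustration: spin_env is a hypothetical helper (not part of the Rancher CLI or the Spin module), and it assumes the three environment names listed above are the only valid ones.

```shell
# Hypothetical helper (not provided by NERSC or Rancher): select the Spin
# environment for the current shell session, rejecting unknown names.
spin_env() {
    case "$1" in
        dev-cattle|prod-cattle|sandbox)
            # Later rancher commands in this shell will use this environment.
            export RANCHER_ENVIRONMENT="$1"
            ;;
        *)
            echo "unknown environment: $1 (expected dev-cattle, prod-cattle, or sandbox)" >&2
            return 1
            ;;
    esac
}

# Example: spin_env dev-cattle && rancher ps
```

To override the environment for a single command without changing the session, standard shell syntax also works: RANCHER_ENVIRONMENT=prod-cattle rancher ps.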

Choosing a Base Image

When building your own image, you will usually be pulling an official image from a major, reputable project and using that as the base image in your Dockerfile.

In general, good images from Docker Hub tend to be well maintained and have wide community support. We look for images that meet the following guidelines:

  • Are part of the official repositories, such as the Docker Hub Official Repositories
  • Have a high number of pulls, indicating that the project is well used
  • Have a high number of stars, indicating that the software works well
  • Are updated as frequently as needed to address security vulnerabilities and to keep up to date with upstream features. Look for images with recent modification times, which indicates that the image is being kept up to date.

If the project page has a de facto image or a recommended image, that's usually the best and simplest option. The goal here is to keep the image simple, and yet still be functional enough to support your application.

There are also many low-quality images on Docker Hub, but they tend to be obvious. Try to avoid images that have a low number of pulls, are poorly rated, or lack recent updates. Image size is another useful criterion. The appropriate size of an image will obviously vary depending on the application stack, but as a rule of thumb, take a close look at any image larger than 5 GB to see if it contains a lot of unnecessary components. Images that are overly large, besides likely containing too many unnecessary elements, may be frustratingly slow to push to the image registry during the development cycle (especially over a typical home internet link), and they will be slower to deploy.
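The 5 GB rule of thumb can be checked on your laptop before pushing. The following is a minimal sketch: image_too_large is a hypothetical helper name, and it assumes the size is given in bytes and that 5 GB means 5,000,000,000 bytes.

```shell
# Hypothetical helper: succeed (exit 0) when a size in bytes exceeds the
# 5 GB rule of thumb, taken here as 5,000,000,000 bytes (an assumption).
image_too_large() {
    [ "$1" -gt 5000000000 ]
}

# Example on a laptop with Docker installed, using the image name from this guide:
#   size=$(docker image inspect --format '{{.Size}}' spin-flask-demo-app)
#   image_too_large "$size" && echo "take a closer look at this image"
```

The docker image inspect call is left in a comment because it only works where Docker and the image are available.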

Popular projects may have multiple images on their project page. The Apache httpd project has httpd:2.4 and httpd:alpine, which shows that the Apache community is maintaining a mainline application while also experimenting with tiny images based on the Alpine container OS.

Examples of official projects that have provided a selection of images for different use cases are:

One important consideration in choosing a base image is the operating system. There are official images available on Docker Hub for nearly every popular Linux distribution. If you are basing your image on a prebuilt container, such as Apache httpd, your choice will be made by the original image developers. If you are building your own image, there are a number of criteria that can be helpful in guiding your choice.

We recommend you choose an OS that utilizes a package manager and/or is a distribution that is familiar to you, especially one of the following:

Keep in mind that base images are updated over time. When building an image on your laptop, use the --pull flag to ensure that Docker will pull the latest parent images.

elvis@laptop:app $ docker image build --pull --tag spin-flask-demo-app .

Writing a Dockerfile

When writing a Dockerfile, seek a balance between readability and size. Familiarize yourself with Docker's Best practices for writing Dockerfiles. Some general rules:

  • Containers should be ephemeral
  • Each container should have only one concern
  • Avoid installing unnecessary packages
  • Keep the image small by reducing the number of layers. This can be accomplished by condensing operations into a single step (e.g. a single yum command with multiple packages rather than multiple yum commands) and by chaining commands with ‘&&’. Consider using Docker multi-stage builds with Docker 17.05 or higher.
  • Improve caching and shorten build time by ordering statements such that the stable parts of the Dockerfile are at the beginning, and the more frequently changed statements are at the end

Tagging Images

The process of tagging and pushing an image to the registry is described in the Spin Getting Started Guide.

Be careful when reusing image tags, or when using the :latest tag. The :latest tag confuses many new (and experienced!) users, and it may not work the way you expect. Below, we describe two common issues with the :latest tag. Note that the same behavior will happen for any tag that is reused, including explicit version tags.

1. Contrary to the name, the label :latest is ambiguous, and may actually be 'latest' from the last time you downloaded the code, 18 months ago.

If your service is based on the someimage:latest image, the Docker daemon will first look for the image in the local image cache on the node. If the cache contains a matching image, Docker will use it. By default, Docker will not look for a newer image on the registry, so if you uploaded a new version of someimage:latest to Docker Hub or the Spin registry, Docker will not see it. Docker can also be told to always pull the image (see below).

2. Furthermore, remember that :latest changes over time. If your service has replicas on multiple Docker hosts, one replica may be running :latest from September, while a second node may be running :latest from July.

We recommend using explicit version numbers, such as :v1.2.3, or a date format such as :v20180809, instead of :latest, and that you update the tag for any changes.
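A date-based tag in the :v20180809 style can be generated rather than typed by hand. This is a minimal sketch; date_tag is a hypothetical helper name, and it uses the local date of the machine where it runs.

```shell
# Hypothetical helper: print a date-based tag such as v20180809,
# built from today's date (year, month, day).
date_tag() {
    printf 'v%s\n' "$(date +%Y%m%d)"
}

# Example on a laptop, reusing the build command shown earlier in this guide:
#   docker image build --pull --tag spin-flask-demo-app:"$(date_tag)" .
```

Note that a date tag only changes once per day, so a second build on the same day would reuse the tag and run into the same caching behavior described for :latest.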

If you do use :latest in your service, you can also use the label io.rancher.container.pull_image: always to tell Docker to always pull your :latest image. This will add a short delay to upgrade operations. Here is an example with a custom Nginx service:

version: '2'
    - /global/cfs/cdirs/sciencegroup/elvis_project/web/images:/srv:ro
    - ALL
    user: 1000:1000
    - nginx
      io.rancher.container.pull_image: always
    retain_ip: true

Naming Stacks, Services and Containers

  • Stacks accept any alphabetical name.
  • Services, which are part of a Stack, are referred to as [Stack Name]/[Service Name], such as sciencestack/web or sciencestack/db
  • A Service may have multiple instances of itself, which are called 'containers', and have the name [Stack Name]-[Service Name]-[Instance #], where 'Instance #' is the number of that container instance, such as sciencestack-web-1 and sciencestack-web-2.

Many commands can be used on a service or a container. Remember that a service may have one or more containers. To put it another way, containers are 'instances' of a service.
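The two name formats are easy to mix up, since services use a slash and containers use hyphens. The sketch below simply restates the formats above as shell functions; service_name and container_name are hypothetical helpers, not Rancher commands.

```shell
# Hypothetical helpers that build names in the formats described above.

# [Stack Name]/[Service Name]
service_name() {
    printf '%s/%s\n' "$1" "$2"
}

# [Stack Name]-[Service Name]-[Instance #]
container_name() {
    printf '%s-%s-%s\n' "$1" "$2" "$3"
}

service_name sciencestack web        # prints sciencestack/web
container_name sciencestack web 2    # prints sciencestack-web-2
```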

Spin has some conventions for naming these things. Let's look first at the naming of stacks, the collections of services that comprise an application within Rancher. Stacks should be named after the software systems they represent, considering the aggregate function of all the services they contain. For example, for a fictional science gateway system known as Rover made up of a web front-end, application server, and database, a stack in Spin could simply be called rover. The services within the rover stack should have descriptive names based on their individual function.

Suffixes can be useful to distinguish separate instances of a system. For example, two instances of Rover used to display data for two different experiments called JPROM and LPROM might be called rover-jprom and rover-lprom.

Tags are created within the Rancher environment to label stacks with information that can be useful for identifying ownership, for support, and for determining resource usage. Some tags are optional, while others are required for all stacks.

Tag | Status | Description | Example
owner: | Required | Developer or primary user of stack. Must be in the form of a NERSC username | owner:fred
staff-owner: | Required | NERSC staff member that is most familiar with or contact person for application (similar semantics to staff-owner in Service Now). Must be in the form of a NERSC username | staff-owner:wilma
group: | Recommended | Group that owns the stack. All members of group have permission to update or restart services in stack. Should be an LDAP group | group:csg
staff-group: | Recommended | NERSC group most familiar with or contact for application | staff-group:isg
fqdn: | Recommended | Public facing DNS Name |
requires: | Optional | Specifies dependencies of the stack, for example external file systems | requires:gpfs

A service comprises one or more identical containers providing the same function in Rancher. Common services should use a name from the following table of recommended names, optionally followed by a hyphen and a descriptive suffix:

Name Description
api API server
app application server (backend)
db database
kv key-value store
lb load balancer
util utility service
web web server

For example, a system made up of an nginx front-end, a Django application server, a MySQL database, and a Redis key-value store might have services named web-nginx, app-django, db-mysql, and kv-redis.

The descriptive suffix can also be used to indicate the application-specific purpose of a service. For example, a system made up of an Apache front-end, a Python Flask-based application server for primary logic, and a custom image server might have services named web (a suffix isn't particularly descriptive in this case), app-primary, and app-images.

Standardizing service names has the benefit of clearly communicating the purpose of each service. This is beneficial when collaborating with others, when revisiting a service created in the past, and when enlisting the help of NERSC staff during troubleshooting.

As new service types become common, they will be added to this table of recommended names.

Image names are largely up to you. We recommend that image names be descriptive, or be named after the program or daemon being run. For example, a custom image based on the official nginx image should be named nginx-scienceapp, not web-scienceapp: the name web is ambiguous and obscures the actual software used in the image.

In Spin, containers are automatically named after the service of which they are an instance.

Versioning Your Builds

Storing the Dockerfile, docker-compose.yml (if it exists), and any other files associated with a container-based application in a version control system is helpful for all of the same reasons that version control is useful for other projects. In addition, during the pilot phase, container application developers frequently work with the Spin administrators to migrate their projects into this environment. Version control systems facilitate these collaborations. Frequently used systems include:

  • academic account - Academic users are upgraded to an account with unlimited public and private repositories. The upgrade is automatic and based on the email address used to register the account.
  • - Comes with comprehensive CI/CD features; it is a common choice for JGI projects
  • - Less commonly used as the free account tier doesn’t include private repositories, and the academic accounts are more limited and without the automatic upgrade feature of Bitbucket.
  • NERSC's internal Bitbucket service - available only to NERSC Staff.

Regardless of which system is being used, avoid storing secrets such as database passwords or private keys in files that get pushed to a repo; use the Secrets feature instead.

Using NERSC's Registry

Local images for Spin must be stored in the associated registry service or come directly from Docker Hub (preferably, only images produced by official projects will come from Docker Hub). No other registries may be used as an image source for Spin.

The local registry is organized along the following lines:

  • Teams are used to group people that are working on the same project. Although the registry doesn’t yet use LDAP to define these, team names should match a group name in LDAP.
  • Namespaces are analogous to directories, and contain sets of related repositories. They are owned by teams, and team membership and roles define who can view and write to them. By convention, the default namespace should use the team name.
  • Repositories are where the docker images are actually stored.

Everyone with access to the Spin interface also has access to the registry. If a Team and Namespace haven’t yet been set up for your group, make a request via Service Now.

Be aware of the conditions under which Rancher will pull a fresh image. When a container is restarted in Spin, it may or may not first pull the image from the registry. Developers who aren’t mindful of how they update their images in the registry may inadvertently deploy an image into production before they intended. Observed behavior in the Spin environment is as follows:

  • If a container is restarted (due to a failed health check or manual operation) and Rancher chooses to schedule it on the same Spin node, a fresh copy of the image is not necessarily pulled from the registry
  • If a container is restarted and Rancher chooses to schedule it on a different Spin node, a fresh copy of the image will be pulled from the repository
  • If the ‘Upgrade’ operation is performed, Rancher will pull a fresh copy of the image from the registry even if no properties have changed.

Listing Stacks, Services, and Containers

To list all your stacks, type rancher stack ls.

nersc$ rancher stack ls
1st1967  elvis-first-stack  unhealthy  2        false
1st1969  elvis-flask-demo   healthy    3        false

The fields are:

  • Stack ID of your stack, prefixed with 1st, where 'st' stands for 'stack'
  • Stack Name
  • Stack State (Health)
  • The other fields are rarely used

To see information about active services in your stacks, type rancher ps. The following will be displayed:

nersc$ rancher ps
ID      TYPE     NAME                  IMAGE                                                          STATE    SCALE  SYSTEM  ENDPOINTS  DETAIL
1s4204  service  elvis-flask-demo/web  healthy  2/2    false
1s4205  service  elvis-flask-demo/app           healthy  1/1    false
1s4206  service  elvis-flask-demo/db   mongo:latest                                                   healthy  1/1    false

The fields are:

  • Service ID of your service, prefixed with a 1s, where 's' stands for 'service'
  • Service Name in the format [Stack Name]/[Service Name]
  • Image used for the Service
  • Service State (Health)
  • The Scale of a service, or the number of instances (a container is an instance of a service)
  • The other fields are rarely used

To see a list of all your containers that are part of any service, type rancher ps --containers

Services and Containers

Remember that a container is an instance of a service. A service may have one or more container instances.

In the example below, note that the 'web' service has two containers.

nersc$ rancher ps --containers
ID              NAME                    IMAGE                                                          STATE    HOST  IP             DOCKER        DETAIL
1i2599531       elvis-flask-demo-web-1  running  1h86  1256439e5462
1i2599534       elvis-flask-demo-web-2  running  1h87    0aa996c01835
1i2599532       elvis-flask-demo-app-1           running  1h85   14b0a7e5dee1
1i2599533       elvis-flask-demo-db-1   mongo:latest                                                   running  1h85   f019fedea11d

The fields are:

  • Instance ID of the service, prefixed with a 1i, where 'i' stands for 'instance'
  • Name of the container in the format of [Stack Name]-[Service Name]-[Instance #], where the instance is the numerical instance of the service. The example above shows two instances, 'web-1' and 'web-2'.
  • The internal IP of the services on the internal Spin network
  • The ID of the Spin host which is serving your containers
  • The Docker ID of your running container

To see a listing of all your services, type rancher ps --all. This will show services in any stack, including services which are stopped, inactive or recently removed. However, the stopped containers are a bit hidden in this display. In the following example notice that the 'SCALE' column says 2/1 which means that two containers exist, but only one is running.

nersc$ rancher ps --all
ID      TYPE     NAME                  IMAGE                                                          STATE     SCALE  SYSTEM  ENDPOINTS  DETAIL
1s3939  service  elvis-flask-demo/db   mongo:latest                                                   healthy   1/1    false
1s3940  service  elvis-flask-demo/app           upgraded  2/1    false
1s3941  service  elvis-flask-demo/web  healthy   1/1    false
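Spotting a 2/1 entry by eye gets harder as the list grows. The sketch below filters a listing in the format shown above and prints the name of each service whose two SCALE numbers differ. scale_mismatch is a hypothetical helper; it assumes the service name is the third column and that SCALE is the only column of the form N/M.

```shell
# Hypothetical helper: read a "rancher ps --all" style listing on stdin and
# print the NAME of every service whose SCALE numbers differ (e.g. 2/1).
scale_mismatch() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # The SCALE column is assumed to be the only field shaped like N/M.
            if ($i ~ /^[0-9]+\/[0-9]+$/) {
                split($i, parts, "/")
                if (parts[1] != parts[2]) print $3
            }
        }
    }'
}

# Demonstration using rows from the listing above (image column shortened).
scale_mismatch <<'EOF'
ID      TYPE     NAME                  IMAGE         STATE     SCALE  SYSTEM
1s3939  service  elvis-flask-demo/db   mongo:latest  healthy   1/1    false
1s3940  service  elvis-flask-demo/app                upgraded  2/1    false
1s3941  service  elvis-flask-demo/web                healthy   1/1    false
EOF
# prints: elvis-flask-demo/app
```

In practice the filter could be fed with rancher ps --all | scale_mismatch. The listing is small, so this should not run into the large-data pipeline problem described earlier, but treat that as an assumption to verify.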

Adding the --containers flag will make the stopped containers more obvious:

nersc$ rancher ps --all --containers
ID         NAME                    IMAGE                                                          STATE    HOST  IP             DOCKER        DETAIL
1i2596137  elvis-flask-demo-app-1           running  1h83  065asd9e0a
1i2596138  elvis-flask-demo-db-1   mongo:latest                                                   stopped  1h83    1f6920d6a1e9
1i2596146  elvis-flask-demo-web-1  running  1h82   66f48c9e36ee
1i2596160  elvis-flask-demo-db-1   mongo:latest                                                   running  1h83   065fe407ae58
1i2596161  elvis-flask-demo-app-1           running  1h83  16faa310be0a

Stopping and Starting Stacks and Services


The rancher start, rancher stop, and rancher restart commands share a common syntax. Stacks are started, restarted, or stopped by specifying the stack name. Individual services and containers are stopped by specifying the name of the service or container.


After upgrading a service or stack, the rancher stop, start, and restart commands cannot be used until you have verified the upgrade and removed the old containers using the rancher up --confirm-upgrade command. Always remove the old containers after a successful upgrade.

If you do not remove the old containers, the command will fail with this error:

$ rancher stop elvis-flask-demo
error 1st1969: Bad response statusCode [422]. Status [422 status code 422]. Body: [baseType=error, code=InvalidState, fieldName=Service app is not in valid state to be deactivated: upgraded] from []

A similar error appears when stopping a service or container which has not completed upgrading:

$ rancher stop elvis-flask-demo/app
error 1s4205: stop/deactivate/deactivateservices not currently available on service 1s4205

After stopping a stack using rancher stop, the stopped containers may be seen by adding the --all flag to the rancher ps command.

nersc$ rancher stop elvis-first-stack
nersc$ rancher ps --all
ID      TYPE     NAME                   IMAGE                                                          STATE     SCALE  SYSTEM  ENDPOINTS  DETAIL
1s3748  service  elvis-first-stack/app    inactive  1/1    false
1s3749  service  elvis-first-stack/web  inactive  1/1    false

Sometimes you'll start a stack, and it won't start all of the way because of an error with one of the services in the stack.


'Failed to start: web : Service web must be state=active'

You might try to fix it in the Compose file, and then upgrade the Stack. Suppose that upgrade fails with an error like the following:

nersc:elvis-flask-demo $ rancher up --upgrade
INFO[0000] Secret db.elvis-flask-demo.mongo-initdb-password already exists
INFO[0000] [db]: Creating
INFO[0000] [app]: Creating
INFO[0000] [web]: Creating
INFO[0000] [web]: Created
INFO[0000] [app]: Created
INFO[0000] [db]: Created
INFO[0000] Secret db.elvis-flask-demo.mongo-initdb-password already exists
INFO[0000] [web]: Starting
INFO[0000] [db]: Starting
INFO[0000] [app]: Starting
ERRO[0000] Failed Starting web : Service web must be state=active or inactive to upgrade, currently: state=updating-active
INFO[0000] [db]: Started
INFO[0000] [app]: Started
ERRO[0000] Failed to start: web : Service web must be state=active or inactive to upgrade, currently: state=updating-active
FATA[0000] Service web must be state=active or inactive to upgrade, currently: state=updating-active
nersc:elvis-flask-demo $

The solution here is to stop the problematic service, and then try the upgrade again. You may need to wait 10 seconds or longer for the service to actually stop.

nersc:elvis-flask-demo $ rancher stop elvis-flask-demo/web
nersc:elvis-flask-demo $ rancher up --upgrade --stack elvis-flask-demo --file ~elvis/docker/elvis-flask-demo/docker-compose.yml
INFO[0000] Secret db.elvis-flask-demo.mongo-initdb-password already exists
INFO[0000] [app]: Creating
INFO[0000] [db]: Creating
INFO[0000] [web]: Creating
INFO[0000] [web]: Created
INFO[0000] [app]: Created
INFO[0000] [db]: Created
INFO[0000] Secret db.elvis-flask-demo.mongo-initdb-password already exists
INFO[0000] [web]: Starting
INFO[0000] [app]: Starting
INFO[0000] [db]: Starting
INFO[0001] Upgrading web
INFO[0001] [db]: Started
INFO[0001] [app]: Started
INFO[0029] [web]: Started
elvis-flask-demo-app-1 | 2018-04-10T23:41:04.364630881Z [2018-04-10 23:41:04 +0000] [1] [DEBUG] Current configuration:
elvis-flask-demo-app-1 | 2018-04-10T23:41:04.364688315Z   config: None
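Because the service can take a while to stop, the retry step lends itself to a small loop. The following is a generic sketch, not a NERSC-provided tool: retry is a hypothetical helper that reruns any command a fixed number of times with a pause between failed attempts, and it assumes the command exits with a non-zero status when it fails.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# after each failed attempt. Returns 0 as soon as the command succeeds,
# or 1 once all attempts have failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example, reusing the commands shown above:
#   rancher stop elvis-flask-demo/web
#   retry 5 10 rancher up --upgrade --stack elvis-flask-demo \
#       --file ~elvis/docker/elvis-flask-demo/docker-compose.yml
```

As the transcript above shows, a successful rancher up continues printing container logs, so on success the loop simply stays attached to that output.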


Services and containers may also be stopped using rancher stop and specifying the Service Name or Container Name.

  • Stopping a service, using the name [Stack Name]/[Service Name]:

    nersc$ rancher stop elvis-flask-demo/app
  • Stopping a container, using the name [Stack Name]-[Service Name]-[Instance #]:

    nersc$ rancher stop elvis-flask-demo-web-1

A note on rancher run: We generally discourage using rancher run, which lets you spin up a single container. Instead, we encourage you to create an application stack. We are looking into uses for rancher run, and may use it more in the future.

The --name flag requires a name to be passed in the format [Stack Name]/[Service Name].

nersc$ rancher run --name elvis-webapp/web
nersc$ rancher ps 1s2878
ID          NAME              IMAGE                                        STATE     HOST      IP              DOCKER         DETAIL
1i2553342   elvis-webapp-1   running   1h2   271efe4936a4


The command prints the ID of the new service (an ID prefixed with 1s). As in the example above, that ID can be passed to rancher ps to query the status of the service.

If you don't use the name [stack name]/[service name], Rancher will insert the name 'Default' for you, which will cause confusion. Don't do this.

nersc$ rancher run --name elvistestweb1 httpd
nersc$ rancher ps 1s3027
ID          NAME                       IMAGE     STATE     HOST      IP           DOCKER         DETAIL
1i2569664   Default-elvistestweb1-1   httpd     running   1h42   d24ef37499de

Utilizing Storage

A number of different types of storage are available to containers running in the Spin environment. Docker concepts such as volumes, bind mounts, and tmpfs volumes are explained in the Docker documentation. A brief summary of the different types of storage and their properties is presented in the table below, with the following column headings and their meanings:

  • Persistent - Whether the data in the volume is preserved when the container is destroyed and recreated
  • Portable - Whether the data in the volume is available when the container is restarted on a different node
  • Performance - A relative measure of the performance category of the storage
  • Auto-created - Whether the source directory is auto-created by Spin or must pre-exist to be mounted
  • Externally Accessible - Whether the data is available outside of the Spin environment

Name | Persistent | Portable | Performance | Auto-created | Externally Accessible
Container storage | N | N | fast | Y | N
Local node storage | Y | N | fast | Y | N
Rancher NFS | Y | Y | normal | N | N
Global File System | Y | Y | normal | N | Y

Container Storage

Storing data to the container file system should only be used for small amounts of ephemeral data. The data is lost whenever the container is restarted with a fresh image, which can happen in a number of scenarios (container restarted on different node, container upgraded with new image, etc.)

Local Node Storage

Each node within Spin has 1.7 TB of storage available to containers. Because it is provisioned on SSD, this storage is relatively fast. Because it is local to a node, any data previously written will not be available to a container if it’s restarted on a different node. Therefore, this storage is most useful for applications that read/write lots of data that is considered transient or disposable (for example an application cache).

Rancher NFS

Rancher NFS is a storage class residing on NFS servers and available only from within the Spin environment. It is appropriate when an application needs persistent storage that will be available to containers even if they are restarted on a different node. The storage is not available from outside of Spin, so it’s not a good choice when data needs to be part of a pipeline that has components outside of Spin, or when users expect to have direct access to the data files from login nodes. Rancher NFS also has built-in lifecycle features that can optionally create and destroy data directories to match the life cycle of an application stack.

Rancher NFS volumes should be named after the service that mounts the data:

<service name>.<stack name>

Global File System / GPFS

It is possible to mount GPFS into a docker container in Spin. There are some caveats, however. For the sake of security, we have imposed certain restrictions, detailed below. In addition, it can be difficult to simulate the global file system and/or copy data when prototyping on a laptop. General guidelines for using GPFS within Spin include the following:

  • The mount point should be as deep in the directory structure as possible, e.g. /global/cfs/cdirs/project/username/application rather than /global/cfs/cdirs/project/username
  • The volume should be mounted read-only, unless the container actually writes to the global file system.
  • The file system must have the execution bit set for ‘other’ (o+x) on parent directories all of the way down to the mount point for the docker daemon to successfully mount the directory. For example, permission mode 0741 would work on a parent directory, but 0740 would not.
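The o+x requirement on every parent directory can be checked before deploying. The sketch below walks from the intended mount point up to the root and reports each directory that lacks the execute bit for ‘other’. check_other_exec is a hypothetical helper, and it assumes a find command that supports -maxdepth and symbolic -perm modes (true of GNU find).

```shell
# Hypothetical helper: report every directory from the given path up to /
# that lacks the execute bit for "other" (o+x). Returns 1 if any is missing.
# A directory that the current user cannot examine is also reported.
check_other_exec() {
    dir=$1
    rc=0
    while :; do
        # find prints the path only when the o+x bit is set on it.
        if [ -z "$(find "$dir" -maxdepth 0 -perm -o=x 2>/dev/null)" ]; then
            echo "missing o+x: $dir"
            rc=1
        fi
        case "$dir" in
            /|.) break ;;
        esac
        dir=$(dirname "$dir")
    done
    return "$rc"
}

# Example, using the path pattern from the guidelines above:
#   check_other_exec /global/cfs/cdirs/project/username/application
```

A directory with mode 0741 passes this check, while one with mode 0740 is reported, matching the example in the guideline above.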

Permissions set on the global file systems must be respected and enforced when files are accessed from within a container. This is accomplished through a combination of container configuration and external controls enforced by the docker daemon. This leads to several considerations when using the global file system within Spin.

  • The user ‘root’ in a container maps to user ‘nobody’ on the global file systems, which places significant restrictions on the data that can be accessed from a container in a default configuration.
  • Setting the container to run as a non-root account with the appropriate file system access is an effective way to address these permission constraints.
  • Using a collaboration account with the necessary file system access is an effective way to ensure data access while also avoiding issues that occur when the owner of a personal account leaves a group or project.
  • When a container can’t easily be modified to run as a non-root user, the container can often be run with the group set in a manner that provides access. For example, a container running as root:genome will successfully read files in a directory with the following restrictive permissions:

    elvis@cori02:~$ ls -ld /global/cfs/cdirs/project/elvis
    drwxr-x--- 5 elvis genome 512 Sep  8 12:00 /global/cfs/cdirs/project/elvis
  • The Linux setuid and setgid capabilities will be dropped for containers accessing the global file system, as discussed in the Security section.

  • Configuring the user or group that the containers will run as and configuring capabilities will be performed by ISG administrators during the Spin pilot phase as part of the initial stack setup.
  • Images that will be run as a different user or group will need RUN statements, as shown in the following example, to prepare the image with the necessary group and user definitions.

To make a group available in the container, you can insert a RUN statement in your Dockerfile using groupadd.

This example illustrates the Dockerfile RUN statement for an image to run with the group ‘genome’ (gid 124).

# Add a genome group to facilitate access to global file system
RUN groupadd -g 124 genome

To set both a user and a group for a container, use useradd and groupadd.

This example illustrates the Dockerfile RUN statement for an image run as collaboration account ‘c_flintstones’ (uid 501) and group ‘genome’ (gid 124).

# Add collab account and group to facilitate access to
# global file system
RUN groupadd -g 124 genome && \
  useradd -u 501 -g 124 -c 'Collaboration Account' c_flintstones

The ampersands (&&) in this example minimize the layers created in the docker image. The containers would be configured to run as ‘root:genome’ and ‘c_flintstones:genome’, respectively, during the initial stack configuration.
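To see why running as root:genome is enough for the drwxr-x--- directory shown earlier, the classic owner/group/other permission check can be modeled in a few lines of Python. This is only an illustrative sketch of the standard Unix rule, not a description of how the Docker daemon or the global file system implements access control. The numeric IDs for 'elvis' and 'nobody' are made-up values; gid 124 for 'genome' comes from the groupadd example above.

```python
import stat


def can_read_and_enter(mode, owner_uid, owner_gid, proc_uid, proc_gids):
    """Model the classic Unix check for r-x access to a directory.

    Exactly one permission class applies: owner, else group, else other.
    """
    if proc_uid == owner_uid:
        needed = stat.S_IRUSR | stat.S_IXUSR
    elif owner_gid in proc_gids:
        needed = stat.S_IRGRP | stat.S_IXGRP
    else:
        needed = stat.S_IROTH | stat.S_IXOTH
    return (mode & needed) == needed


DIR_MODE = 0o750      # drwxr-x---, as in the ls -ld output above
ELVIS_UID = 1000      # hypothetical uid for 'elvis'
NOBODY_UID = 65534    # hypothetical uid that container root maps to
GENOME_GID = 124      # gid from the groupadd example

# Container root (seen as 'nobody') with group 'genome' uses the group bits.
with_group = can_read_and_enter(
    DIR_MODE, ELVIS_UID, GENOME_GID, NOBODY_UID, {GENOME_GID})
# Without the group, the same process falls into 'other' and is denied.
without_group = can_read_and_enter(
    DIR_MODE, ELVIS_UID, GENOME_GID, NOBODY_UID, set())
print(with_group, without_group)
```

The first call succeeds because the group bits (r-x) apply, while the second fails because the 'other' bits are empty.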

When connecting to GPFS, make sure you are using read-only vs. read-write mounts appropriately.

  • Public (unauthenticated) services must mount the global file systems read-only
  • Authenticated services are allowed to mount the global file systems read-write. The authenticated application must be able to track the username responsible for creating, modifying, or deleting data on the global file system, so that every write is traceable to the user who made it.

You will also need to ensure that the full path to your directory is set to o+x.

Let's imagine that you started your stack, but the stack isn't working correctly. To troubleshoot, you use the rancher logs command and discover the following error which says permission denied.

nersc$ rancher logs --service --follow --tail 10 elvis-flask-demo/web
2018-04-12T22:51:19Z   0s 41599f54 ERROR elvis-flask-demo/web(1s3680) 1i2589840 service.activate.exception: Expected state running but got error: Error response from daemon: error while creating mount source path '/global/cfs/cdirs/myteam/spin/elvis-flask-demo/web/nginx-proxy.conf': mkdir /global/cfs/cdirs/myteam/spin/elvis-flask-demo/web/nginx-proxy.conf: permission denied

What's happening here? The Docker daemon cannot access your directory because the o+x bit is not set. Notice the part which says mkdir /global/… permission denied? Docker cannot see the file on the host, therefore it believes that file does not exist. By default, Docker will try to create a directory using the path provided, but does not have permission to do so. We don't actually want Docker to create anything. We just want it to use what exists already.

The real cause of this error is the lack of the 'o+x' bit on the directory. Notice how the bit is missing on the .../elvis-flask-demo/web subdirectory?

nersc$ ls -ld /global/cfs/cdirs/myteam/spin /global/cfs/cdirs/myteam/spin/elvis-flask-demo/ /global/cfs/cdirs/myteam/spin/elvis-flask-demo/web/
drwxrwx--x 7 elvis myteam 512 Apr 12 14:40 /global/cfs/cdirs/myteam/spin
drwxrwx--x 5 elvis elvis 512 Apr 12 15:06 /global/cfs/cdirs/myteam/spin/elvis-flask-demo/
drwxrwx--- 3 elvis elvis 512 Apr 12 14:41 /global/cfs/cdirs/myteam/spin/elvis-flask-demo/web/

The fix is:

nersc$ chmod o+x /global/cfs/cdirs/myteam/spin/elvis-flask-demo/web/
nersc$ ls -ld /global/cfs/cdirs/myteam/spin /global/cfs/cdirs/myteam/spin/elvis-flask-demo/ /global/cfs/cdirs/myteam/spin/elvis-flask-demo/web/
drwxrwx--x 7 elvis myteam 512 Apr 12 14:40 /global/cfs/cdirs/myteam/spin
drwxrwx--x 5 elvis elvis 512 Apr 12 15:06 /global/cfs/cdirs/myteam/spin/elvis-flask-demo/
drwxrwx--x 3 elvis elvis 512 Apr 12 14:41 /global/cfs/cdirs/myteam/spin/elvis-flask-demo/web/
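The manual ls -ld inspection above could also be scripted. The following is a minimal sketch, assuming Python 3 is available where you run it; the function name dirs_missing_other_exec and its top parameter are hypothetical helpers introduced here for illustration. It walks from a starting directory down to the mount path and reports every directory that lacks the o+x bit.

```python
import stat
from pathlib import Path


def dirs_missing_other_exec(mount_path, top="/"):
    """List directories between `top` and `mount_path` that lack o+x.

    Only directories are checked; if `mount_path` is a regular file
    (such as a single config file), just its parent directories count.
    """
    mount = Path(mount_path).resolve()
    top = Path(top).resolve()
    # The mount path itself plus all of its ancestors, limited to `top`.
    chain = [p for p in (mount, *mount.parents)
             if p == top or top in p.parents]
    missing = []
    for directory in reversed(chain):  # report from the top down
        if not directory.is_dir():
            continue
        if not directory.stat().st_mode & stat.S_IXOTH:
            missing.append(str(directory))
    return missing
```

Calling dirs_missing_other_exec() with the mount source path from the error message would be expected to return the directories that still need chmod o+x; an empty list means every directory in the chain already has the bit set.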

Logging from Containers

Logging strategies for container-based services may need to be modified for applications developed in a more traditional environment. Files written to the container’s file system aren’t easily accessible, and also aren’t persistent across container restarts or upgrades. There are several approaches that have proven useful in Spin:

  • Log to stdout and stderr rather than writing to a file in the file system. If the service needs just one log, it can write to stdout. If it needs two logically separate log streams, it can write to stdout and stderr. In cases where more than one log stream is used, the container should be started without the -i or -t flag so that stdout and stderr are not combined. These logs will be persistent, but as they can only be accessed via Rancher or a docker command on the Spin hosts, access to the logs must be coordinated with ISG staff during the pilot phase.
  • Write to a persistent log volume hosted outside of the Spin environment (e.g. a global Community directory). This will facilitate direct access to log information.
  • Log to a central logging system (future capability).
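As an illustration of the first approach, here is a minimal sketch of a Python service that keeps two logically separate log streams, one on stdout and one on stderr, using only the standard logging module. The logger names myapp.access and myapp.error and the function build_loggers are assumptions made for this example, not part of Spin or Rancher.

```python
import logging
import sys


def build_loggers(out=sys.stdout, err=sys.stderr):
    """Return (access_log, error_log) writing to two separate streams."""
    formatter = logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s %(message)s")
    loggers = {}
    for name, stream in (("access", out), ("error", err)):
        logger = logging.getLogger(f"myapp.{name}")  # hypothetical names
        logger.handlers.clear()          # avoid duplicate handlers on re-run
        handler = logging.StreamHandler(stream)
        handler.setFormatter(formatter)
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False         # keep the two streams separate
        loggers[name] = logger
    return loggers["access"], loggers["error"]


access_log, error_log = build_loggers()
access_log.info("GET /index.html 200")          # goes to stdout
error_log.error("database connection failed")   # goes to stderr
```

Nothing is written inside the container's file system, so the output remains available through the container's log streams.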

Connecting to the Network

"Publishing" Ports

Services within a stack can communicate with each other on all network ports without any special configuration. To make a service accessible from outside the stack, "publish" (map) the internal port that the container exposes to an external port using a ports: declaration in the docker-compose.yml file:


    ports:
    - 60013:8080/tcp

This declaration will make the web service container listening on port 8080 accessible from outside Spin at port 60013 using a dynamic DNS name, described below.

Before using ports:, please review these recommendations:

  • The ports available for use are listed below, and include standard ports for a variety of services, pre-configured with firewall rules designed for typical usage. Choose a port that meets your needs while minimizing exposure.
  • Note that ports 80 and 443 are not available for use with ports:. For these services, we recommend you temporarily use an alternate port in the provided ranges for quick access during initial development and testing. After the web service is running, it should be made available on standard ports with the reverse proxy and the ports: declaration removed.
  • Where a range of ports is available, such as 60000-60050, make a random selection to avoid conflicts with other services running in Spin.
  • If the standard port for the service you want to make available isn't listed, request that it be added to the list via a ticket in ServiceNow.
  • Inside the container, if possible, use a port number greater than 1024. This allows the NET_BIND_SERVICE capability to be dropped.
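The "random selection" recommendation can be sketched as a small helper that picks an external port from a range and formats the mapping for a ports: declaration. The function names are hypothetical, and the default range of 60000-60050 is taken from the publicly available ports in the Pre-Firewalled Ports list; choose the range that matches the exposure you need.

```python
import random


def pick_port_mapping(internal_port, low=60000, high=60050, rng=None):
    """Return an 'external:internal/tcp' string with a random external port."""
    if not 1 <= internal_port <= 65535:
        raise ValueError(f"invalid internal port: {internal_port}")
    rng = rng or random.SystemRandom()
    external_port = rng.randint(low, high)  # both ends inclusive
    return f"{external_port}:{internal_port}/tcp"


def needs_net_bind_service(internal_port):
    """Ports below 1024 are privileged and need NET_BIND_SERVICE."""
    return internal_port < 1024


mapping = pick_port_mapping(8080)
print(mapping, needs_net_bind_service(8080))
```

A random choice does not guarantee that the port is unused, so a conflict with another service is still possible and the selection may need to be repeated.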

Pre-Firewalled Ports

The following TCP ports are publicly available from all source addresses:

80, 443, 8080, 8443, 60000 - 60050

The following TCP ports are available only from within NERSC networks, as well as non-guest LBL networks, including LBL employee wireless and VPN:

3128, 3306, 4873, 5432, 5984, 8008, 27017, 55000 - 55100

The following TCP ports are available only from within NERSC networks:

5672, 8081, 15672, 50000-50050
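For quick reference, the three lists above can be encoded as a lookup. This is a sketch that simply transcribes the port numbers shown in this section; the function name port_scope and the scope labels are made up for the example, and the lists themselves may change over time.

```python
# Port sets transcribed from the lists above (ranges are inclusive).
PUBLIC = {80, 443, 8080, 8443, *range(60000, 60051)}
NERSC_AND_LBL = {3128, 3306, 4873, 5432, 5984, 8008, 27017,
                 *range(55000, 55101)}
NERSC_ONLY = {5672, 8081, 15672, *range(50000, 50051)}


def port_scope(port):
    """Return who can reach a published port, or None if it is not listed."""
    if port in PUBLIC:
        return "public"
    if port in NERSC_AND_LBL:
        return "nersc-and-lbl"
    if port in NERSC_ONLY:
        return "nersc-only"
    return None


for candidate in (8443, 3306, 5672, 9999):
    print(candidate, port_scope(candidate))
```

A result of None means the port is not pre-firewalled and would have to be requested through a ticket, as described above.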

Dynamic DNS Names

When a port is published using a ports: declaration, a DNS entry is created that will reliably resolve to the IP address of the node where the container is running.

The Dynamic DNS name will be of the form

<service name>.<stack name>.<environment>

For example, a database service in the stack ‘mystack’ in the production environment with a port published using 3306:3306/tcp would get a name of this form; the database would be reachable at that name on port 3306 from within NERSC networks.

If a more memorable name is desired, work with your local network administrator to create a DNS CNAME to the dynamic name above.
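Once a port is published, a simple TCP connection attempt can confirm that the dynamic name resolves and the port answers from your location. The sketch below uses only the Python standard library; port_is_reachable is a hypothetical helper name, and you would substitute the dynamic DNS name and published port of your own service.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A False result does not say which step failed. Because the firewall rules differ by source network, the same check can succeed from within NERSC networks and fail from elsewhere.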

HTTP/HTTPS reverse proxy

Access to ports 80 and 443 is achieved via a reverse proxy running at the Spin perimeter. The reverse proxy offers several benefits:

  • No ports: declaration is required
  • Traffic is load-balanced to services across multiple containers
  • Optional HTTPS termination avoids SSL/TLS complexity

To make a service reachable via the reverse proxy, submit a ticket to NERSC requesting a Spin reverse proxy configuration. Specify whether HTTPS termination is needed, and list the desired hostname along with the name of the environment, stack, service, and listening port.

Then, create a DNS CNAME from your desired hostname to the appropriate reverse proxy endpoint:

  • - Production, HTTPS-terminating
  • - Production, Non-HTTPS-terminating (SNI)
  • - Development, HTTPS-terminating
  • - Development, Non-HTTPS-terminating (SNI)

NERSC staff will arrange for secure transfer of web certificates and keys if HTTPS termination is requested.

Deleting Proxied Services

To maintain referential integrity, the reverse proxy configuration is cleared if a proxied service is removed. To avoid service interruptions, do not remove proxied services (or stacks containing proxied services). Instead, upgrade services to make changes; when resolving problems with container startup, stop the proxied service or entire stack, remove individual containers, and upgrade the service to make changes.

Ensuring Security

Security Audit

All applications sent to Spin are automatically audited at the API to ensure that they follow our security requirements, which are outlined below. If an application breaks one of our security standards, the Rancher CLI will print an error such as the following:

$ rancher stop NotMyStack
error NotMyStack: Bad response statusCode [401]. Status [401 Unauthorized].
Body: [message=you don't own stack NotMyStack] from

Docker containers are fairly secure by default. This security is achieved through the use of Linux kernel 'namespaces', isolated network stacks, control groups, and whitelisting the Linux kernel 'capabilities' to only those needed. Docker security is a big topic. For a good summary explaining the current security features of Docker, read Docker security in the Docker manual.

AppArmor and SELinux security policies on Ubuntu and CentOS will be enabled on Spin in the future.

To enhance the security of your containers, we recommend:

  • When possible, run services in the container as a non-root user. Many of the reasons that a process would normally need escalated privileges (direct access to hardware, writing to a particular directory, binding to a low-numbered port) don’t apply or can be avoided in a containerized environment. For example, a service can bind to a high-numbered port and then let docker map the privileged port on the docker host to the unprivileged port on the container. Similarly, volume mounts to a persistent volume with the desired permissions can avoid some of the permission hurdles.

  • Just as with a traditional server, if a container conducts a mix of privileged and unprivileged operations, it can implement privilege separation, and drop privileges after the privileged operations have been completed.

  • If it’s not possible to run as a non-root user, minimize the Linux capabilities granted to the container. In most cases, a container can drop all capabilities, and only add back one or two that are actually needed by the container. The initial set of capabilities that Docker uses is small enough that reviewing the list of what’s needed by a specific application isn’t an onerous task. Experience has shown that many containers (if not most containers) don’t actually need any of these capabilities.
  • If your service uses external file systems (like the global file system), it will be required to run as a non-root user and to drop many kernel capabilities. This allows existing ownership and permissions on the file system to be effectively enforced within Spin.

The following chart shows which capabilities are allowed for Spin containers that do and do not use the NERSC Global File System:

Capability       | No External File System | External File System | Description
-----------------|-------------------------|----------------------|------------
CHOWN            | Yes                     | No                   | Make arbitrary changes to file UIDs and GIDs (see chown(2)).
DAC_OVERRIDE     | Yes                     | No                   | Bypass file read, write, and execute permission checks.
FOWNER           | Yes                     | No                   | Bypass permission checks on operations that normally require the file system UID of the process to match the UID of the file.
KILL             | Yes                     | No                   | Bypass permission checks for sending signals.
SETGID           | Yes                     | No                   | Make arbitrary manipulations of process GIDs and supplementary GID list.
SETUID           | Yes                     | No                   | Make arbitrary manipulations of process UIDs.
NET_BIND_SERVICE | Yes                     | Yes                  | Bind a socket to internet domain privileged ports (port numbers less than 1024).
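The chart can be turned into a quick pre-flight check of the capabilities a service intends to add back. This is a sketch that transcribes the chart and treats it as exhaustive, which is an assumption: any capability not listed is reported as disallowed. The function name disallowed_capabilities is hypothetical, and the audit performed by Spin itself may apply additional rules.

```python
# Capabilities from the chart above.
LISTED_CAPABILITIES = {
    "CHOWN", "DAC_OVERRIDE", "FOWNER", "KILL",
    "SETGID", "SETUID", "NET_BIND_SERVICE",
}
# Only this capability is marked "Yes" when an external file system is used.
ALLOWED_WITH_EXTERNAL_FS = {"NET_BIND_SERVICE"}


def disallowed_capabilities(requested, uses_external_fs):
    """Return the requested capabilities that the chart does not allow."""
    allowed = (ALLOWED_WITH_EXTERNAL_FS if uses_external_fs
               else LISTED_CAPABILITIES)
    return sorted(set(requested) - allowed)


wanted = ["CHOWN", "NET_BIND_SERVICE"]
print(disallowed_capabilities(wanted, uses_external_fs=True))
print(disallowed_capabilities(wanted, uses_external_fs=False))
```

For a service that mounts the global file system, CHOWN is reported as disallowed while NET_BIND_SERVICE passes; without an external file system, both pass.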

Ensuring Privacy with Secrets

Rancher Secrets are a mechanism for storing encrypted copies of sensitive items such as database passwords and SSH keys that are needed by a container at runtime. Storing them as Rancher secrets obviates the need to store sensitive information as a file in your Docker development directory or as an environment variable (which is exposed in the docker-compose.yml file), and it helps prevent the information from ending up in the image registry or in a source code revision control repository. There is no direct analogy for Rancher Secrets in a laptop Docker environment.

Secrets have the following properties:

  • They are stored in encrypted form within the Spin infrastructure.
  • When attached to a container, they are available in unencrypted form in a file mounted as /run/secrets/secretname.
  • They are arbitrary files that can contain anything that is considered sensitive. Examples of secret files: certificates, config files that contain sensitive passwords, and environment files with sensitive information. It is up to the application to read and interpret the secret file.
  • They must be entered into the Rancher UI by an ISG administrator (during the pilot phase)

If an application requires a specific path to the secret, a symbolic link can be made to the file stored in /run/secrets/. Even if only one component of a configuration file is sensitive, the entire contents of the configuration file can be pasted into a secret to protect the sensitive component.
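Because it is up to the application to read and interpret the secret file, a service typically needs a few lines of code to load it. The following is a minimal sketch, assuming the secret is a short text value such as a password; read_secret is a hypothetical helper, and the secrets_dir parameter exists only so the function can be exercised outside a container.

```python
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets"):
    """Return the text of a mounted secret, without trailing newlines."""
    secret_file = Path(secrets_dir) / name
    # Trailing newlines are stripped because editors and pasted values
    # often add one, which would corrupt a password.
    return secret_file.read_text().rstrip("\n")
```

Inside a container, read_secret("mysql_password") would read /run/secrets/mysql_password. For binary secrets or files where trailing newlines matter, read the bytes directly instead of stripping.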

Following the Spin naming convention will help identify secrets related to your stack and also aid in the overall stack lifecycle management.

Secrets used for a single service should be named

<service name>.<stack name>.<filename>

Wherever possible, the filename should indicate how the secret is used. For example, a MySQL password within a stack named ‘My Portal’ would be


If the secret is used by a number of services within the stack, the service part of the name can be dropped, leaving the secret name as

<stack name>.<filename>

For example, an SSH private key that is used for multiple components within a stack named ‘My Portal’ would be:


When creating a secret, the description should always indicate the secret’s owner, by adding owner: to the description field.

When adding a secret to a container in the ‘Secrets’ tab,

  • Set ‘As Name’ to the filename component of the secret name. In the single-service MySQL password example above, the ‘As Name’ field would be set to ‘mysql_password’, and the secret would be available in the file /run/secrets/mysql_password.
  • Customize the file ownership and permissions to restrict read permissions within the running container. In general, the file owner should match the UID that the service is running as, and the mode should be set to 400.
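The naming convention and the ‘As Name’ rule can be summarized in a short sketch. The stack name 'mystack', the service name 'db', and the filename 'private_key' are hypothetical values chosen for illustration; 'mysql_password' is the filename used in the text above. The helper names are also made up for this example.

```python
def secret_name(stack, filename, service=None):
    """Build a secret name following the Spin naming convention.

    Single-service secrets: <service name>.<stack name>.<filename>
    Shared secrets:         <stack name>.<filename>
    """
    parts = [service, stack, filename] if service else [stack, filename]
    return ".".join(parts)


def secret_mount_path(filename):
    """Path of the secret in the container when 'As Name' is the filename."""
    return f"/run/secrets/{filename}"


print(secret_name("mystack", "mysql_password", service="db"))
print(secret_name("mystack", "private_key"))
print(secret_mount_path("mysql_password"))
```

The filename is passed separately rather than parsed back out of the full secret name, since a filename may itself contain dots.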

Inspecting Containers and Their Contents

There are several ways to gain insight into what is happening inside a container. Reviewing logs is one method that can be employed without getting command-line access. At times, you will need to obtain a shell and actually type commands, and this is possible as well. If you simply need details of how the container is configured, you can obtain a JSON-formatted listing of its settings. You can even copy a file from inside a running container to the local file system outside of it.

Viewing Logs

Logs may be viewed using the rancher logs command. The command may use the service name, like elvis-first-stack/web, or the container name, like elvis-first-stack-web-1.

If your service has more than one container (remember, a container is an instance of a service), the individual container's logs will show the number of the container at the beginning. In the example below, the 'web' service has two containers. Notice how the line begins with '01' or '02', which indicates which container owns that log line.

nersc$ rancher logs elvis-flask-demo/web
01 2018-05-23T00:15:26.486199100Z - - [23/May/2018:00:15:26 +0000] "GET /static/CPvalid1_nodsRNA_40x_Tiles_p1745DAPI.png HTTP/1.1" 200 82055 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36" "-"
02 2018-05-23T00:17:21.196355808Z - - [23/May/2018:00:17:21 +0000] "GET /fields/ HTTP/1.1" 200 19322 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/53
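When a service has several containers, the numeric prefix makes it possible to split a saved log by container. The sketch below assumes the format shown above, where each line starts with the container number followed by a space; group_by_container is a hypothetical helper, and the sample lines are shortened placeholders rather than real output.

```python
from collections import defaultdict


def group_by_container(lines):
    """Group log lines by their leading container number ('01', '02', ...)."""
    grouped = defaultdict(list)
    for line in lines:
        prefix, _, rest = line.partition(" ")
        if prefix.isdigit():      # skip lines without a container prefix
            grouped[prefix].append(rest)
    return dict(grouped)


sample = [
    "01 2018-05-23T00:15:26Z GET /static/a.png",   # placeholder lines
    "02 2018-05-23T00:17:21Z GET /fields/",
    "01 2018-05-23T00:18:02Z GET /",
]
for container, entries in group_by_container(sample).items():
    print(container, len(entries))
```

Lines that do not begin with a numeric prefix, such as the output for a single named container, are ignored by this helper.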

To view the logs for just a single container, enter the container name instead of the service name. The container name can be found using rancher ps --containers as shown above.

nersc$ rancher logs elvis-flask-demo-web-2
- - [14/Mar/2018:00:41:23 +0000] "GET / HTTP/1.1" 200 12 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36" "-"

In this next example, we are viewing logs for the last hour, with timestamps enabled, and are following the logs as if we were using tail --follow:

nersc$ rancher logs --since 1h --timestamps --follow elvis-webapp-1
2017-11-09T01:17:38.296570056Z AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using Set the 'ServerName' directive globally to suppress this message
2017-11-09T01:17:38.308314039Z AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using Set the 'ServerName' directive globally to suppress this message
2017-11-09T01:17:38.355638274Z [Thu Nov 09 01:17:38.336440 2017] [mpm_event:notice] [pid 1:tid 139923965044608] AH00489: Apache/2.4.27 (Unix) configured -- resuming normal operations
2017-11-09T01:17:38.355655838Z [Thu Nov 09 01:17:38.343553 2017] [core:notice] [pid 1:tid 139923965044608] AH00094: Command line: 'httpd -D FOREGROUND'

Obtaining a Shell

Use rancher exec -it NAME /bin/bash to start a bash shell on a container. The NAME may be the service name or an individual container name.

nersc$ rancher exec -it elvis-webapp-1 /bin/bash
root@21060e7b6b52:/usr/local/apache2# ps aux
root          1  0.0  0.0  77204  2936 ?        Ss   01:17   0:00 httpd -DFOREGR
daemon        9  0.0  0.0 366384  4144 ?        Sl   01:17   0:00 httpd -DFOREGR
daemon       10  0.0  0.0 366384  4152 ?        Sl   01:17   0:00 httpd -DFOREGR
daemon       11  0.0  0.0 366384  4152 ?        Sl   01:17   0:00 httpd -DFOREGR
root         93  0.5  0.0  20240  1920 ?        Ss   01:41   0:00 /bin/bash
root         97  0.0  0.0  17492  1144 ?        R+   01:41   0:00 ps aux
root@21060e7b6b52:/usr/local/apache2# exit

Inspecting Service Details

rancher inspect will print a Service's configuration in JSON, similar to how docker inspect works. JSON can be hard for humans to parse, so we recommend using the 'jq' command line tool, which is available on all NERSC systems.

nersc$ rancher inspect elvis-webapp/web | jq
  "accountId": "1a5",
  "assignServiceIpAddress": false,
  "baseType": "service",
  "createIndex": 2,
  "created": "2017-11-09T02:44:54Z",
  "createdTS": 1510195494000,
  "currentScale": 1,
  "description": null,
  "externalId": null,
  "fqdn": null,
  "healthState": "healthy",
  "id": "1s2878",
  "instanceIds": [

To save the jq output to a file, or to pipe the output through grep or less, be sure to apply a filter, such as '.', like so:

nersc$ rancher inspect elvis-webapp/web | jq '.' | less
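If jq is not convenient, the same JSON can be processed with a short script. This sketch assumes the output of rancher inspect has been captured as text (for example, redirected to a file) and pulls out a few of the fields visible in the listing above; summarize_service is a hypothetical helper, and the sample document is a trimmed-down stand-in for real output.

```python
import json


def summarize_service(json_text,
                      fields=("id", "healthState", "currentScale")):
    """Return selected top-level fields from a service's JSON description."""
    service = json.loads(json_text)
    # Missing fields come back as None instead of raising an error.
    return {field: service.get(field) for field in fields}


sample = '{"id": "1s2878", "healthState": "healthy", "currentScale": 1}'
print(summarize_service(sample))
```

Only top-level keys are handled here; nested values such as the entries of instanceIds would need additional handling.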

Exporting Configuration

To export the configuration of a stack to your directory, do the following:

nersc:~ $ cd ~/docker/elvis-webapp
nersc:elvis-webapp $ rancher export elvis-webapp
INFO[0000] Creating elvis-webapp/docker-compose.yml
INFO[0000] Creating elvis-webapp/rancher-compose.yml
nersc:elvis-webapp $ cat elvis-webapp/docker-compose.yml
version: '2'
    image: httpd
    image: mysql
nersc:elvis-webapp $

To export the configuration to a tar file, do

nersc:~ $ cd ~/docker
nersc:docker $ rancher export --file elvis-webapp.tar elvis-webapp
nersc:docker $ tar tf elvis-webapp.tar
nersc:docker $

Copying a File Out

On Spin, you can copy a text file from a running container using 'cat':

nersc:~ $ cd ~/docker/my-project
nersc:my-project $ rancher exec -it elvis-webapp-1 cat /etc/nginx/nginx.conf > nginx.conf.copy
nersc:my-project $ ls -ld nginx.conf.copy
-rw-r--r--  1 elvis staff  1085 Dec 11 15:05 nginx.conf.copy
nersc:my-project $

On your laptop, you can copy a file from inside a container with docker container cp.

To copy files from a local container on your laptop to your working directory, you can use this trick, which we borrowed from Nginx. Start a temporary container on your laptop, and copy files using 'docker container cp' to your working directory:

laptop$ docker container run --rm --detach --name tmp-nginx-container nginx
Unable to find image 'nginx:latest' locally
latest: Pulling from library/nginx
e7bb522d92ff: Pull complete
6edc05228666: Pull complete
cd866a17e81f: Pull complete
Digest: sha256:285b49d42c703fdf257d1e2422765c4ba9d3e37768d6ea83d7fe2043dad6e63d
Status: Downloaded newer image for nginx:latest
laptop$ docker container cp tmp-nginx-container:/etc/nginx/nginx.conf nginx.conf
laptop$ ls -l nginx.conf
-rw-r--r--  1 elvis staff  643 Dec 26 03:11 nginx.conf

Since the container was started with the '--rm' flag, the container will remove itself after you have stopped it.

Removing Stacks and Services

To remove a stack, type rancher rm --type stack StackName.

nersc$ rancher ps
ID      TYPE     NAME                   IMAGE                                                          STATE    SCALE  SYSTEM  ENDPOINTS  DETAIL
1s4146  service  elvis-first-stack/app    healthy  1/1    false
1s4147  service  elvis-first-stack/web  healthy  2/2    false
nersc$ rancher rm --type stack elvis-first-stack
nersc$ rancher ps

To remove unused services in your stack, use rancher prune.

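This will remove services which are not listed in the docker-compose.yml file in your current working directory. It is rarely needed. Because the result depends on which compose file is in the current directory, confirm that you are in the correct directory before running it.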

rancher prune --stack elvis-webapp