GitLab

GitLab is a DevOps platform that lets software development teams collaborate by hosting code in a source repository and automating the build, integration, and verification of code through Continuous Integration (CI) and Continuous Delivery (CD). The GitLab project is open source and actively maintained by GitLab Inc.

Access

NERSC provides a user-facing GitLab service at https://software.nersc.gov/. You must log in with your NERSC credentials to access the service.

Running CI Pipelines at NERSC

The GitLab server provides shared runners for running CI jobs on NERSC resources. The following runners are currently available:

Runner Name | System | Access
cori | Cori | All Users
cori-esslurm | Cori | All Users

The cori runner submits jobs to the cluster with the system default Slurm binaries (/usr/bin/sbatch), whereas the cori-esslurm runner submits jobs with the esslurm Slurm binaries in /opt/esslurm/bin/. Use the cori-esslurm runner when you need to submit jobs to the Cori GPU cluster. Please refer to https://software.nersc.gov/ci-resources/corigpu for how to use cori-esslurm to submit jobs to the Cori GPU cluster.

Note

There is no GitLab runner for Perlmutter at the moment; we plan to add one in the near future.

Note

We are currently unable to run CI jobs on Cori GPU via SCHEDULER_PARAMETERS; we plan to have a fix in the near future. Please see the https://software.nersc.gov/ci-resources/corigpu project for the status of the Cori GPU pipeline.

We make use of Jacamar CI, a GitLab custom executor that allows CI/CD jobs to run on HPC systems. Jacamar provides integration with batch schedulers and downscopes permissions to ensure that jobs run under your user account. We recommend that you review the ECP-CI documentation.

Warning

Please be careful about what you run in your CI jobs, because they run under your user account. A GitLab job has access to all shared filesystems, including your $HOME directory, just as you do when you log in to the system. Sensitive information should not be stored on NERSC systems or displayed in a GitLab job. You are responsible for the proper use of NERSC systems, including the GitLab service. We are not responsible for any loss of data or issues with your user environment that result from a CI job.

GitLab CI configuration is declared in a special file, .gitlab-ci.yml, that typically lives in the root of the project. Please review the reference guide for .gitlab-ci.yml and the GitLab CI/CD documentation, making sure you read the documentation for the appropriate version. You can find the GitLab version by navigating to https://software.nersc.gov/help.

Scheduler Integration

Jacamar CI supports integration with several batch schedulers, including Slurm, LSF, and Cobalt. In GitLab, this is configured through the SCHEDULER_PARAMETERS variable, which is used to request an allocation on compute nodes. The variable can be defined in .gitlab-ci.yml or as a project CI/CD variable.
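As a minimal sketch of defining the variable in .gitlab-ci.yml, the fragment below sets SCHEDULER_PARAMETERS once at the top level and overrides it in a single job. The job names (short-job, longer-job) and the stage name are hypothetical, and the Slurm options simply reuse the style of the Haswell example on this page; adjust them to your own allocation needs.

```yaml
# Top-level variables apply to every job unless a job overrides them.
variables:
  SCHEDULER_PARAMETERS: "-C haswell --qos=debug -N1 -t 00:05:00"

stages:
  - test

# Hypothetical job that uses the top-level SCHEDULER_PARAMETERS as-is.
short-job:
  stage: test
  tags: [cori]
  script:
    - hostname

# Hypothetical job that overrides the value to request a longer walltime.
longer-job:
  stage: test
  tags: [cori]
  variables:
    SCHEDULER_PARAMETERS: "-C haswell --qos=debug -N1 -t 00:10:00"
  script:
    - hostname
```

A job-level variable takes precedence over a top-level one with the same name. If you instead define SCHEDULER_PARAMETERS as a project CI/CD variable, GitLab's documented variable precedence gives the project-level value priority over values set in .gitlab-ci.yml, so check the documentation for the installed version before combining both approaches.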

You should check the Slurm example jobs for how to submit a job. It is important to define the Slurm options correctly in SCHEDULER_PARAMETERS; otherwise your job will fail during Slurm allocation. Here is a simple example that submits a job to a Cori Haswell node. The tags keyword selects the GitLab runner; in this case, tags: [cori] tells GitLab to send the job to the Cori system. The script, before_script, and after_script keywords are sections where you can run arbitrary shell commands. The stages keyword defines a list of stage names used to group GitLab jobs; all jobs within a stage can execute in parallel. The stage keyword is used within a GitLab job to assign it to a stage; in this example the job is named cori-haswell and runs in the examine stage.

You can find this example in https://software.nersc.gov/ci-resources/hello-environment.

Note

The GitLab runner will be down whenever the system is offline, which may result in the termination or failure of CI jobs.

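# Stages group jobs; jobs in the same stage can run in parallel.
stages:
  - examine

# Job sent to the Cori system through the cori runner.
cori-haswell:
  stage: examine
  tags: [cori]
  variables:
    # Slurm options used to request the compute node allocation.
    SCHEDULER_PARAMETERS: "-C haswell --qos=debug -N1 -t 00:05:00"
  script:
    - echo "Script"
    - bash ./environment.bash

# Default commands run before each job's script section.
before_script:
  - echo "Before Script"
  - pwd
  - ls -la

# Default commands run after each job's script section.
after_script:
  - echo "After Script"
  - whoami
  - hostname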
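
To illustrate how stages group jobs, here is a hedged sketch with two stages. The stage names (build, test), the job names, and the echo commands are hypothetical placeholders; the -C knl constraint is an assumption that you want a Cori KNL node for the second build job. Both build jobs belong to the same stage, so they are eligible to run in parallel, and the test job starts only after the build stage has finished successfully.

```yaml
# Stages run in the order listed; jobs within one stage may run in parallel.
stages:
  - build
  - test

# Two hypothetical jobs in the build stage, one per node type.
build-haswell:
  stage: build
  tags: [cori]
  variables:
    SCHEDULER_PARAMETERS: "-C haswell --qos=debug -N1 -t 00:05:00"
  script:
    - echo "Building on a Haswell node"

build-knl:
  stage: build
  tags: [cori]
  variables:
    SCHEDULER_PARAMETERS: "-C knl --qos=debug -N1 -t 00:05:00"
  script:
    - echo "Building on a KNL node"

# Hypothetical job in the test stage; it waits for the build stage.
run-tests:
  stage: test
  tags: [cori]
  variables:
    SCHEDULER_PARAMETERS: "-C haswell --qos=debug -N1 -t 00:05:00"
  script:
    - echo "Running tests"
```

Whether the build jobs actually run at the same time depends on runner availability and on how quickly Slurm grants each allocation.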

Increase Job Timeout

By default, a GitLab job times out after 60 minutes, at which point GitLab terminates the job and marks it as failed. You can increase the job timeout in the project settings by navigating to Settings > CI/CD > General Pipelines and setting the Timeout value in minutes (10m), hours (10h), or days (10d). The maximum time limit is 30 days (30d).

For more details see https://docs.gitlab.com/ee/ci/pipelines/settings.html#set-a-limit-for-how-long-jobs-can-run
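
Besides the project setting, GitLab's .gitlab-ci.yml reference also documents a job-level timeout keyword. The fragment below is a sketch only: the job name long-build, the script build.bash, and the regular QOS are hypothetical choices, and it assumes the installed GitLab version and the runner configuration at NERSC permit the requested value. It is meant to sit alongside a stages list that defines the examine stage, as in the example earlier on this page.

```yaml
# Hypothetical long-running job with its own GitLab timeout.
long-build:
  stage: examine
  tags: [cori]
  # Job-level timeout; GitLab stops the job if it runs longer than this.
  timeout: 2h
  variables:
    # The Slurm walltime (-t) is kept below the GitLab timeout so that
    # time spent waiting in the queue does not immediately exhaust it.
    SCHEDULER_PARAMETERS: "-C haswell --qos=regular -N1 -t 01:30:00"
  script:
    - bash ./build.bash
```

Keep in mind that the GitLab timeout and the Slurm walltime are separate limits: the job ends when either one is reached, so both may need to be raised for a long job.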

Access Token

To use our GitLab server, you will need to create a Personal Access Token to perform any action, because we have disabled SSH authentication for cloning repositories. To create an access token, navigate to https://software.nersc.gov/-/profile/personal_access_tokens and create a token with a name and the appropriate scopes. We recommend enabling the read_repository and write_repository scopes to read from and write to repositories; if you plan to use the GitLab API, you may also enable the read_api, read_user, and api scopes. Once you create a token, you will see a randomly generated value. Please save this token; if you are using a Mac, you can store it in Keychain Access.

Resources

Title | Date | Links
Introduction to CI at NERSC | July 7th, 2021 | Slides, Video