Slurm examples
- The SLURM directive --partition is mandatory in each job (see available partitions).
- The SLURM directive --time is mandatory in each job (see SLURM directives).
- The SLURM directive --account is mandatory in each job (see SLURM directives).
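Taken together, this means the header of every job script contains at least the following three directives (the values shown here are simply the ones used in the examples below):
#SBATCH --partition=cpu_seq
#SBATCH --time=00:10:00
#SBATCH --account=sandbox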
Sequential example
#!/bin/bash
#
## BEGIN SBATCH directives
#SBATCH --job-name=test_seq
#SBATCH --output=res_seq.txt
#
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --partition=cpu_seq
#SBATCH --account=sandbox
##SBATCH --mail-type=ALL
##SBATCH --mail-user=your_email@polytechnique.edu
## END SBATCH directives
## Clean the environment and load the modules used at compile and link time
module purge
module load gcc/10.2.0
## Execution
./hello.seq
This requests one core on the cluster for ten minutes. Assuming hello.seq
was compiled with gcc, the job runs one instance of it on the node allocated by Slurm.
You can try the above job by using this hello world example hello.c:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <limits.h>
int main()
{
    char hostname[HOST_NAME_MAX + 1];
    gethostname(hostname, HOST_NAME_MAX + 1);
    printf("Hello, world on %s\n", hostname);
    return EXIT_SUCCESS;
}
and compiling it with gcc:
$ module load gcc/10.2.0
$ gcc hello.c -o hello.seq
Then launch the job with Slurm:
$ sbatch job_seq.sh
The res_seq.txt file should look something like this:
Hello, world on node001
Shared memory example (OpenMP)
#!/bin/bash
#
## BEGIN SBATCH directives
#SBATCH --job-name=test_omp
#SBATCH --output=res_omp.txt
#
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=00:10:00
#SBATCH --partition=cpu_shared
#SBATCH --account=sandbox
##SBATCH --mail-type=ALL
##SBATCH --mail-user=your_email@polytechnique.edu
## END SBATCH directives
## Clean the environment and load the modules used at compile and link time
module purge
module load gcc/10.2.0
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
## Execution
./hello.omp
The job will be run in an allocation where four cores have been reserved on the same compute node.
You can try it by using this hello world program hello_omp.c:
#include <stdio.h>
#include <omp.h>
int main() {
    #pragma omp parallel
    {
        printf("Hello world from thread %d out of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}
and compiling it with:
$ gcc -fopenmp hello_omp.c -o hello.omp
then launch the job with:
$ sbatch job_omp.sh
The res_omp.txt file should contain something like:
Hello world from thread 0 out of 4
Hello world from thread 1 out of 4
Hello world from thread 2 out of 4
Hello world from thread 3 out of 4
Message passing example (MPI)
#!/bin/bash
## BEGIN SBATCH directives
#SBATCH --job-name=test_mpi
#SBATCH --output=res_mpi.txt
#
#SBATCH --ntasks=80
#SBATCH --time=00:10:00
#SBATCH --partition=cpu_dist
#SBATCH --account=sandbox
##SBATCH --mail-type=ALL
##SBATCH --mail-user=your_email@polytechnique.edu
## END SBATCH directives
## load modules
module load gcc/10.2.0
module load openmpi/4.1.0
## execution
mpirun -n $SLURM_NTASKS ./hello.mpi
This requests 80 cores on the cluster for ten minutes. Assuming hello.mpi
was compiled with MPI support, mpirun
will create 80 instances (ranks) of it on the nodes allocated by Slurm.
You can try it by using this hello world program hello_mpi.c:
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);
    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);
    // Print off a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);
    // Finalize the MPI environment.
    MPI_Finalize();
}
and compiling it with:
$ mpicc hello_mpi.c -o hello.mpi
then launch the job with:
$ sbatch job_mpi.sh
The res_mpi.txt file should contain one line per rank (80 in total), in no particular order; the first few lines could look like:
Hello world from processor node010, rank 2 out of 80 processors
Hello world from processor node010, rank 0 out of 80 processors
Hello world from processor node010, rank 1 out of 80 processors
Hello world from processor node010, rank 3 out of 80 processors
Hybrid jobs (MPI with OpenMP)
You can mix multi-processing (MPI) and multi-threading (OpenMP) in the same job.
TODO
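A minimal sketch of such a hybrid job, obtained by combining the MPI and OpenMP examples above, could look like the script below. The partition, account and module versions are simply reused from those examples, and hello.hybrid is a placeholder for a binary built with both MPI and OpenMP support (e.g. mpicc -fopenmp hello_hybrid.c -o hello.hybrid):
#!/bin/bash
## BEGIN SBATCH directives
#SBATCH --job-name=test_hybrid
#SBATCH --output=res_hybrid.txt
#
#SBATCH --ntasks=8              # number of MPI ranks
#SBATCH --cpus-per-task=4       # number of OpenMP threads per rank
#SBATCH --time=00:10:00
#SBATCH --partition=cpu_dist
#SBATCH --account=sandbox
## END SBATCH directives
## load modules
module purge
module load gcc/10.2.0
module load openmpi/4.1.0
## one OpenMP thread per core allocated to each MPI task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
## execution (-x forwards OMP_NUM_THREADS to the remote ranks)
mpirun -n $SLURM_NTASKS -x OMP_NUM_THREADS ./hello.hybrid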
GPU jobs
If you want to claim a GPU for your job, you need to specify the GRES (Generic Resource Scheduling) parameter in your job script.
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
A simple job file requesting a node with a GPU could look like this:
#!/bin/bash
## BEGIN SBATCH directives
#SBATCH --job-name=test_gpu
#SBATCH --output=res_gpu.txt
#
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --partition=gpu
#SBATCH --account=sandbox
#SBATCH --gres=gpu:1
##SBATCH --mail-type=ALL
##SBATCH --mail-user=your_email@polytechnique.edu
## END SBATCH directives
## load modules
module load application/version
## execution
./myprog input.fits
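To check from inside the job that a GPU has actually been allocated, you can add nvidia-smi (the standard NVIDIA monitoring tool, assuming NVIDIA GPUs on the gpu partition) just before the execution line:
## list the GPU(s) visible to the job
nvidia-smi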
Interactive jobs
Slurm jobs are normally batch jobs in the sense that they run unattended. If you want a direct view of your job, for tests or debugging, you have two options.
If you simply need an interactive Bash session on a compute node, with the same environment as batch jobs, run the following command:
$ srun --time=00:10:00 --account=sandbox --partition=cpu_test --pty bash
This submits a job requesting one core and the default amount of memory for ten minutes (the time limit is required) on the cpu_test
partition (the partition name is required); the job returns a Bash prompt on the compute node as soon as it starts.
If you need more flexibility, you will need to use the salloc
command. The salloc
command accepts the same parameters as sbatch as far as resource requirements are concerned. Once the allocation is granted, you can use srun
the same way you would in a submission script.
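For example, the following requests an interactive allocation of four cores for thirty minutes (the partition and account names are just the ones used in the examples above) and then runs a command on the allocated resources with srun; exiting the shell releases the allocation:
$ salloc --ntasks=4 --time=00:30:00 --partition=cpu_shared --account=sandbox
$ srun hostname
$ exit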
Singularity container
Singularity is a container solution for compute-driven workloads. It lets you pack your application with all its dependencies so that you can run it out of the box on different HPC clusters and computers.
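In its simplest form, once you have a container image (a .sif file), you run a command inside it with singularity exec; here my_image.sif and my_program are only placeholders:
$ module load singularity
$ singularity exec my_image.sif my_program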
MPI
Portability is not easy for MPI applications: the code compiled inside the container needs to use the same Slurm library and the same Open MPI version as the ones installed on the cluster.
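To see which Open MPI versions the cluster provides, and therefore which version to install inside the container, you can for instance query the module system (the module name below is the one used earlier on this page):
$ module avail openmpi
$ module load openmpi/4.1.0
$ mpirun --version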
Here is an example of an MPI hello world application packaged with the definition file mpitest.def:
Bootstrap: library
From: ubuntu:18.04

%files
    mpitest.c /opt

%environment
    export OMPI_DIR=/opt/ompi
    export SINGULARITY_OMPI_DIR=$OMPI_DIR
    export SINGULARITYENV_APPEND_PATH=$OMPI_DIR/bin
    export SINGULARITY_APPEND_LD_LIBRARY_PATH=$OMPI_DIR/lib

%post
    echo "Installing required packages..."
    apt-get update && apt-get install -y wget git bash gcc gfortran g++ make file bzip2

    echo "Installing Open MPI"
    export OMPI_DIR=/opt/ompi
    export OMPI_VERSION=4.0.1
    export OMPI_URL="https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-$OMPI_VERSION.tar.bz2"
    mkdir -p /tmp/ompi
    mkdir -p /opt
    # Download
    cd /tmp/ompi && wget -O openmpi-$OMPI_VERSION.tar.bz2 $OMPI_URL && tar -xjf openmpi-$OMPI_VERSION.tar.bz2
    # Compile and install
    cd /tmp/ompi/openmpi-$OMPI_VERSION && ./configure --prefix=$OMPI_DIR && make install
    # Set env variables so we can compile our application
    export PATH=$OMPI_DIR/bin:$PATH
    export LD_LIBRARY_PATH=$OMPI_DIR/lib:$LD_LIBRARY_PATH
    export MANPATH=$OMPI_DIR/share/man:$MANPATH

    echo "Compiling the MPI application..."
    cd /opt && mpicc -o mpitest mpitest.c
The hello world application source code mpitest.c:
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char **argv) {
    int rc;
    int size;
    int myrank;

    rc = MPI_Init (&argc, &argv);
    if (rc != MPI_SUCCESS) {
        fprintf (stderr, "MPI_Init() failed");
        return EXIT_FAILURE;
    }

    rc = MPI_Comm_size (MPI_COMM_WORLD, &size);
    if (rc != MPI_SUCCESS) {
        fprintf (stderr, "MPI_Comm_size() failed");
        goto exit_with_error;
    }

    rc = MPI_Comm_rank (MPI_COMM_WORLD, &myrank);
    if (rc != MPI_SUCCESS) {
        fprintf (stderr, "MPI_Comm_rank() failed");
        goto exit_with_error;
    }

    fprintf (stdout, "Hello, I am rank %d/%d\n", myrank, size);

    MPI_Finalize();
    return EXIT_SUCCESS;

exit_with_error:
    MPI_Finalize();
    return EXIT_FAILURE;
}
The next step is to build the container image from the definition file:
$ singularity build mpitest.sif mpitest.def
Note
We recommend that you use the GitLab continuous integration process to build the container and store it in the GitLab container registry of the associated GitLab project.
You can then submit a job with the following script mpitest_job.sh:
#!/bin/bash
#SBATCH --job-name singularity-mpi-test  # Job name
#SBATCH --ntasks=40                      # Total number of MPI tasks
#SBATCH --partition=cpu_dist             # Partition name
#SBATCH --account=sandbox                # Account name (mandatory, see above)
#SBATCH --time=00:05:00                  # Max execution time
# modules
module load gcc
module load openmpi
module load singularity
mpirun -n $SLURM_NTASKS singularity exec mpitest.sif /opt/mpitest
Important
The standard way to execute MPI applications with Singularity containers is to run the native mpirun
command from the host, which will start Singularity containers and ultimately MPI ranks within the containers.
Then launch the job with:
$ sbatch mpitest_job.sh
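Since this script does not set --output, Slurm writes the job output to a file named slurm-<jobid>.out in the submission directory by default; it should contain one line per rank, in no particular order, for example:
Hello, I am rank 0/40
Hello, I am rank 3/40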