Job management with SLURM
To submit jobs on the cluster, you must describe the resources (cores, time) you need to the Slurm task scheduler. Slurm executes jobs on remote compute node(s) as soon as the requested resources become available.
Important
You should not run compute code on the login (frontend) server cholesky-login. It is not suited for computations.
There are two ways to run compute code on Cholesky:
- using an interactive SLURM job: open a terminal on a compute node where you can execute your code. This method is well suited for light tests and environment configuration (especially for GPU-accelerated codes). See the section Interactive jobs.
- using a Slurm script: submit your script to the scheduler, which will run it when the resources are available. This method is well suited for production runs.
Slurm is configured with a fairshare policy among users: the more resources you have requested in the past days, the lower your priority will be when the task manager has several jobs to handle at the same time.
SLURM script
Using a submission script is the typical way of creating jobs. In a Slurm script, you have to describe:
- the resources you need for your code: partition*, walltime*, number of nodes, memory, number of tasks, number of GPUs, etc.;
- other parameters for your job: the project or account* your job belongs to, job name, output files, etc.;
- the batch environment: modules, variables;
- the code to run.
* these resources must be specified.
The batch environment is set by loading modules (see Module command) and setting the proper bash variables (PATH, OMP_NUM_THREADS, etc.). A complete example script is shown below.
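A minimal submission script might look like the following sketch; the partition, account, module and program names are placeholders to adapt to your project and code:
#!/bin/bash
#SBATCH --partition=cpu_dist
#SBATCH --account=YourAccountProject
#SBATCH --job-name=my_job
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00
#SBATCH --output=my_job-%j.out

# batch environment: load the modules your code needs (module names are examples)
module purge
module load gcc openmpi

# running code
srun ./my_program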
SLURM partitions
- There is no default partition: you must select one with --partition. See the sbatch partition directive.
SLURM directives
You describe the resources in the submission script using sbatch directives (script lines beginning with #SBATCH). These options can also be passed directly to the sbatch command.
Important
The #SBATCH directives must appear at the top of the submission file, before any other line except for the very first line, which should be the shebang (e.g. #!/bin/bash). See SLURM examples.
SBATCH directives to define resources
partition
You must specify the partition name:
#SBATCH --partition=<PartitionName>
where PartitionName is one of the partitions in the partition names list.
nodes
Number of nodes:
#SBATCH --nodes=<nnodes>
ntasks
Number of tasks (MPI processes):
#SBATCH --ntasks=<ntasks>
ntasks-per-node
Number of tasks (MPI processes) per node:
#SBATCH --ntasks-per-node=<ntpn>
cpus-per-task
Number of threads per task (e.g. OpenMP threads per MPI process):
#SBATCH --cpus-per-task=<ntpt>
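As an illustration of how --nodes, --ntasks-per-node and --cpus-per-task combine, here is a hedged sketch of a hybrid MPI/OpenMP request (the program name and the 4 x 10 layout are assumptions to adapt to the actual node size):
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=10

# one OpenMP thread per reserved core for each of the 8 MPI processes
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./my_hybrid_program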
gres=gpu
Number of GPUs:
#SBATCH --gres=gpu:<ngpus>
mem
Memory per node:
#SBATCH --mem=<memory>
Default memory is 4 GB per core.
time
You must specify the walltime for your job. If your job is still running after the walltime duration, it will be killed:
#SBATCH --time=<hh:mm:ss>
account
You must specify the account (or project) name for your job:
#SBATCH --account=<name>
SBATCH additional directives
job-name
Specify the job's name:
#SBATCH --job-name=<jobName>
output
Specify the standard output (stdout) for your job:
#SBATCH --output=<outputFile>
By default, a slurm-<jobid>.out file is created, where jobid is a unique identifier used by Slurm.
error
Specify the error output (stderr) for your job:
#SBATCH --error=<errorFile>
By default, a slurm-<jobid>.err file is created, where jobid is a unique identifier used by Slurm.
mail-user
Set an email address:
#SBATCH --mail-user=<emailAddress>
mail-type
To be notified by email when a given step is reached:
#SBATCH --mail-type=<arguments>
Arguments for the --mail-type option are:
- BEGIN: send an email when the job starts
- END: send an email when the job stops
- FAIL: send an email if the job fails
- ALL: equivalent to BEGIN, END, FAIL
export
Export user environment variables from the submission environment to the batch environment:
- By default, all user environment variables are loaded (--export=ALL).
- To avoid dependencies and inconsistencies between the submission environment and the batch execution environment, disabling this behaviour is highly recommended. To avoid exporting the environment variables present at job submission time to the job's environment:
#SBATCH --export=NONE
- To explicitly select which variables from the caller's environment are exported to the job environment:
#SBATCH --export=VAR1,VAR2
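With --export=NONE the job does not inherit your interactive environment, so the script must set it up explicitly. A minimal sketch (partition, account, module and program names are placeholders):
#!/bin/bash
#SBATCH --export=NONE
#SBATCH --partition=cpu_seq
#SBATCH --account=YourAccountProject
#SBATCH --time=00:10:00
#SBATCH --ntasks=1

# rebuild the batch environment inside the job
module purge
module load gcc

./my_program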
Submit and monitor jobs
submit job
You have to submit your script (e.g. slurm_job.sh) with the sbatch command:
$ sbatch slurm_job.sh
Submitted batch job 755
The command responds with the jobid assigned to the job; in this example, the jobid is 755. The jobid is a unique identifier used by many Slurm commands.
monitor job
The squeue command shows the list of jobs:
$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
756 cpu_dist singular username PD 0:00 4 (None)
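To restrict the list to your own jobs, you can use the standard --user option:
$ squeue -u $USER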
cancel job
The scancel command cancels a job.
To cancel the job job0, whose jobid is 757 (obtained through squeue), you would use:
$ scancel 757
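You can also cancel all of your jobs at once with the standard --user option:
$ scancel -u $USER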
Interactive jobs
- Example 1: access one node interactively for 30 minutes.
$ srun --nodes=1 --time=00:30:00 -p cpu_seq --account=YourAccountProject --pty /bin/bash
[user@node001 ~]$ hostname
node001
- Example 2: access a node with a GPU for 30 minutes.
$ srun --nodes=1 --time=00:30:00 -p gpu --gres=gpu:1 --account=YourAccountProject --pty /bin/bash
[user@cholesky-gpu01 ~]$ hostname
cholesky-gpu01
job arrays
TODO
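A hedged sketch of a standard Slurm job array, using the --array directive and the SLURM_ARRAY_TASK_ID variable (partition, account and input file names are placeholders):
#!/bin/bash
#SBATCH --partition=cpu_seq
#SBATCH --account=YourAccountProject
#SBATCH --time=00:30:00
#SBATCH --ntasks=1
#SBATCH --array=0-9

# each of the 10 array tasks processes its own input file
./my_program input_${SLURM_ARRAY_TASK_ID}.dat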
chain jobs
TODO
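A hedged sketch of chaining two jobs with the standard --dependency option, so that the second job starts only after the first one completes successfully (script names and jobid are placeholders):
$ sbatch job_step1.sh
Submitted batch job 760
$ sbatch --dependency=afterok:760 job_step2.sh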
Accounting
Use the sacct command to get information on your finished jobs.
Note
On Cholesky, the accounting information is restricted to your jobs only.
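For instance, to display a summary of a finished job (the fields listed are standard sacct format fields):
$ sacct -j 755 --format=JobID,JobName,Partition,Elapsed,State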