Serial Execution
Example - Batch script
#!/bin/bash
#SBATCH --job-name=TestJob
#SBATCH --ntasks=3
#SBATCH --time=00:05:00
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=1G # memory per CPU core
#SBATCH --partition=normal # the partitions to run in (comma separated)
srun --ntasks=1 --nodes=1 --cpus-per-task=$SLURM_CPUS_PER_TASK bash -c "sleep 30; echo 'hello 1'"
srun --ntasks=1 --nodes=1 --cpus-per-task=$SLURM_CPUS_PER_TASK bash -c "sleep 30; echo 'hello 2'"
srun --ntasks=1 --nodes=1 --cpus-per-task=$SLURM_CPUS_PER_TASK bash -c "sleep 30; echo 'hello 3'"
Output
hello 1
hello 2
hello 3
Submit the above script using the sbatch command.
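As a minimal sketch of the submission step, assuming the batch script above was saved as serial_job.sh (a hypothetical file name) and that you are on a node where sbatch is available. The helper job_id_from_parsable is an illustrative name, not part of Slurm; it only strips the optional cluster suffix that sbatch --parsable may append.

```shell
#!/bin/bash
# Hypothetical file name: assumes the batch script above was saved as serial_job.sh.
script="serial_job.sh"

# `sbatch --parsable` prints "jobid" or, on multi-cluster setups, "jobid;cluster".
# This helper (illustrative, not a Slurm command) keeps only the job ID.
job_id_from_parsable() {
    printf '%s\n' "${1%%;*}"
}

if command -v sbatch >/dev/null 2>&1 && [ -f "$script" ]; then
    submitted=$(sbatch --parsable "$script")
    jobid=$(job_id_from_parsable "$submitted")
    echo "Submitted batch job $jobid"
else
    # Not on a cluster login node, or the script file is missing.
    echo "sbatch or $script not available; nothing submitted" >&2
fi
```

Keeping the job ID in a variable makes it easy to pass to later commands such as sacct -j "$jobid".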
Next, inspect how the job ran by using the sacct command with the Start and End fields added to its --format list. Replace 7215 with the job ID reported when you submitted the script.
sacct -j 7215 --format=JobID,Start,End,Elapsed,REQCPUS,ALLOCTRES%30
Output
JobID        Start               End                 Elapsed    ReqCPUS  AllocTRES
------------ ------------------- ------------------- ---------- -------- ------------------------------
7215         2022-09-27T21:47:12 2022-09-27T21:48:43 00:01:31   6        billing=6,cpu=6,mem=6G,node=1
7215.batch   2022-09-27T21:47:12 2022-09-27T21:48:43 00:01:31   6        cpu=6,mem=6G,node=1
7215.extern  2022-09-27T21:47:12 2022-09-27T21:48:43 00:01:31   6        billing=6,cpu=6,mem=6G,node=1
7215.0       2022-09-27T21:47:12 2022-09-27T21:47:42 00:00:30   2        cpu=2,mem=2G,node=1
7215.1       2022-09-27T21:47:42 2022-09-27T21:48:12 00:00:30   2        cpu=2,mem=2G,node=1
7215.2       2022-09-27T21:48:12 2022-09-27T21:48:43 00:00:31   2        cpu=2,mem=2G,node=1
Explanation
If you look at the Start and End times in the output of the sacct command above, you will notice that the job steps (7215.0, 7215.1 and 7215.2) started at different times: 21:47:12, 21:47:42 and 21:48:12, respectively. Each step started only when the previous one had ended, which means that the job steps were executed sequentially. This is also why the whole job took 00:01:31, roughly three times the 30-second sleep of a single step.
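The same check can be scripted instead of read off by eye. The following is a minimal sketch, not part of the original example: check_sequential is a hypothetical helper that reads "JobID|Start|End" lines, the layout sacct produces with --noheader --parsable2, and assumes the steps are listed in start order. The sample data is copied from the sacct output above.

```shell
#!/bin/bash
# Report whether numbered job steps (e.g. 7215.0, 7215.1) ran one after another.
# Input: lines of "JobID|Start|End", for example from
#   sacct -j <jobid> --format=JobID,Start,End --noheader --parsable2
check_sequential() {
    awk -F'|' '
        # Keep only numbered steps; skip the job itself, .batch and .extern.
        $1 ~ /\.[0-9]+$/ {
            # Timestamps in the same ISO 8601 layout sort lexicographically,
            # so a plain string comparison is enough here.
            if (prev_end != "" && $2 < prev_end) overlap = 1
            prev_end = $3
        }
        END { print (overlap ? "overlapping" : "sequential") }
    '
}

# Sample data copied from the sacct output above.
check_sequential <<'EOF'
7215|2022-09-27T21:47:12|2022-09-27T21:48:43
7215.batch|2022-09-27T21:47:12|2022-09-27T21:48:43
7215.extern|2022-09-27T21:47:12|2022-09-27T21:48:43
7215.0|2022-09-27T21:47:12|2022-09-27T21:47:42
7215.1|2022-09-27T21:47:42|2022-09-27T21:48:12
7215.2|2022-09-27T21:48:12|2022-09-27T21:48:43
EOF
```

For the sample data this prints sequential, because every step starts at or after the end time of the step before it.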
--cpus-per-task is passed explicitly to each srun so that every job step uses the value requested with the #SBATCH flag. The environment variable SLURM_CPUS_PER_TASK holds the number of CPUs per task requested for the job with --cpus-per-task (2 in this example).
Due to HyperThreading/SMT and how Slurm assigns resources, if you request only one CPU per task, the srun commands may not run in parallel. Unless you disable multithreading, the recommendation is to use an even number of CPUs.
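As an illustration of the alternative mentioned in the note, the fragment below sketches one way to ask Slurm not to use the extra hardware threads, using the --hint=nomultithread option. Whether the hint is honored, and how CPUs are then counted, depends on how the cluster is configured, so treat it as an assumption to verify on your system rather than a guaranteed fix.

```shell
#SBATCH --ntasks=3
#SBATCH --cpus-per-task=1
#SBATCH --hint=nomultithread   # request one hardware thread per physical core
```

If the hint is not available or not honored on your cluster, requesting an even number of CPUs per task, as in the example script, remains the safer choice.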