MPI Program Execution on Discovery

To run an MPI program on Discovery, you need to create a batch job script and submit it to Slurm using the sbatch command. A basic Slurm batch script contains a set of Slurm directives, the modules to load, and the shell commands required to execute the program. The Slurm directives specify the resource requirements for the job, and the shell commands must include the srun command, which is used to launch the program on Discovery's compute nodes.
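The sketch below shows the general shape of such a script; the job name, module name, program name, and resource values are placeholders for illustration, and a complete, working script for the example program appears later in this section.

    #!/bin/bash
    
    ## 1. Slurm directives: the resource requirements for the job (placeholder values)
    #SBATCH --job-name=my_mpi_job
    #SBATCH --ntasks=4
    #SBATCH --time=0-00:10:00
    
    ## 2. Modules the program needs (exact module names depend on Discovery's module tree)
    module load openmpi
    
    ## 3. Shell commands, including 'srun' to launch the program on the allocated resources
    srun -n 4 ./my_mpi_program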

Compile and Run the MPI Sequential Search Program on Discovery

This section describes, in general terms, how to compile and run the MPI sequential search program presented in the section Parallel Programming With MPI.

  • Copy the MPI sequential search program mpi_sequential_search.c from the Parallel Programming With MPI page.

  • Create a Makefile to build the program as shown in the following Makefile.

    # Set the name of the source code file and the executable
    SRC=mpi_sequential_search.c
    EXE=mpi_sequential_search
    # Compiler flags (optimization and warnings)
    CFLAGS=-O2 -Wall
    
    # Compile the parallel program with the MPI compiler wrapper
    $(EXE): $(SRC)
    	mpicc $(CFLAGS) -o $(EXE) $(SRC)
    
    # Delete the executable file
    clean:
    	rm -f $(EXE)

    Then, run the make command to compile your program.

  • To run your program on Discovery, you need to create a job script mpi_sequential_search.sh as shown below.

    Batch script mpi_sequential_search.sh
    #!/bin/bash
    
    #SBATCH --job-name mpi_sequential_search
    #SBATCH --output mpi_sequential_search-%j.out
    #SBATCH --ntasks=4
    #SBATCH --cpus-per-task=1
    #SBATCH --mem-per-cpu=500M
    #SBATCH --partition=normal
    #SBATCH --time=0-00:15:00
    
    ## Load modules
    module load spack/2022a
    module load gcc/12.1.0-2022a-gcc_8.5.0-ivitefn
    module load openmpi/4.1.3-2022a-gcc_12.1.0-slurm-pmix_v4-qg3dxke
    
    ## Run your program on the allocated resources using the 'srun' command.
    srun -n 4 ./mpi_sequential_search 50
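A small variant on the last line of the script, in case you prefer not to hard-code the task count: inside a batch job, Slurm sets the SLURM_NTASKS environment variable to the value of --ntasks, so the launch line can read it instead.

    ## Equivalent launch line: SLURM_NTASKS is set by Slurm to the value of --ntasks
    srun -n $SLURM_NTASKS ./mpi_sequential_search 50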

Explanation

Below is a brief explanation of each Slurm directive used in the batch script presented above.

SBATCH directive   Description
--job-name         A name used to identify your job in the queue.
--output           The file to which the job's standard output (stdout) is written.
--ntasks           The number of tasks (MPI processes) to launch.
--cpus-per-task    The number of CPUs allocated to each task.
--mem-per-cpu      The amount of memory allocated to each CPU.
--partition        The partition on which the job will run.
--time             The maximum wall-clock time the job is allowed to run.

The SBATCH directive --time sets the maximum wall-clock time a job may run on Discovery. If your job runs longer than the requested time, Slurm terminates it even if it has not finished its task.
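For instance, to request a two-hour limit instead of the fifteen minutes used above, the directive would be written in the same D-HH:MM:SS format:

    #SBATCH --time=0-02:00:00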

  • Make sure the Slurm job script file mpi_sequential_search.sh is executable by running the command below.

chmod +x mpi_sequential_search.sh
  • Submit the job script file to Slurm by executing the following command.

sbatch mpi_sequential_search.sh
  • Look for the file named mpi_sequential_search-%j.out to get the output of your program, where %j is the job ID printed to the screen when you submit your job to Slurm.
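While the job is waiting or running, you can check on it with standard Slurm commands and then read the output file once it completes; the job ID 123456 below only stands in for the number sbatch prints for your job.

    # Check the status of your jobs in the queue
    squeue -u $USER
    
    # After the job finishes, view its output (123456 stands in for your actual job ID)
    cat mpi_sequential_search-123456.out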

Executing the batch script directly on the login node, rather than submitting it to Slurm and launching the program with the srun command, would result in running your job on the login node. Remember that the login node should only be used for non-demanding activities. Please use the scheduler (Slurm) through the srun command to run your job on the compute nodes.
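A quick way to see the difference, assuming you are logged in to Discovery: run hostname directly and then through srun. The first command executes on the login node, while the second is dispatched by Slurm to a compute node (the partition and time values below simply mirror the script above).

    # Executes on the login node
    hostname
    
    # Executes on a compute node allocated by Slurm
    srun --ntasks=1 --partition=normal --time=0-00:05:00 hostname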