Running MPI Jobs

MPI (Message Passing Interface) is a standard for sending messages between multiple processes. These processes can be located on the same system (a single multi-core machine) or spread across a collection of distributed servers, which makes MPI an efficient mechanism for inter-process communication. In practice, MPI is a set of function calls and libraries that implement distributed execution of a program. Distributed doesn’t necessarily mean that you must run your MPI job on many machines; in fact, you could run multiple MPI processes on a laptop.
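As an illustration of the message-passing model (separate from the example job below), here is a minimal sketch in which rank 0 sends a single integer to rank 1. The file name and value are hypothetical, and the program must be launched with at least two processes (e.g., mpirun -np 2 ./send_example).

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Rank 0 sends one integer (message tag 0) to rank 1. */
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Rank 1 blocks until the message from rank 0 arrives. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}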

MPI Variations

Name      Description
MPI       Message passing library standard
OpenMPI   Open-source implementation of the MPI standard
OpenMP    Compiler add-on for thread-based, shared-memory parallelism (not an MPI implementation)

There are other implementations of MPI as well, such as MVAPICH, MPICH, and Intel MPI.
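For contrast with MPI's library calls between processes, OpenMP parallelism is expressed through compiler directives inside a single process. The following is a minimal sketch (compile with gcc -fopenmp); the thread count printed depends on the machine and the OMP_NUM_THREADS environment variable.

#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* The pragma asks the compiler to run this block on a team of threads. */
    #pragma omp parallel
    {
        printf("Hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}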

MPI Benefits

  • Process-based parallelism that can span multiple nodes

  • Good performance on large shared-memory nodes as well as across distributed clusters

  • Uses the fastest available interconnect

  • Portable and easy to use: the same source code runs on any cluster that provides an MPI implementation

The C program below is launched as 3 MPI processes; each process reports its rank, the node it is running on, and the number of CPUs it is allowed to use.

Example

  • C program 'program.c'

  • Batch script 'script.sh'

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <mpi.h>
#include <sched.h>

int main(int argc, char *argv[])
{
    int tid, nthreads;  /* rank of this process and total number of processes */
    char *cpu_name;

    /* Initialize the MPI environment. */
    MPI_Init(&argc, &argv);

    /* Get the rank (ID) of this process within MPI_COMM_WORLD. */
    MPI_Comm_rank(MPI_COMM_WORLD, &tid);

    /* Get the total number of processes in the job. */
    MPI_Comm_size(MPI_COMM_WORLD, &nthreads);

    cpu_name = (char *)calloc(80, sizeof(char));

    /* Record which node this process is running on. */
    gethostname(cpu_name, 80);

    /* Count the CPUs this process may run on (its affinity mask). */
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    sched_getaffinity(0, sizeof(cpu_set_t), &cpuset);
    int cpu_count = CPU_COUNT(&cpuset);

    printf("Hi MPI user: from process = %i on machine=%s, of %i processes, using %d CPUs\n", tid, cpu_name, nthreads, cpu_count);

    free(cpu_name);

    /* Shut down the MPI environment. */
    MPI_Finalize();
    return 0;
}

#!/bin/bash

#SBATCH --job-name=MpiJob
#SBATCH --output=MpiJob.out
#SBATCH --ntasks=3  ## number of MPI tasks (processes) to run
#SBATCH --cpus-per-task=2  ## the number of cpus allocated to each task
#SBATCH --mem-per-cpu=1G   ## memory per CPU core
#SBATCH --partition=normal  ## the partitions to run in (comma separated)
#SBATCH --time=0-00:10:00  ## time for analysis (day-hour:min:sec)

# Execute job steps
srun --cpus-per-task=$SLURM_CPUS_PER_TASK ./program

Before submitting the job script, you need to compile the program by running the following commands.

module load spack/2022a  gcc/12.1.0-2022a-gcc_8.5.0-ivitefn openmpi/4.1.3-2022a-gcc_12.1.0-slurm-pmix_v4-qg3dxke
mpicc -o program program.c

Then, submit the job script script.sh.

sbatch script.sh

Output

Hi MPI user: from process = 1 on machine=discovery-c5.cluster.local, of 3 processes, using 2 CPUs
Hi MPI user: from process = 2 on machine=discovery-c5.cluster.local, of 3 processes, using 2 CPUs
Hi MPI user: from process = 0 on machine=discovery-c5.cluster.local, of 3 processes, using 2 CPUs

Explanation

The program is compiled with the MPI compiler wrapper mpicc before the job is submitted; in the job step, the srun command launches the compiled executable. The --ntasks flag sets the number of MPI processes to run.

The script launched 3 processes, and each output line shows a distinct rank (0 through 2); all of the processes ran on the discovery-c5 node. Note that the lines may appear in any order, since the processes print independently. Adjusting the value of the --ntasks flag changes the number of processes; if it’s set to 1, only one print statement will be shown: Hi MPI user: from process = 0 on machine=discovery-c5.cluster.local, of 1 processes, using 2 CPUs.