Hybrid Jobs

One can combine multithreading within a node and multi-node parallelism using a hybrid OpenMP/MPI approach. Consider the following C++ code, which uses both MPI and OpenMP:

#include <iostream>
#include <mpi.h>
#include <omp.h>

int main(int argc, char** argv) {
  using namespace std;
  
  MPI_Init(&argc, &argv);

  int world_size, world_rank;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  // Get the name of the processor
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int name_len;
  MPI_Get_processor_name(processor_name, &name_len);

  #pragma omp parallel
  {
    int id = omp_get_thread_num();
    int nthrds = omp_get_num_threads();
    cout << "Hello from thread " << id << " of " << nthrds
         << " on MPI process " << world_rank << " of " << world_size
         << " on node " << processor_name << endl;
  }

  MPI_Finalize();
  return 0;
}

Save it as hybrid.cpp, load the MPI module, and compile it with the MPI compiler wrapper:

module load compilers/mpi/openmpi-slurm
mpicxx -fopenmp -o hybrid hybrid.cpp
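
mpicxx is a wrapper that invokes the underlying C++ compiler with the MPI include and library flags added; -fopenmp additionally enables OpenMP. Assuming the openmpi-slurm module provides Open MPI, as its name suggests, you can print the full compiler command the wrapper would run, without actually compiling anything:

mpicxx --showme -fopenmp -o hybrid hybrid.cpp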

Below is a SLURM job script for our code:

#!/bin/bash
#
#SBATCH --job-name="Hybrid Demo" 	# a name for your job
#SBATCH --partition=peregrine-cpu	# partition to which job should be submitted
#SBATCH --qos=cpu_debug				# qos type
#SBATCH --nodes=2                	# node count
#SBATCH --ntasks-per-node=2      	# MPI tasks (processes) per node
#SBATCH --cpus-per-task=4        	# CPU cores per task, i.e. OpenMP threads per process
#SBATCH --mem-per-cpu=1G         	# memory per cpu-core
#SBATCH --time=00:01:00          	# total run time limit (HH:MM:SS)

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK	# one OpenMP thread per allocated CPU core
module purge
module load compilers/mpi/openmpi-slurm 

srun ./hybrid

Notice that we request two nodes, two MPI tasks per node, and four CPU cores per task, so the job runs 4 MPI processes (2 per node) with 4 OpenMP threads each, for a total of 2 × 2 × 4 = 16 cores spread across two nodes. Save the script as hybrid.sh and submit it with

sbatch hybrid.sh
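
After submission, sbatch prints the job ID. You can monitor the job with the standard Slurm commands, for example:

squeue -u $USER        # list your queued and running jobs
sacct -j <jobid>       # accounting summary once the job has started or finished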

The output is saved in a file named slurm-####.out. Because all 16 threads (4 MPI processes × 4 OpenMP threads each) write to standard output concurrently, the lines appear in no particular order and fragments from different threads may interleave. Sorted and cleaned up, the output contains one line per thread:

Hello from thread 0 of 4 on MPI process 0 of 4 on node peregrine0
Hello from thread 1 of 4 on MPI process 0 of 4 on node peregrine0
Hello from thread 2 of 4 on MPI process 0 of 4 on node peregrine0
Hello from thread 3 of 4 on MPI process 0 of 4 on node peregrine0
Hello from thread 0 of 4 on MPI process 1 of 4 on node peregrine0
Hello from thread 1 of 4 on MPI process 1 of 4 on node peregrine0
Hello from thread 2 of 4 on MPI process 1 of 4 on node peregrine0
Hello from thread 3 of 4 on MPI process 1 of 4 on node peregrine0
Hello from thread 0 of 4 on MPI process 2 of 4 on node peregrine1
Hello from thread 1 of 4 on MPI process 2 of 4 on node peregrine1
Hello from thread 2 of 4 on MPI process 2 of 4 on node peregrine1
Hello from thread 3 of 4 on MPI process 2 of 4 on node peregrine1
Hello from thread 0 of 4 on MPI process 3 of 4 on node peregrine1
Hello from thread 1 of 4 on MPI process 3 of 4 on node peregrine1
Hello from thread 2 of 4 on MPI process 3 of 4 on node peregrine1
Hello from thread 3 of 4 on MPI process 3 of 4 on node peregrine1
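
The interleaved lines are expected: nothing synchronizes the threads' writes to std::cout. If you prefer intact lines, a minimal variation (a sketch, not the only approach) is to build each message in a std::ostringstream and emit it with a single insertion, so every thread writes its line in one piece:

#include <iostream>
#include <sstream>
#include <mpi.h>
#include <omp.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int world_size, world_rank;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int name_len;
  MPI_Get_processor_name(processor_name, &name_len);

  #pragma omp parallel
  {
    // Assemble the whole line first, then write it with one call,
    // so output from different threads does not get mixed mid-line.
    std::ostringstream msg;
    msg << "Hello from thread " << omp_get_thread_num()
        << " of " << omp_get_num_threads()
        << " on MPI process " << world_rank << " of " << world_size
        << " on node " << processor_name << "\n";
    std::cout << msg.str();
  }

  MPI_Finalize();
  return 0;
}

In practice this usually keeps each line intact, although the C++ standard only guarantees that concurrent writes to cout are free of data races, not that characters from different threads cannot interleave.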