MPI (Multi-Node) Jobs

BlueGem uses the Intel MPI suite (based on MVAPICH2), which is optimised for the Intel InfiniBand interconnect used by all nodes of BlueGem.

To compile an MPI program, ensure that you load the intel-mpi module, e.g. "module add intel-mpi". Then choose your compiler, e.g. "module add gcc" or "module add intel/compiler".
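
For example, to set up a GCC-based MPI build environment you would run:

module add intel-mpi
module add gcc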

Then compile using mpicc, mpicxx, mpif77, etc. as you would normally (note that BlueGem does not have the Intel Fortran compiler, so you will need to use the GNU compilers, i.e. "module add gcc", if you want mpif90 etc.)
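
As a minimal sketch, assuming source files called hello.c, hello.cpp and hello.f90 (placeholder names), compilation looks like:

mpicc  -O2 -o hello_c   hello.c      # C
mpicxx -O2 -o hello_cxx hello.cpp    # C++
mpif90 -O2 -o hello_f90 hello.f90    # Fortran (GNU gfortran on BlueGem)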

To run the MPI job, you can use "mpiexec.hydra". This will automatically work out the number of tasks to run and how they are distributed across the cluster based on your sbatch command.

For example, the following script runs a multi-node Gromacs job:

#!/bin/bash -login

# load the Gromacs module
module add apps/gromacs-4.6.7

# launch mdrun_mpi on the nodes allocated by Slurm
mpiexec.hydra -psm -bootstrap slurm mdrun_mpi -notunepme -dlb yes

The options used are:

  • -psm : This tells mpiexec.hydra to use the InfiniBand network. You MUST include this option, otherwise your job will crash.
  • -notunepme -dlb yes : These are mdrun options. -notunepme stops Gromacs from automatically tuning the PME calculation (the autotuning does not work well on BlueGem and can leave jobs hanging), and -dlb yes turns dynamic load balancing on.
  • -bootstrap slurm : This tells mpiexec.hydra to obtain its node list and launch its processes via Slurm.
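
The task count and placement that mpiexec.hydra picks up come from the resources you request at submission time. As a sketch (the node and task counts are illustrative, and submit.sh is a placeholder name for the script above saved to a file):

# request 2 nodes with 16 MPI tasks on each; mpiexec.hydra will launch
# the resulting 32 tasks on the nodes Slurm allocates
sbatch --nodes=2 --ntasks-per-node=16 submit.sh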

If you want to specify the job topology yourself, you can use hostfiles and the "-n X" option as you would normally with mpiexec.
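
For instance, a sketch of a manual launch (the hostnames, task count and program name are placeholders; check the mpiexec.hydra help output for the exact hostfile option supported by your Intel MPI version):

# hosts.txt contains one BlueGem node name per line, e.g.:
#   node001
#   node002
mpiexec.hydra -psm -machinefile hosts.txt -n 32 ./my_mpi_program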
