Maxwell cluster @ DESY
This page only provides HiPACE++-specific instructions. For more information, please visit the Maxwell documentation.
Create a file profile.hipace, for instance in $HOME, and source it whenever you log in and want to work with
HiPACE++:
#!/usr/bin/env zsh
# Shell is assumed to be zsh
module purge
module load maxwell gcc/12 cuda/12.8 openmpi/4 hdf5/1.10.6
# optimize CUDA compilation for A100
export AMREX_CUDA_ARCH=8.0 # use 7.0 for V100, 8.0 for A100 or 9.0 for H200
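If you build for more than one GPU type, the architecture values listed in the comment above can be kept in one place. The following is only a minimal sketch: the function name `cuda_arch_for` is hypothetical and not part of HiPACE++ or AMReX, and the model names are simply the ones mentioned in the profile.

```shell
# Hypothetical helper (not part of HiPACE++ or AMReX): map a GPU model name
# to the AMREX_CUDA_ARCH value listed in the profile comment.
cuda_arch_for() {
    case "$1" in
        V100) echo 7.0 ;;
        A100) echo 8.0 ;;
        H200) echo 9.0 ;;
        *)    echo "unknown GPU model: $1" >&2; return 1 ;;
    esac
}

# Same setting as in the profile, expressed through the helper.
export AMREX_CUDA_ARCH=$(cuda_arch_for A100)
```

After changing the value, rebuild HiPACE++ from a clean build directory so that the new architecture is picked up.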
Install HiPACE++ (the first time, and whenever you want the latest version):
source profile.hipace
git clone https://github.com/Hi-PACE/hipace.git $HOME/src/hipace # only the first time
cd $HOME/src/hipace
rm -rf build
cmake -S . -B build -DHiPACE_COMPUTE=CUDA -DopenPMD_USE_MPI=OFF
cmake --build build -j 16
You can get familiar with the HiPACE++ input file format in our Get started
section and prepare an input file that suits your needs. You can then create your directory on
DUST, under /data/dust/group/<your group> or /data/dust/user/<your username>,
put your input file there, and adapt the following
submission script:
#! /usr/bin/env zsh
#SBATCH --partition=<partition> # mpa # maxgpu # allgpu
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --constraint=A100&GPUx4 # A100&GPUx1
#SBATCH --job-name=HiPACE
#SBATCH --output=hipace-%j-%N.out
#SBATCH --error=hipace-%j-%N.err
export OMP_NUM_THREADS=1
source $HOME/profile.hipace # or correct path to your profile file
mpiexec -n 4 -npernode 4 $HOME/src/hipace/build/bin/hipace inputs
The -npernode option must be set to the number of GPUs per node, otherwise not all GPUs are used correctly.
There are nodes with 4 GPUs and nodes with 1 GPU (see the Maxwell documentation on compute infrastructure
for more details and the required constraints). Please set the value accordingly.
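As a minimal sketch of how the two mpiexec values relate, the total rank count passed to -n can be derived from the number of nodes and the number of GPUs per node. This assumes one MPI rank per GPU, as in the script above. The variable names `gpus_per_node`, `nodes` and `nranks` are illustrative; `SLURM_JOB_NUM_NODES` is set by Slurm inside a job, and the fallback value of 1 is an assumption for running the snippet outside a job.

```shell
# Illustrative variables (not part of HiPACE++).
gpus_per_node=4                      # 4 on GPUx4 nodes, 1 on GPUx1 nodes
nodes=${SLURM_JOB_NUM_NODES:-1}      # set by Slurm inside a job; 1 assumed otherwise
nranks=$(( nodes * gpus_per_node ))  # assumption: one MPI rank per GPU

# Print the resulting command for inspection; remove 'echo' and the quotes to run it.
echo "mpiexec -n $nranks -npernode $gpus_per_node $HOME/src/hipace/build/bin/hipace inputs"
```

On a single-GPU node, setting `gpus_per_node=1` gives -n 1 -npernode 1, which should be paired with the A100&GPUx1 constraint.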
Tip
If you encounter an error like module: command not found, this can be fixed in most cases by adding the following piece of code before module purge in your profile.hipace file.
module ()
{
eval `modulecmd bash $*`
}
Tip
Parallel simulations can be significantly accelerated by using GPU-aware MPI.
To use GPU-aware MPI, the input parameter comms_buffer.on_gpu = 1 must be set and the following environment variable must be exported in the job script:
export UCX_MEMTYPE_CACHE=n
Note that using GPU-aware MPI may require more GPU memory.
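Because the two settings live in different places (the input file and the job script), it is easy to set one and forget the other. The following is a small sketch of a check that could be added to the job script before the mpiexec line. The function name `has_gpu_comms` is hypothetical, and the input file is assumed to be called inputs as in the submission script above.

```shell
# Hypothetical helper: succeeds if the given input file sets comms_buffer.on_gpu = 1.
has_gpu_comms() {
    grep -Eq '^[[:space:]]*comms_buffer\.on_gpu[[:space:]]*=[[:space:]]*1([[:space:]]|#|$)' "$1" 2>/dev/null
}

export UCX_MEMTYPE_CACHE=n

# Warn (without aborting) if the input file does not enable GPU communication buffers.
if ! has_gpu_comms inputs; then
    echo "warning: comms_buffer.on_gpu = 1 not found in inputs" >&2
fi
```

The check only looks at the literal text of the input file. A parameter set in any other way would not be detected.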