Perlmutter @ NERSC
====================

This page only provides HiPACE++-specific instructions.
For more information please visit the `Perlmutter documentation `__.

Log in with ``ssh @perlmutter-p1.nersc.gov``.

Building for GPU
----------------

Create a file ``profile.hipace`` and ``source`` it whenever you log in and want to work with HiPACE++:

.. code-block:: bash

   # please set your project account
   export proj=_g  # _g for GPU accounting

   # required dependencies
   module load cmake/3.24.3
   module load cray-hdf5-parallel/1.12.2.3

   # necessary to use CUDA-Aware MPI and run a job
   export CRAY_ACCEL_TARGET=nvidia80

   # optimize CUDA compilation for A100
   export AMREX_CUDA_ARCH=8.0

   # compiler environment hints
   export CC=cc
   export CXX=CC
   export FC=ftn
   export CUDACXX=$(which nvcc)
   export CUDAHOSTCXX=CC

Download HiPACE++ from GitHub (the first time, and whenever you want the latest version):

.. code-block:: bash

   git clone https://github.com/Hi-PACE/hipace.git $HOME/src/hipace # or any other path you prefer

Compile the code using CMake:

.. code-block:: bash

   source profile.hipace # load the correct modules
   cd $HOME/src/hipace   # or where HiPACE++ is installed
   rm -rf build
   cmake -S . -B build -DHiPACE_COMPUTE=CUDA
   cmake --build build -j 16

You can get familiar with the HiPACE++ input file format in our :doc:`../../run/get_started` section, to prepare an input file that suits your needs.
You can then create a directory in your ``$PSCRATCH``, put your input file there, and adapt the following submission script:

.. code-block:: bash

   #!/bin/bash -l

   #SBATCH -t 01:00:00
   #SBATCH -N 2
   #SBATCH -J HiPACE++
   #    note: must end on _g
   #SBATCH -A _g
   #SBATCH -q regular
   #SBATCH -C gpu
   #SBATCH -c 32
   #SBATCH --exclusive
   #SBATCH --gpu-bind=none
   #SBATCH --gpus-per-node=4
   #SBATCH -o hipace.o%j
   #SBATCH -e hipace.e%j

   # path to executable and input script
   EXE=$HOME/src/hipace/build/bin/hipace
   INPUTS=inputs

   # pin to closest NIC to GPU
   export MPICH_OFI_NIC_POLICY=GPU

   # for GPU-aware MPI use the first line
   #HIPACE_GPU_AWARE_MPI="comms_buffer.on_gpu=1"
   HIPACE_GPU_AWARE_MPI=""

   # CUDA visible devices are ordered inverse to local task IDs
   #   Reference: nvidia-smi topo -m
   srun --cpu-bind=cores bash -c "
       export CUDA_VISIBLE_DEVICES=\$((3-SLURM_LOCALID));
       ${EXE} ${INPUTS} ${HIPACE_GPU_AWARE_MPI}" \
     > output.txt

and use it to submit a simulation.
Note that this example simulation runs on 8 GPUs, since ``-N 2`` requests 2 nodes with 4 GPUs each.

.. tip::

   Parallel simulations can be significantly accelerated by using GPU-aware MPI.
   To use GPU-aware MPI, the input parameter ``comms_buffer.on_gpu = 1`` must be set (see the job script above).
   Note that using GPU-aware MPI may require more GPU memory.
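
The device mapping and the GPU count used in the job script above can be checked without Slurm. The following is a minimal sketch, assuming one MPI rank per GPU (four ranks per node): it loops over the local task IDs that ``srun`` would provide as ``SLURM_LOCALID`` and prints the device each rank would select. The helper ``device_for_local_id`` and the variables ``NODES`` and ``GPUS_PER_NODE`` are hypothetical names introduced for illustration only; they are not part of HiPACE++ or Slurm.

```shell
#!/bin/bash
# Illustration only: mirrors "-N 2" and "--gpus-per-node=4" from the job script.
NODES=2
GPUS_PER_NODE=4

# Hypothetical helper: same arithmetic as
#   export CUDA_VISIBLE_DEVICES=$((3-SLURM_LOCALID))
# with 3 written as GPUS_PER_NODE - 1.
device_for_local_id() {
    echo $(( GPUS_PER_NODE - 1 - $1 ))
}

# Stand-in for the per-rank SLURM_LOCALID values on one node
# (assumes four ranks per node, one per GPU).
for local_id in 0 1 2 3; do
    echo "SLURM_LOCALID=${local_id} -> CUDA_VISIBLE_DEVICES=$(device_for_local_id "${local_id}")"
done

# Total number of GPUs in the allocation
TOTAL_GPUS=$(( NODES * GPUS_PER_NODE ))
echo "total GPUs: ${TOTAL_GPUS}"
```

Under these assumptions, local rank 0 selects device 3 and local rank 3 selects device 0, which is the inverse ordering mentioned in the job script comment.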