This folder demonstrates cuBLASMp library API usage.
- Linux
- x86_64
- arm64-sbsa
cuBLASMp is distributed through the NVIDIA Developer Zone and as part of the HPC SDK. cuBLASMp requires the CUDA Toolkit, HPC-X, NVSHMEM, NCCL, and GDRCopy to be installed on the system. The samples require a C++11-compatible compiler.
git clone https://github.com/NVIDIA/CUDALibrarySamples.git
cd CUDALibrarySamples/cuBLASMp
mkdir build
cd build
export HPCXROOT=<path/to/hpcx>
export CUBLASMP_HOME=<path/to/cublasmp>
export CAL_HOME=<path/to/libcal>
export NVSHMEM_HOME=<path/to/nvshmem>
source ${HPCXROOT}/hpcx-mt-init-ompi.sh
hpcx_load
cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_CUDA_ARCHITECTURES="70;80;90" \
    -DCUBLASMP_INCLUDE_DIRECTORIES=${CUBLASMP_HOME}/include \
    -DCUBLASMP_LIBRARIES=${CUBLASMP_HOME}/lib/libcublasmp.so \
    -DCAL_INCLUDE_DIRECTORIES=${CAL_HOME}/include \
    -DCAL_LIBRARIES=${CAL_HOME}/lib/libcal.so \
    -DNVSHMEM_INCLUDE_DIRECTORIES=${NVSHMEM_HOME}/include \
    -DNVSHMEM_HOST_LIBRARIES=${NVSHMEM_HOME}/lib/libnvshmem_host.so \
    -DNVSHMEM_DEVICE_LIBRARIES=${NVSHMEM_HOME}/lib/libnvshmem_device.a
make -j
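The build steps above depend on several environment variables pointing at real installation directories, and a wrong path tends to surface only as a CMake or linker error. The following is a minimal sketch of a pre-flight check that could be run before cmake. The helper name check_build_env is hypothetical and not part of the samples; it only tests that each named variable is set and refers to an existing directory.

```shell
# Sketch: report which of the named environment variables are unset or do not
# point at an existing directory. check_build_env is a hypothetical helper,
# not something shipped with the samples.
check_build_env() {
    status=0
    for name in "$@"; do
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value is not a directory"
            status=1
        fi
    done
    return "$status"
}

# Possible usage before invoking cmake:
#   check_build_env HPCXROOT CUBLASMP_HOME CAL_HOME NVSHMEM_HOME || exit 1
```

The check returns a non-zero status if any variable fails, so it can gate the cmake call in a script.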
Run the examples with mpirun, using a number of processes that matches the process grid dimensions, for example:
mpirun -n 2 ./pmatmul
mpirun -n 2 ./pgemm
mpirun -n 2 ./ptrsm
mpirun -n 2 ./psyrk
mpirun -n 2 ./pgeadd
mpirun -n 2 ./ptradd
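The relationship between the process grid and the process count passed to mpirun can be sketched as follows, assuming a two-dimensional grid whose rank count is rows times columns. The helper grid_nprocs and the variables nprow and npcol are illustrative names rather than options defined by the samples, and the grid values are assumed.

```shell
# Sketch: derive the MPI process count from the process grid dimensions.
# grid_nprocs is a hypothetical helper; a 2D grid (rows x columns) is assumed.
grid_nprocs() {
    echo $(( $1 * $2 ))
}

nprow=2   # process grid rows (assumed value)
npcol=1   # process grid columns (assumed value)

# Print the launch command rather than executing it.
echo "mpirun -n $(grid_nprocs "$nprow" "$npcol") ./pgemm"
```

With the assumed 2 x 1 grid this prints the same mpirun -n 2 ./pgemm command shown above.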