# Logging into CSCS machines
Your account will be given to you either by the project lead or one of the project mentors in September.
In contrast to previous years, many of this year's participants already have an account on CSCS machines.
If you have one, we encourage you to use it rather than getting a temporary account; in that case some of the notes below do not apply to you (see the explanations below).
If you do not have an account, you will be given a temporary account with a username of the form hckXX (for some number XX) and a password. The account is valid until Oct. 8, 2018.
In order to gain access to CSCS systems you need to first access our front-end machine Ela which is accessible as ela.cscs.ch. Access Ela by means of ssh as follows:
ssh -Y <username>@ela.cscs.ch
The machine that we will use is called Piz Daint. Piz Daint consists of the main supercomputer and a set of front-end nodes which you would typically access for compilation, batch job submission, file transfers and use of performance tools. The front end nodes are accessible as daint.cscs.ch and you can access these as:
ssh -Y daint
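If you prefer to reach Daint from your laptop in one step, the two hops can be combined with ssh's ProxyJump option. A minimal sketch, assuming OpenSSH 7.3 or newer and the same username on both machines:

```bash
# Jump through ela.cscs.ch and land directly on a Daint front-end node
# (replace <username> with your CSCS or hckXX account name)
ssh -Y -J <username>@ela.cscs.ch <username>@daint.cscs.ch
```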
Having logged in to Daint, you will have a default basic environment and directories in three file systems, which you can access as $HOME, $PROJECT and $SCRATCH. Note that only $SCRATCH is available on the compute nodes; however, it is volatile, and files may be scrubbed if they are inactive for long periods of time. Please use $SCRATCH, which points to your subdirectory in the /scratch/snx1600 file system, and do not use /scratch/snx3000, which will be taken out of service during the hackathon.
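A quick way to check the three locations and set up a working directory on $SCRATCH (the directory name below is only an example):

```bash
# Show where the three file systems point for your account
echo "HOME:    $HOME"
echo "PROJECT: $PROJECT"
echo "SCRATCH: $SCRATCH"

# Work from $SCRATCH, the only file system visible on the compute nodes
cd $SCRATCH
mkdir -p eurohack18     # example directory name
cd eurohack18
```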
CSCS uses the module command to change your programming environment.
If you issue the command module list
you will see your currently loaded modules. If you issue the command module avail
you will see all of the available modules that you can load. To load a module, issue the command module load <modulename>
for the desired <modulename>; for example,
module load daint-gpu
loads the environment specific to the Daint GPU nodes (where we will work all week).
For a short description of what a module provides, use module help <modulename>
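Collected into one short session, the commands above might look like this (daint-gpu is the module we will actually need; the others are shown for orientation):

```bash
module list               # modules currently loaded in your environment
module avail              # all modules that can be loaded
module load daint-gpu     # environment for the Daint GPU nodes
module help daint-gpu     # short description of what the module provides
```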
### Compilation Environment
In order to compile codes you will need to select a programming environment for a specific compiler. The available compilers on Daint are Cray, PGI, Intel and GNU, and these are loaded using the module names shown in the following table.
| Compiler | Module       |
|----------|--------------|
| Cray     | PrgEnv-cray  |
| PGI      | PrgEnv-pgi   |
| Intel    | PrgEnv-intel |
| GNU      | PrgEnv-gnu   |
There are two OpenACC-capable compilers: Cray CCE and NVIDIA PGI (the PrgEnv-cray environment is loaded by default). It is important for the hackathon to use the latest compiler version, which is generally NOT the default. Swap to the latest configuration of any programming environment (cray, gnu, intel, pgi) with
module switch cdt cdt/18.04
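For example, to move from the default Cray environment to GNU and pick up the latest CDT configuration (a sketch; GNU is only one possible choice, and the exact version reported depends on the CDT release):

```bash
module switch PrgEnv-cray PrgEnv-gnu   # select the GNU compilers
module switch cdt cdt/18.04            # use the latest CDT configuration
ftn --version                          # the wrapper should now report the updated GNU Fortran compiler
```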
If you are working with the PGI compiler, you first have to change the environment:
module switch PrgEnv-cray PrgEnv-pgi
By default this will give you the already outdated PGI 17.5 compiler. The latest compiler available to us (18.7) can be loaded with:
module rm cray-libsci_acc
PGIV=18.7
module use /apps/common/UES/pgi/$PGIV/modulefiles
module rm pgi
module load pgi/$PGIV
export PGI_VERS_STR=$PGIV.0
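To check that the intended PGI release is now active (a quick sanity check; -V prints the compiler's version banner):

```bash
pgcc -V        # should report the 18.7 release
```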
Finally, code will only be compiled for GPUs if the following module is loaded:
module load craype-accel-nvidia60
The programming environments on Piz Daint provide convenient compiler wrappers that you can use for compilation; these wrappers ensure that any libraries and header files that you have loaded through the module command are included or linked automatically.
The wrapper commands are:
- ftn for Fortran codes
- CC for C++ codes
- cc for C codes
You just need to use these wrappers and they will take care of adding the include paths and linking the libraries. You may need to load additional modules, and then the wrappers will again take care of adding the correct paths.
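For example, if your code uses a library that is provided as a Cray module, such as HDF5, loading the module is all that is required; the wrappers add the include and library paths themselves (hdf5_example.c is a hypothetical file name):

```bash
module load cray-hdf5                     # makes the HDF5 headers and libraries known to the wrappers
cc -O2 hdf5_example.c -o hdf5_example     # no -I, -L or -l flags needed
```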
You will just need to compile an executable from a single source file as in this example for a Fortran code:
ftn -O2 mpiexercise1.f90 -o myexe1
As mentioned, for GPU compilation and execution you need to load the following module:
module load craype-accel-nvidia60
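With that module loaded, a GPU build of an OpenACC code might look as follows (a sketch: the source file name is hypothetical, and the flags shown are the usual OpenACC switches for each compiler rather than anything specific to the hackathon codes):

```bash
# Cray CCE (PrgEnv-cray): enable OpenACC explicitly
ftn -O2 -h acc laplace_acc.f90 -o laplace_acc

# PGI (PrgEnv-pgi): enable OpenACC and target the P100 GPUs on Daint
ftn -O2 -acc -ta=tesla:cc60 laplace_acc.f90 -o laplace_acc
```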
In order to run your code you will need to get an allocation of processors from the batch system. The mentors will help you generate batch submission scripts. For basic development an interactive session can be started on the internal login nodes of Piz Daint.
When we have been granted a set of processors, we then use the “srun” command to launch jobs on the compute nodes, and the flags that you pass to srun differ depending upon whether you are running MPI or OpenMP parallel applications.
When you have finished your work you should exit the “salloc” session by typing “exit” so that your processors are returned to the pool.
Before the EuroHack, you will have to use the shared debug queue:
salloc --partition=debug --ntasks=16 --time=01:00:00
During the EuroHack we will have a special reservation “hackathon” on the machine which is only available to our hckXX accounts. For development you should use only a few processors, at most 4, say. You should therefore issue the following command, which will give you 4 processors for up to 1 hour:
salloc --res=hackathon --ntasks=4 --time=01:00:00
If you have a CSCS account: we encourage you to use it to avoid losing potentially valuable work at the end of EuroHack18. However, you will not be able to use the 'hackathon' reservation, and must continue to use the 'debug' partition.
You will then have your prompt returned after a message such as the following:
salloc: Granted job allocation XXXX
You are now able to launch your MPI jobs on the compute nodes.
For MPI jobs that are to be launched on the compute nodes you need to use the “-n” flag to specify how many processes you wish to launch. For example, to launch 4 processes of the myexe1 executable you would issue the following command:
srun -n 4 ./myexe1
srun offers the flexibility to run multi-threaded distributed jobs, e.g., using both MPI and OpenMP. A common configuration (but by no means always the most efficient one!) is to run one MPI process per node, with one thread on each of the 12 cores of the Intel Haswell socket. E.g., for 4 processes each with 12 threads, you would use:
salloc --res=hackathon --nodes=4 --time=01:00:00
and you will then be given back your prompt.
To launch an OpenMP job with 12 threads per process, specify the number of threads using the OMP_NUM_THREADS variable and then tell srun that you want 4 processes using the “-n” flag, each with 12 cores using the “-c” (--cpus-per-task) flag, as follows:
export OMP_NUM_THREADS=12
srun -n 4 -c 12 ./ompexe1
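The interactive steps above can also be collected into a job script and submitted with sbatch. A minimal sketch for the hybrid case (job name, output file and executable are placeholders; adjust the reservation/partition line and the gpu constraint to what applies to your account):

```bash
#!/bin/bash -l
#SBATCH --job-name=hybrid_test
#SBATCH --reservation=hackathon     # hckXX accounts only; use --partition=debug with a regular CSCS account
#SBATCH --constraint=gpu            # request the GPU (Haswell + P100) nodes
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=12
#SBATCH --time=01:00:00
#SBATCH --output=hybrid_test.%j.out

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./ompexe1                      # 4 MPI ranks, 12 OpenMP threads each
```

Submit the script with sbatch <scriptname> and check its state with squeue -u $USER.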