# Logging into CSCS machines
Your account will be given to you either by the project lead or one of the project mentors in September.
In contrast to previous years, many of this year's participants already have an account on CSCS machines.
If you have one, we encourage you to use it rather than getting a temporary account; in that case some of the notes below do not apply to you (see the explanations below).
If you do not have an account, you will be given a temporary account with a username of the form hckXX (for some number XX) and a password. The account is valid until Oct. 8, 2018.
In order to gain access to CSCS systems you need to first access our front-end machine Ela which is accessible as ela.cscs.ch. Access Ela by means of ssh as follows:
ssh -Y <username>@ela.cscs.ch
The machine that we will use is called Piz Daint. Piz Daint consists of the main supercomputer and a set of front-end nodes which you would typically access for compilation, batch job submission, file transfers and use of performance tools. The front end nodes are accessible as daint.cscs.ch and you can access these as:
ssh -Y daint
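If you prefer to reach Daint from your laptop in one step, the two hops can be combined with ssh's ProxyJump option. A minimal sketch, assuming OpenSSH 7.3 or newer and the same username on both machines:

```bash
# Jump through ela.cscs.ch and land directly on a Daint front-end node
# (replace <username> with your CSCS or hckXX account name)
ssh -Y -J <username>@ela.cscs.ch <username>@daint.cscs.ch
```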
Having logged in to Daint, you will have a default basic environment and directories in three file systems, which you can access as $HOME, $PROJECT and $SCRATCH. Note that only $SCRATCH is available on the compute nodes; however, it is volatile, and files may be scrubbed if they are inactive for long periods of time. Please use $SCRATCH, which points to your subdirectory in the /scratch/snx1600 file system, and do not use /scratch/snx3000, which will be taken out of service during the hackathon.
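A quick way to check the three locations and set up a working directory on $SCRATCH (the directory name below is only an example):

```bash
# Show where the three file systems point for your account
echo "HOME:    $HOME"
echo "PROJECT: $PROJECT"
echo "SCRATCH: $SCRATCH"

# Work from $SCRATCH, the only file system visible on the compute nodes
cd $SCRATCH
mkdir -p eurohack18     # example directory name
cd eurohack18
```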
CSCS uses the module command to change your programming environment.
If you issue the command module list
you will see your currently loaded modules. If you issue the command module avail
you will see all of the available modules that you can load. To load a module, issue the command module load <modulename>
for the desired <modulename>; for example,
module load daint-gpu
loads the environment specific to the Daint GPU nodes (where we will work all week).
For a short description of what a module provides, use module help <modulename>
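Collected into one short session, the commands above might look like this (daint-gpu is the module we will actually need; the others are shown for orientation):

```bash
module list               # modules currently loaded in your environment
module avail              # all modules that can be loaded
module load daint-gpu     # environment for the Daint GPU nodes
module help daint-gpu     # short description of what the module provides
```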
### Compilation Environment
In order to compile codes you will need to select a programming environment for a specific compiler. The available compilers on Daint are Cray, PGI, Intel and GNU, and these are loaded using the module names shown in the following table.
| Compiler | Module       |
|----------|--------------|
| Cray     | PrgEnv-cray  |
| PGI      | PrgEnv-pgi   |
| Intel    | PrgEnv-intel |
| GNU      | PrgEnv-gnu   |
There are two OpenACC-capable compilers: Cray CCE and NVIDIA PGI (the PrgEnv-cray environment is loaded by default). It is important for the hackathon to use the latest compiler version, which is generally NOT the default. Swap to the latest configuration of any programming environment (cray, gnu, intel, pgi) with
module switch cdt cdt/18.04
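For example, to move from the default Cray environment to GNU and pick up the latest CDT configuration (a sketch; GNU is only one possible choice, and the exact version reported depends on the CDT release):

```bash
module switch PrgEnv-cray PrgEnv-gnu   # select the GNU compilers
module switch cdt cdt/18.04            # use the latest CDT configuration
ftn --version                          # the wrapper should now report the updated GNU Fortran compiler
```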
If you are working with the PGI compiler, you first have to change the environment:
module switch PrgEnv-cray PrgEnv-pgi
By default this will give you the already outdated PGI 17.5 compiler. The latest compiler available to us (18.7) can be loaded with:
module rm cray-libsci_acc
PGIV=18.7
module use /apps/common/UES/pgi/$PGIV/modulefiles
module rm pgi
module load pgi/$PGIV
export PGI_VERS_STR=$PGIV.0
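To check that the intended PGI release is now active (a quick sanity check; -V prints the compiler's version banner):

```bash
pgcc -V        # should report the 18.7 release
```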
Finally, code will only be compiled for GPUs if the following module is loaded:
module load craype-accel-nvidia60
The programming environments on Piz Daint provide convenient compiler wrappers that you can use for compilation; these wrappers ensure that any libraries and header files that you have loaded through the module command are included or linked automatically.
The wrapper commands are:
- ftn for Fortran codes
- CC for C++ codes
- cc for C codes
You just need to use these wrappers and they will take care of adding the include paths and linking the libraries. You may need to load additional modules, and then the wrappers will again take care of adding the correct paths.
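For example, if your code uses a library that is provided as a Cray module, such as HDF5, loading the module is all that is required; the wrappers add the include and library paths themselves (hdf5_example.c is a hypothetical file name):

```bash
module load cray-hdf5                     # makes the HDF5 headers and libraries known to the wrappers
cc -O2 hdf5_example.c -o hdf5_example     # no -I, -L or -l flags needed
```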
You will just need to compile an executable from a single source file as in this example for a Fortran code:
ftn -O2 mpiexercise1.f90 -o myexe1
As mentioned, for GPU compilation and execution you need to load the following module:
module load craype-accel-nvidia60
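With that module loaded, a GPU build of an OpenACC code might look as follows (a sketch: the source file name is hypothetical, and the flags shown are the usual OpenACC switches for each compiler rather than anything specific to the hackathon codes):

```bash
# Cray CCE (PrgEnv-cray): enable OpenACC explicitly
ftn -O2 -h acc laplace_acc.f90 -o laplace_acc

# PGI (PrgEnv-pgi): enable OpenACC and target the P100 GPUs on Daint
ftn -O2 -acc -ta=tesla:cc60 laplace_acc.f90 -o laplace_acc
```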
In order to run your code you will need to get an allocation of processors from the batch system. The mentors will help you generate batch submission scripts. For basic development an interactive session can be started on the internal login nodes of Piz Daint.
When we have been granted a set of processors, we then use the “srun” command to launch jobs on the compute nodes, and the flags that you pass to srun differ depending upon whether you are running MPI or OpenMP parallel applications.
When you have finished your work you should exit the “salloc” session by typing “exit” so that your processors are returned to the pool.
Before the EuroHack, you will have to use the shared debug queue:
salloc --partition=debug --ntasks=16 --time=01:00:00
During the EuroHack we will have a special reservation “hackathon” on the machine which is only available to our hckXX accounts. For development you should use only a few processors, at most 4, say. You should therefore issue the following command, which will give you 4 processors for up to 1 hour:
salloc --res=hackathon --ntasks=4 --time=01:00:00
If you have a CSCS account: we encourage you to use it to avoid losing potentially valuable work at the end of EuroHack18. However, you will not be able to use the 'hackathon' reservation, and must continue to use the 'debug' partition.
You will then have your prompt returned after a message such as the following:
salloc: Granted job allocation XXXX
You are now able to launch your MPI jobs on the compute nodes.
For MPI jobs that are to be launched on the compute nodes you need to use the “-n” flag to specify how many processes you wish to launch. For example, to launch 4 processes of the myexe1 executable you would issue the following command:
srun -n 4 ./myexe1
srun offers the flexibility to run multi-threaded distributed jobs, e.g., using both MPI and OpenMP. A common configuration (but by no means always the most efficient one!) is to run one MPI process per node, with one thread on each of the 12 cores of the Intel Haswell socket. E.g., for 4 processes each with 12 threads, you would use:
salloc --res=hackathon --nodes=4 --time=01:00:00
and you will then be given back your prompt.
To launch an OpenMP job with 12 threads per process, specify the number of threads using the OMP_NUM_THREADS variable and then tell srun that you want 4 processes using the “-n” flag, each with 12 cores using the “-c” (--cpus-per-task) flag, as follows:
export OMP_NUM_THREADS=12
srun -n 4 -c 12 ./ompexe1
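The interactive steps above can also be collected into a job script and submitted with sbatch. A minimal sketch for the hybrid case (job name, output file and executable are placeholders; adjust the reservation/partition line and the gpu constraint to what applies to your account):

```bash
#!/bin/bash -l
#SBATCH --job-name=hybrid_test
#SBATCH --reservation=hackathon     # hckXX accounts only; use --partition=debug with a regular CSCS account
#SBATCH --constraint=gpu            # request the GPU (Haswell + P100) nodes
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=12
#SBATCH --time=01:00:00
#SBATCH --output=hybrid_test.%j.out

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./ompexe1                      # 4 MPI ranks, 12 OpenMP threads each
```

Submit the script with sbatch <scriptname> and check its state with squeue -u $USER.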