From 1c0121d60e0d8c7884744ca7759871104f270f32 Mon Sep 17 00:00:00 2001
From: Caspar van Leeuwen
Date: Tue, 15 Oct 2024 10:22:36 +0200
Subject: [PATCH 01/15] A start to modify docs for eessi_mixin (WIP)

---
 docs/test-suite/writing-portable-tests.md | 225 +++++++++++++++++-------
 1 file changed, 159 insertions(+), 66 deletions(-)

diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md
index d8c0898a1..bebcbc660 100644
--- a/docs/test-suite/writing-portable-tests.md
+++ b/docs/test-suite/writing-portable-tests.md
@@ -178,7 +178,165 @@ This test _works_, but is _not_ very portable. If we move to a system with 192 c

### Step 3: implementing as a portable ReFrame test { #as-portable-reframe-test }

-In the previous section, there were several system-specific items in the test. In this section, we will show how we use the EESSI hooks to avoid hard-coding system specific information. We do this by replacing the system-specific parts of the test from Step 2 bit by bit. The full final test can be found under `tutorials/mpi4py/mpi4py_portable.py` in the [EESSI test suite](https://github.com/EESSI/test-suite/) repository.
+In step 2, there were several system-specific items in the test. In this section, we will show how we use inheritance from the `EESSI_Mixin` class to avoid hard-coding system specific information. The full final test can be found under `tutorials/mpi4py/mpi4py_portable.py` in the [EESSI test suite](https://github.com/EESSI/test-suite/) repository.

#### How EESSI_Mixin works

The `EESSI_Mixin` class provides standardized functionality that should be useful to all tests in the EESSI test-suite. One of it's key functions is to make sure tests dynamically try to determine sensible values for the things that were system specific in Step 2. For example, instead of hard coding a task count, the test inheriting from `EESSI_Mixin` will determine this dynamically based on the amount of available cores per node, and a declaration from the inheriting test class about how you want to instantiated tasks.

For example, if you want to launch one task for each core, the test that inherits from `EESSI_Mixin` would only have to declare

```
compute_unit = COMPUTE_UNIT[CPU]
```

The `EESSI_Mixin` class then takes care of querying the ReFrame config file for the node topology, and setting the correct number of tasks for each node.

Another feature is that it sets defaults for a few items, such as the `valid_prog_environs = ['default']`. These will likely be the same for _most_ tests in the EESSI test suite, and when they _do_ need to be different, one can easily overwrite them in the child class.

Most of the functionality in the `EESSI_Mixin` class requires certain class attributes (such as the `compute_unit` above) to be set by the child class, so that the `EESSI_Mixin` class can use those as input. It is important that these attributes are set _before_ the stage in which the `EESSI_Mixin` class needs them (see the stages of the [ReFrame regression pipeline](https://reframe-hpc.readthedocs.io/en/stable/pipeline.html)). To support test developers, the `EESSI_Mixin` class checks if these attributes are set, and gives verbose feedback in case any attributes are missing.

#### Inheriting from EESSI_Mixin

The first step is to actually inherit from the `EESSI_Mixin` class:

```
from eessi.testsuite.eessi_mixin import EESSI_Mixin
...
@rfm.simple_test
class EESSI_MPI4PY(rfm.RunOnlyRegressionTest, EESSI_Mixin):
```

#### Removing hard-coded test scales

First, we remove

```
    # ReFrame will generate a test for each scale
    scale = parameter([2, 128, 256])
```
from the test. The `EESSI_Mixin` class will define the default set of scales on which this test will be run as
```
from eessi.testsuite.constants import SCALES
...
    scale = parameter(SCALES.keys())
```

This will ensure the test will run at all of the default scales, as defined by the `SCALES` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py).

If, and only if, your test can not run on all of those scales should you overwrite this parameter in your child class. For example, if you have a test that does not support running on multiple nodes, you could define a filtering function outside of the class
```
def filter_scales():
    return [
        k for (k,v) in SCALES.items()
        if v['num_nodes'] == 1
    ]
```
and then in the class body overwrite the scales with a subset of items from the `SCALES` constant:
```
    scale = parameter(filter_scales())
```

Next, we also remove

```
    @run_after('init')
    def define_task_count(self):
        self.num_tasks = self.scale
        self.num_tasks_per_node = min(self.num_tasks, 128)
```

as `num_tasks` and `num_tasks_per_node` will be set by the `assign_tasks_per_compute_unit` [hook](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/hooks.py), which is invoked by the `EESSI_Mixin` class.

We define the following class variables:

```
from eessi.testsuite.utils import find_modules
...
    module_name = parameter(find_modules('mpi4py'))
```

BLA BLA TODO: Explain per attribute that needs to be set how to set it, and before which phase

### Background of the mpi4py test { #background-of-mpi4py-test }
To understand what this test does, you need to know some basics of MPI. If you know about MPI, you can skip this section.

The MPI standard defines how to communicate between multiple processes that work on a common computational task. Each process that is part of the computational task gets a unique identifier (0 to N-1 for N processes), the MPI rank, which can e.g. be used to distribute a workload. The MPI standard defines communication between two given processes (so-called point-to-point communication), but also between a set of N processes (so-called collective communication).

An example of such a collective operation is the [MPI_REDUCE](https://www.mpi-forum.org/docs/mpi-4.1/mpi41-report/node130.htm#Node130) call. It reduces data elements from multiple processes with a certain operation, e.g. it takes the sum of all elements or multiplication of all elements.

#### The mpi4py test
In this example, we will implement a test that does an `MPI_Reduce` on the rank, using the `MPI.SUM` operation. This makes it easy to validate the result, as we know that for N processes, the theoretical sum of all ranks (0, 1, ... N-1) is `(N * (N-1) / 2)`.

Our initial code is a python script `mpi4py_reduce.py`, which can be found in `tutorials/mpi4py/src/mpi4py_reduce.py` in the [EESSI test suite](https://github.com/EESSI/test-suite/) repository:
```python
#!/usr/bin/env python
"""
MPI_Reduce on MPI rank. This should result in a total of (size * (size - 1) / 2),
where size is the total number of ranks.
Prints the total number of ranks, the sum of all ranks, and the time elapsed for the reduction."
+""" + +import argparse +import time + +from mpi4py import MPI + +parser = argparse.ArgumentParser(description='mpi4py reduction benchmark', + formatter_class=argparse.ArgumentDefaultsHelpFormatter) +parser.add_argument('--n_warmup', type=int, default=100, + help='Number of warmup iterations') +parser.add_argument('--n_iter', type=int, default=1000, + help='Number of benchmark iterations') +args = parser.parse_args() + +n_warmup = args.n_warmup +n_iter = args.n_iter + +size = MPI.COMM_WORLD.Get_size() +rank = MPI.COMM_WORLD.Get_rank() +name = MPI.Get_processor_name() + +# Warmup +t0 = time.time() +for i in range(n_warmup): + total = MPI.COMM_WORLD.reduce(rank, op=MPI.SUM) + +# Actual reduction, multiple iterations for accuracy of timing +t1 = time.time() +for i in range(n_iter): + total = MPI.COMM_WORLD.reduce(rank, op=MPI.SUM) +t2 = time.time() +total_time = (t2 - t1) / n_iter + +if rank == 0: + print(f"Total ranks: {size}") + print(f"Sum of all ranks: {total}") # Should be (size * (size-1) / 2) + print(f"Time elapsed: {total_time:.3e}") +``` + +Assuming we have `mpi4py` available, we could run this manually using +``` +$ mpirun -np 4 python3 mpi4py_reduce.py +Total ranks: 4 +Sum of all ranks: 6 +Time elapsed: 3.609e-06 +``` + +This started 4 processes, with ranks 0, 1, 2, 3, and then summed all the ranks (`0+1+2+3=6`) on the process with rank 0, which finally printed all this output. The whole reduction operation is performed `n_iter` times, so that we get a more reproducible timing. + +### Legacy: former step 3: implementing as a portable ReFrame test { #as-portable-reframe-test-legacy } + +The approach using inheritance from the `eessi_mixin` class, described above, is strongly preferred and recommended. There might be certain tests that do not fit the standardized approach of `eessi_mixin`, but usually that will be solvable by overwriting hooks set by `eessi_mixin` in the inheriting class. In the rare case that your test is so exotic that even this doesn't provide a sensible solution, you can still invoke the hooks used by `eessi_mixin` manually. Note that this used to be the default way of writing tests for the EESSI test suite. + +In step 2, there were several system-specific items in the test. In this section, we will show how we use the EESSI hooks to avoid hard-coding system specific information. We do this by replacing the system-specific parts of the test from Step 2 bit by bit. The full final test can be found under `tutorials/mpi4py/mpi4py_portable_legacy.py` in the [EESSI test suite](https://github.com/EESSI/test-suite/) repository. #### Replacing hard-coded test scales (mandatory) @@ -476,69 +634,4 @@ on a system with 192 cores per node. I.e. any test of 2 nodes (384 cores) or abo For example, any pipeline hook attached to the `setup` step making use of `self.num_tasks`, should be defined after the function calling the test-suite hook `assign_tasks_per_compute_unit`. -### Background of the mpi4py test { #background-of-mpi4py-test } -To understand what this test does, you need to know some basics of MPI. If you know about MPI, you can skip this section. - -The MPI standard defines how to communicate between multiple processes that work on a common computational task. Each process that is part of the computational task gets a unique identifier (0 to N-1 for N processes), the MPI rank, which can e.g. be used to distribute a workload. 
The MPI standard defines communication between two given processes (so-called point-to-point communication), but also between a set of N processes (so-called collective communication). - -An example of such a collective operation is the [MPI_REDUCE](https://www.mpi-forum.org/docs/mpi-4.1/mpi41-report/node130.htm#Node130) call. It reduces data elements from multiple processes with a certain operation, e.g. it takes the sum of all elements or multiplication of all elements. - -#### The mpi4py test -In this example, we will implement a test that does an `MPI_Reduce` on the rank, using the `MPI.SUM` operation. This makes it easy to validate the result, as we know that for N processes, the theoretical sum of all ranks (0, 1, ... N-1) is `(N * (N-1) / 2)`. - -Our initial code is a python script `mpi4py_reduce.py`, which can be found in `tutorials/mpi4py/src/mpi4py_reduce.py` in the [EESSI test suite](https://github.com/EESSI/test-suite/) repository: -```python -#!/usr/bin/env python -""" -MPI_Reduce on MPI rank. This should result in a total of (size * (size - 1) / 2), -where size is the total number of ranks. -Prints the total number of ranks, the sum of all ranks, and the time elapsed for the reduction." -""" - -import argparse -import time - -from mpi4py import MPI - -parser = argparse.ArgumentParser(description='mpi4py reduction benchmark', - formatter_class=argparse.ArgumentDefaultsHelpFormatter) -parser.add_argument('--n_warmup', type=int, default=100, - help='Number of warmup iterations') -parser.add_argument('--n_iter', type=int, default=1000, - help='Number of benchmark iterations') -args = parser.parse_args() - -n_warmup = args.n_warmup -n_iter = args.n_iter - -size = MPI.COMM_WORLD.Get_size() -rank = MPI.COMM_WORLD.Get_rank() -name = MPI.Get_processor_name() - -# Warmup -t0 = time.time() -for i in range(n_warmup): - total = MPI.COMM_WORLD.reduce(rank, op=MPI.SUM) - -# Actual reduction, multiple iterations for accuracy of timing -t1 = time.time() -for i in range(n_iter): - total = MPI.COMM_WORLD.reduce(rank, op=MPI.SUM) -t2 = time.time() -total_time = (t2 - t1) / n_iter - -if rank == 0: - print(f"Total ranks: {size}") - print(f"Sum of all ranks: {total}") # Should be (size * (size-1) / 2) - print(f"Time elapsed: {total_time:.3e}") -``` - -Assuming we have `mpi4py` available, we could run this manually using -``` -$ mpirun -np 4 python3 mpi4py_reduce.py -Total ranks: 4 -Sum of all ranks: 6 -Time elapsed: 3.609e-06 -``` -This started 4 processes, with ranks 0, 1, 2, 3, and then summed all the ranks (`0+1+2+3=6`) on the process with rank 0, which finally printed all this output. The whole reduction operation is performed `n_iter` times, so that we get a more reproducible timing. From c35fe6f1ee04ce30a01e6371a069f50846a4b4bc Mon Sep 17 00:00:00 2001 From: Caspar van Leeuwen Date: Mon, 11 Nov 2024 18:10:33 +0100 Subject: [PATCH 02/15] Add description on replacing valid_systems --- docs/test-suite/writing-portable-tests.md | 36 +++++++++++++++++++---- 1 file changed, 30 insertions(+), 6 deletions(-) diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md index bebcbc660..e1df0cfc4 100644 --- a/docs/test-suite/writing-portable-tests.md +++ b/docs/test-suite/writing-portable-tests.md @@ -184,13 +184,13 @@ In step 2, there were several system-specific items in the test. In this section The `EESSI_Mixin` class provides standardized functionality that should be useful to all tests in the EESSI test-suite. 
One of it's key functions is to make sure tests dynamically try to determine sensible values for the things that were system specific in Step 2. For example, instead of hard coding a task count, the test inheriting from `EESSI_Mixin` will determine this dynamically based on the amount of available cores per node, and a declaration from the inheriting test class about how you want to instantiated tasks. -For example, if you want to launch one task for each core, the test that inherits from `EESSI_Mixin` would only have to declare +To illustrate this, suppose want to launch one task for each core for your test. In that case, your test (that inherits from `EESSI_Mixin`) would only have to declare ``` compute_unit = COMPUTE_UNIT[CPU] ``` -The `EESSI_Mixin` class then takes care of querying the ReFrame config file for the node topology, and setting the correct number of tasks for each node. +The `EESSI_Mixin` class then takes care of querying the ReFrame config file for the cpu topology of the node, and setting the correct number of tasks per node. Another feature is that it sets defaults for a few items, such as the `valid_prog_environs = ['default']`. These will likely be the same for _most_ tests in the EESSI test suite, and when they _do_ need to be different, one can easily overwrite them in the child class. @@ -207,6 +207,7 @@ from eessi.testsuite.eessi_mixin import EESSI_Mixin class EESSI_MPI4PY(rfm.RunOnlyRegressionTest, EESSI_Mixin): ``` + #### Removing hard-coded test scales First, we remove @@ -248,12 +249,15 @@ Next, we also remove as `num_tasks` and and `num_tasks_per_node` will be set by the `assign_tasks_per_compute_unit` [hook](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/hooks.py), which is invoked by the `EESSI_Mixin` class. +Instead, we only set the `compute_unit`. A task will be launched for each compute unit. E.g. +``` + compute_unit = COMPUTE_UNIT[CPU] +``` +will launch one task per (physical) CPU core. Other options are `COMPUTE_UNIT[HWTHREAD]` (one task per hardware thread), `COMPUTE_UNIT[NUMA_NODE]` (one task per numa node), `COMPUTE_UNIT[CPU_SOCKET]` (one task per CPU socket), `CPUTE_UNIT[GPU]` (one task per GPU) and `COMPUTE_UNIT[NODE]` (one task per node). Check the `COMPUTE_UNIT` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py) for the full list of valid compute units. The number of cores per task will automatically be set based on this as the ratio of the number of cores in a node to the number of tasks per node (rounded down). - -We define the following class variables: - - +#### Replacing hard-coded module names +Instead of hard-coding a module name, we parameterize over all module names that match a certain regular expression. ``` from eessi.testsuite.utils import find_modules @@ -261,6 +265,26 @@ from eessi.testsuite.utils import find_modules module_name = parameter(find_modules('mpi4py')) ``` +#### Replacing hard-coded system names and programming environments +First, we remove the hard-coded system name and programming environment. I.e. we remove +``` + valid_prog_environs = ['default'] + valid_systems = ['snellius'] +``` +The `EESSI_Mixin` class sets `valid_prog_environs = ['default']` by default, so that is no longer needed in the child class (but it can be overwritten if needed). The `valid_systems` is instead replaced by a declaration of what type of device type is needed. 
We'll create an `mpi4py` test that runs on CPU only: +``` + device_type = DEVICE_TYPES[CPU] +``` +but note if we would have wanted to also generate test instances to test GPU <=> GPU communication, we could have defined this as a paremeter: +``` + device_type = parameter([DEVICE_TYPES[CPU], DEVICE_TYPES[GPU]]) +``` + +The device type that is set will be used by the `filter_valid_systems_by_device_type` hook to check in the ReFrame configuration file which of the current partitions contain the relevant device. Typically, we don't set the `DEVICE_TYPES[CPU]` on a GPU partition in the ReFrame configuration, so that we skip all CPU-only tests on GPU nodes. + +`EESSI_Mixin` also filters based on the supported scales, which can again be configured per partition in the ReFrame configuration file. This can e.g. be used to avoid running large-scale tests on partitions that don't have enough nodes to run them. + + BLA BLA TODO: Explain per attribute that needs to be set how to set it, and before which phase From 5429d78bb13bc5595375bfb5e7b3eb8483d5ad85 Mon Sep 17 00:00:00 2001 From: Caspar van Leeuwen Date: Tue, 12 Nov 2024 16:40:37 +0100 Subject: [PATCH 03/15] First complete draft of writing portable tests with EESSI_Mixin --- docs/test-suite/writing-portable-tests.md | 242 +++++++++++++++++----- 1 file changed, 191 insertions(+), 51 deletions(-) diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md index e1df0cfc4..31ff578a5 100644 --- a/docs/test-suite/writing-portable-tests.md +++ b/docs/test-suite/writing-portable-tests.md @@ -178,7 +178,7 @@ This test _works_, but is _not_ very portable. If we move to a system with 192 c ### Step 3: implementing as a portable ReFrame test { #as-portable-reframe-test } -In step 2, there were several system-specific items in the test. In this section, we will show how we use inheritance from the `EESSI_Mixin` class to avoid hard-coding system specific information. The full final test can be found under `tutorials/mpi4py/mpi4py_portable.py` in the [EESSI test suite](https://github.com/EESSI/test-suite/) repository. +In step 2, there were several system-specific items in the test. In this section, we will show how we use inheritance from the `EESSI_Mixin` class to avoid hard-coding system specific information. The full final test can be found under `tutorials/mpi4py/mpi4py_portable_mixin.py` in the [EESSI test suite](https://github.com/EESSI/test-suite/) repository. #### How EESSI_Mixin works @@ -207,7 +207,6 @@ from eessi.testsuite.eessi_mixin import EESSI_Mixin class EESSI_MPI4PY(rfm.RunOnlyRegressionTest, EESSI_Mixin): ``` - #### Removing hard-coded test scales First, we remove @@ -253,8 +252,10 @@ Instead, we only set the `compute_unit`. A task will be launched for each comput ``` compute_unit = COMPUTE_UNIT[CPU] ``` -will launch one task per (physical) CPU core. Other options are `COMPUTE_UNIT[HWTHREAD]` (one task per hardware thread), `COMPUTE_UNIT[NUMA_NODE]` (one task per numa node), `COMPUTE_UNIT[CPU_SOCKET]` (one task per CPU socket), `CPUTE_UNIT[GPU]` (one task per GPU) and `COMPUTE_UNIT[NODE]` (one task per node). Check the `COMPUTE_UNIT` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py) for the full list of valid compute units. The number of cores per task will automatically be set based on this as the ratio of the number of cores in a node to the number of tasks per node (rounded down). +will launch one task per (physical) CPU core. 
Other options are `COMPUTE_UNIT[HWTHREAD]` (one task per hardware thread), `COMPUTE_UNIT[NUMA_NODE]` (one task per numa node), `COMPUTE_UNIT[CPU_SOCKET]` (one task per CPU socket), `COMPUTE_UNIT[GPU]` (one task per GPU) and `COMPUTE_UNIT[NODE]` (one task per node). Check the `COMPUTE_UNIT` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py) for the full list of valid compute units. The number of cores per task will automatically be set based on this as the ratio of the number of cores in a node to the number of tasks per node (rounded down). The `EESSI_Mixin` class will also set the `OMP_NUM_THREADS` environment variable equal to the number of cores per task.

!!! note
    `compute_unit` needs to be set in ReFrame's `setup` phase (or earlier). For the different phases of the pipeline, please see the [documentation on how ReFrame executes tests](https://reframe-hpc.readthedocs.io/en/stable/pipeline.html).

#### Replacing hard-coded module names
Instead of hard-coding a module name, we parameterize over all module names that match a certain regular expression.

```
from eessi.testsuite.utils import find_modules
...
    module_name = parameter(find_modules('mpi4py'))
```

This parameter will be a list of all module names matching the expression, and each test instance will load the respective module before running the test.

Furthermore, we remove the hook that sets `self.modules`:
```
@run_after('init')
def set_modules(self):
    self.modules = [self.module_name]
```
This is now taken care of by the `EESSI_Mixin` class.

!!! note
    `module_name` needs to be set in ReFrame's `init` phase (or earlier)

#### Replacing hard-coded system names and programming environments
First, we remove the hard-coded system name and programming environment. I.e. we remove
```
    valid_prog_environs = ['default']
    valid_systems = ['snellius']
```
The `EESSI_Mixin` class sets `valid_prog_environs = ['default']` by default, so that is no longer needed in the child class (but it can be overwritten if needed). The `valid_systems` is instead replaced by a declaration of what type of device type is needed. We'll create an `mpi4py` test that runs on CPU only:
```
    device_type = DEVICE_TYPES[CPU]
```
but note if we would have wanted to also generate test instances to test GPU <=> GPU communication, we could have defined this as a parameter:
```
    device_type = parameter([DEVICE_TYPES[CPU], DEVICE_TYPES[GPU]])
```

-The device type that is set will be used by the `filter_valid_systems_by_device_type` hook to check in the ReFrame configuration file which of the current partitions contain the relevant device. Typically, we don't set the `DEVICE_TYPES[CPU]` on a GPU partition in the ReFrame configuration, so that we skip all CPU-only tests on GPU nodes.
+The device type that is set will be used by the `filter_valid_systems_by_device_type` hook to check in the ReFrame configuration file which of the current partitions contain the relevant device. Typically, we don't set the `DEVICE_TYPES[CPU]` on a GPU partition in the ReFrame configuration, so that we skip all CPU-only tests on GPU nodes. Check the `DEVICE_TYPES` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py) for the full list of valid device types.

`EESSI_Mixin` also filters based on the supported scales, which can again be configured per partition in the ReFrame configuration file. This can e.g. be used to avoid running large-scale tests on partitions that don't have enough nodes to run them.

!!! note
    `device_type` needs to be set in ReFrame's `init` phase (or earlier)

#### Requesting sufficient memory
To make sure you get an allocation with sufficient memory, your test should declare how much memory it needs per node by defining a `required_mem_per_node` function in your test class that returns the required memory per node (in MiB).
Note that the amount of required memory generally depends on the amount of tasks that are launched per node (`self.num_tasks_per_node`).

Our `mpi4py` test takes around 200 MB when running with a single task, plus about 70 MB for every additional task. We round this up a little so that we can be sure the test won't run out of memory if memory consumption is slightly different on a different system. Thus, we define:

```
def required_mem_per_node(self):
    return self.num_tasks_per_node * 100 + 250
```

While rounding up is advisable, do keep your estimate realistic. Too high a memory request will mean the test will get skipped on systems that cannot satisfy that memory request. Most HPC systems have at least 1 GB per core, and most laptop/desktops have at least 8 GB total. Desiging a test so that it fits within those memory constraints will ensure it can be run almost anywhere.

!!! note
    The easiest way to get the memory consumption of your test at various task counts is to execute it on a system which runs jobs in [cgroups](https://en.wikipedia.org/wiki/Cgroups), define `measure_memory_usage = True` in your class body, and make the `required_mem_per_node` function return a constant amount of memory equal to the available memory per node on your test system. This will cause the `EESSI_Mixin` class to read out the maximum memory usage of the cgroup (on the head node of your allocation, in case of multi-node tests) and report it as a performance number.

#### Process binding
The `EESSI_Mixin` class binds processes to their respective number of cores automatically using the `hooks.set_compact_process_binding` hook. E.g. for a pure MPI test like `mpi4py`, each task will be bound to a single core. For hybrid tests that do both multiprocessing and multithreading, tasks are bound to a sequential number of cores. E.g. on a node with 128 cores and a hybrid test with 64 tasks and 2 threads per task, the first task will be bound to core 0 and 1, second task to core 2 and 3, etc. To override this behaviour, one would have to overwrite the
```
@run_after('setup')
def assign_tasks_per_compute_unit(self):
    ...
```
function. Note that this also calls other hooks (such as `hooks.assign_tasks_per_compute_unit`) that you probably still want to invoke. Check the `EESSI_Mixin` [class definition](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/eessi_mixin.py) to see which hooks you still want to call.

#### Thread binding (optional)
Thread binding is not done by default, but can be done by invoking the `hooks.set_compact_thread_binding` hook:
```
@run_after('setup')
def set_binding(self):
    hooks.set_compact_thread_binding(self)
```

#### Skipping test instances when required (optional)
Preferably, we prevent test instances from being generated (i.e. before ReFrame's `setup` phase) if we know that they cannot run on a certain system. However, sometimes we need information on the nodes that will run it, which is only available _after_ the `setup` phase. That is the case for anything where we need information from e.g. the [reframe.core.pipeline.RegressionTest.current_partition](https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.current_partition).

For example, we might know that a test only scales to around 300 tasks, and above that, execution time increases rapidly.
In that case, we'd want to skip any test instance that results in a larger number of tasks, but we only know this after `assign_tasks_per_compute_unit` has been called (which is done by `EESSI_Mixin` after the `setup` stage). For example, the `2_nodes` scale would run fine on systems with 128 cores per node, but would exceed the task limit of 300 on systems with `192` cores per node.

We can skip any generated test cases using the `skip_if` function. For example, to skip the test if the total task count exceeds 300, we'd need to call `skip_if` _after_ the `setup` stage (so that `self.num_tasks` is already set):

```python
@run_after('setup')
def skip_test_if_too_many_tasks(self):
    # The method name is arbitrary; num_tasks has already been set by EESSI_Mixin at this point
    max_tasks = 300
    self.skip_if(self.num_tasks > max_tasks,
                 f'Skipping test: more than {max_tasks} tasks are requested ({self.num_tasks})')
```

The `mpi4py` test scales almost indefinitely, but if we were to set it for the sake of this example, one would see:
```bash
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=16_nodes /38aea144 @snellius:genoa+default
[     SKIP ] ( 1/13) Skipping test: more than 300 tasks are requested (3072)
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=8_nodes /bfc4d3d4 @snellius:genoa+default
[     SKIP ] ( 2/13) Skipping test: more than 300 tasks are requested (1536)
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=4_nodes /8de369bc @snellius:genoa+default
[     SKIP ] ( 3/13) Skipping test: more than 300 tasks are requested (768)
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=2_nodes /364146ba @snellius:genoa+default
[     SKIP ] ( 4/13) Skipping test: more than 300 tasks are requested (384)
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1_node /8225edb3 @snellius:genoa+default
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1_2_node /4acf483a @snellius:genoa+default
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1_4_node /fc3d689b @snellius:genoa+default
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1_8_node /73046a73 @snellius:genoa+default
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1cpn_4nodes /f08712a2 @snellius:genoa+default
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1cpn_2nodes /23cd550b @snellius:genoa+default
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=4_cores /bb8e1349 @snellius:genoa+default
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=2_cores /4c0c7c9e @snellius:genoa+default
[ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1_core /aa83ba9e @snellius:genoa+default

...
```
on a system with 192 cores per node. I.e. any test of 2 nodes (384 cores) or above would be skipped because it exceeds our max task count.
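The same `skip_if` pattern works for any node information that only becomes known in the `setup` phase, such as properties of the `self.current_partition` mentioned above. As a minimal sketch (the hook name and the condition are made up for illustration, and it assumes the partition's `processor` topology is described in the ReFrame configuration file), one could skip test instances on partitions where SMT (hyperthreading) is enabled:

```python
@run_after('setup')
def skip_smt_partitions(self):
    # self.current_partition is only defined from the setup phase onwards
    proc = self.current_partition.processor
    self.skip_if(proc.num_cpus_per_core is not None and proc.num_cpus_per_core > 1,
                 'Skipping test: this example assumes one hardware thread per core')
```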

#### Summary
To make the test portable, we added additional imports:
```
from eessi.testsuite.eessi_mixin import EESSI_Mixin
from eessi.testsuite.constants import COMPUTE_UNIT, DEVICE_TYPES, CPU
from eessi.testsuite.utils import find_modules
```

Made sure the test inherits from `EESSI_Mixin`:
```
@rfm.simple_test
class EESSI_MPI4PY(rfm.RunOnlyRegressionTest, EESSI_Mixin):
```

Removed the following from the class body:
```
valid_prog_environs = ['default']
valid_systems = ['snellius']

module_name = parameter(['mpi4py/3.1.4-gompi-2023a', 'mpi4py/3.1.5-gompi-2023b'])
scale = parameter([2, 128, 256])
```

Added the following to the class body:
```
device_type = DEVICE_TYPES[CPU]
compute_unit = COMPUTE_UNIT[CPU]

module_name = parameter(find_modules('mpi4py'))
```

Defined the class method:
```
def required_mem_per_node(self):
    return self.num_tasks_per_node * 100 + 250
```

Removed the ReFrame pipeline hook that set the `self.modules`:
```
@run_after('init')
def set_modules(self):
    self.modules = [self.module_name]
```

Removed the ReFrame pipeline hook that set the number of tasks and number of tasks per node:
```
@run_after('init')
def define_task_count(self):
    # Set the number of tasks, self.scale is now a single number out of the parameter list
    # https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.num_tasks
    self.num_tasks = self.scale
    # Set the number of tasks per node to either be equal to the number of tasks, but at most 128,
    # since we have 128-core nodes
    # https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.num_tasks_per_node
    self.num_tasks_per_node = min(self.num_tasks, 128)
```

The final test is thus:
```
"""
This module tests mpi4py's MPI_Reduce call
"""

import reframe as rfm
import reframe.utility.sanity as sn

from reframe.core.builtins import variable, parameter, run_after, performance_function, sanity_function

from eessi.testsuite.eessi_mixin import EESSI_Mixin
from eessi.testsuite.constants import COMPUTE_UNIT, DEVICE_TYPES, CPU
from eessi.testsuite.utils import find_modules

@rfm.simple_test
class EESSI_MPI4PY(rfm.RunOnlyRegressionTest, EESSI_Mixin):
    device_type = DEVICE_TYPES[CPU]
    compute_unit = COMPUTE_UNIT[CPU]

    module_name = parameter(find_modules('mpi4py'))

    n_iterations = variable(int, value=1000)
    n_warmup = variable(int, value=100)

    executable = 'python3'
    executable_opts = ['mpi4py_reduce.py', '--n_iter', f'{n_iterations}', '--n_warmup', f'{n_warmup}']

    time_limit = '5m00s'

    def required_mem_per_node(self):
        return self.num_tasks_per_node * 100 + 250

    @sanity_function
    def validate(self):
        sum_of_ranks = round(self.num_tasks * ((self.num_tasks - 1) / 2))
        return sn.assert_found(r'Sum of all ranks: %s' % sum_of_ranks, self.stdout)

    @performance_function('s')
    def time(self):
        return sn.extractsingle(r'^Time elapsed:\s+(?P<perf>\S+)', self.stdout, 'perf', float)

```

### Background of the mpi4py test { #background-of-mpi4py-test }
To understand what this test does, you need to know some basics of MPI. If you know about MPI, you can skip this section.

@@ -611,51 +797,5 @@ The use of this hook is optional but recommended in most cases.
Note that thread The `set_omp_num_threads` hook sets the `$OMP_NUM_THREADS` environment variable based on the number of `cpus_per_task` defined in the ReFrame test (which in turn is typically set by the `assign_tasks_per_compute_unit` hook). For OpenMP codes, it is generally recommended to call this hook, to ensure they launch the correct amount of threads. -#### Skipping tests instances when required (optional) -Preferably, we prevent test instances from being generated (i.e. before ReFrame's `setup` phase) if we know that they cannot run on a certain system. However, sometimes we need information on the nodes that will run it, which is only available _after_ the `setup` phase. That is the case for anything where we need information from e.g. the [reframe.core.pipeline.RegressionTest.current_partition](https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.current_partition). The `assign_tasks_per_compute_unit` hook for example uses this property to get the core count of a node, and thus needs to be executed after the `setup` phase. - -For example, we might know that a test only scales to around 300 tasks, and above that, execution time increases rapidly. In that case, we'd want to skip any test instance that results in a larger amount of tasks, but we only know this after `assign_tasks_per_compute_unit` has been called. For example, the `2_nodes` scale would run fine on systems with 128 cores per node, but would exceed the task limit of 300 on systems with `192` cores per node. - -We can skip any generated test cases using the `skip_if` function. For example, to skip the test if the total task count exceeds 300, we'd need to call `skip_if` _after_ the task count has been set by `assign_tasks_per_compute_unit`: - -```python -@run_after('setup') - hooks.assign_tasks_per_compute_unit(test=self, compute_unit=COMPUTE_UNIT[CPU]) - - max_tasks = 300 - self.skip_if(self.num_tasks > max_tasks, - f'Skipping test: more than {max_tasks} tasks are requested ({self.num_tasks})') -``` - -The `mpi4py` scales almost indefinitely, but if we were to set it for the sake of this example, one would see: -```bash -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=16_nodes /38aea144 @snellius:genoa+default -[ SKIP ] ( 1/13) Skipping test: more than 300 tasks are requested (3072) -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=8_nodes /bfc4d3d4 @snellius:genoa+default -[ SKIP ] ( 2/13) Skipping test: more than 300 tasks are requested (1536) -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=4_nodes /8de369bc @snellius:genoa+default -[ SKIP ] ( 3/13) Skipping test: more than 300 tasks are requested (768) -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=2_nodes /364146ba @snellius:genoa+default -[ SKIP ] ( 4/13) Skipping test: more than 300 tasks are requested (384) -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1_node /8225edb3 @snellius:genoa+default -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1_2_node /4acf483a @snellius:genoa+default -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1_4_node /fc3d689b @snellius:genoa+default -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1_8_node /73046a73 @snellius:genoa+default -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1cpn_4nodes /f08712a2 @snellius:genoa+default -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1cpn_2nodes /23cd550b 
@snellius:genoa+default -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=4_cores /bb8e1349 @snellius:genoa+default -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=2_cores /4c0c7c9e @snellius:genoa+default -[ RUN ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=1_core /aa83ba9e @snellius:genoa+default - -... -``` -on a system with 192 cores per node. I.e. any test of 2 nodes (384 cores) or above would be skipped because it exceeds our max task count. - -!!! note - The order in which [ReFrame pipeline hooks](https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#pipeline-hooks) (methods decorated with `run_after` or `run_before`) are called is the same in which they are attached/defined. - - This is important in case we want to call hooks for the same stage (`init`/`setup`/...) in different functions (for cleanliness of the code or any other reason). - - For example, any pipeline hook attached to the `setup` step making use of `self.num_tasks`, should be defined after the function calling the test-suite hook `assign_tasks_per_compute_unit`. From 83de8648e1f90036e3f4df1685948aeab4c719d3 Mon Sep 17 00:00:00 2001 From: Caspar van Leeuwen Date: Tue, 12 Nov 2024 16:51:05 +0100 Subject: [PATCH 04/15] Fix typos --- docs/test-suite/writing-portable-tests.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md index 31ff578a5..c1fc0addd 100644 --- a/docs/test-suite/writing-portable-tests.md +++ b/docs/test-suite/writing-portable-tests.md @@ -289,7 +289,7 @@ The `EESSI_Mixin` class sets `valid_prog_environs = ['default']` by default, so ``` device_type = DEVICE_TYPES[CPU] ``` -but note if we would have wanted to also generate test instances to test GPU <=> GPU communication, we could have defined this as a paremeter: +but note if we would have wanted to also generate test instances to test GPU <=> GPU communication, we could have defined this as a parameter: ``` device_type = parameter([DEVICE_TYPES[CPU], DEVICE_TYPES[GPU]]) ``` @@ -311,7 +311,7 @@ def required_mem_per_node(self): return self.num_tasks_per_node * 100 + 250 ``` -While rounding up is advisable, do keep your estimate realistic. Too high a memory request will mean the test will get skipped on systems that cannot satisfy that memory request. Most HPC systems have at least 1 GB per core, and most laptop/desktops have at least 8 GB total. Desiging a test so that it fits within those memory constraints will ensure it can be run almost anywhere. +While rounding up is advisable, do keep your estimate realistic. Too high a memory request will mean the test will get skipped on systems that cannot satisfy that memory request. Most HPC systems have at least 1 GB per core, and most laptop/desktops have at least 8 GB total. Designing a test so that it fits within those memory constraints will ensure it can be run almost anywhere. !!! note The easiest way to get the memory consumption of your test at various task counts is to execute it on a system which runs jobs in [cgroups](https://en.wikipedia.org/wiki/Cgroups), define `measure_memory_usage = True` in your class body, and make the `required_mem_per_node` function return a constant amount of memory equal to the available memory per node on your test system. 
This will cause the `EESSI_Mixin` class to read out the maximum memory usage of the cgroup (on the head node of your allocation, in case of multi-node tests) and report it as a performance number. From cb98495407048660fb4947838f99b08063227f62 Mon Sep 17 00:00:00 2001 From: Caspar van Leeuwen Date: Tue, 12 Nov 2024 16:54:23 +0100 Subject: [PATCH 05/15] Reword a bit the old documentation --- docs/test-suite/writing-portable-tests.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md index c1fc0addd..2f73ce7f3 100644 --- a/docs/test-suite/writing-portable-tests.md +++ b/docs/test-suite/writing-portable-tests.md @@ -542,9 +542,9 @@ Time elapsed: 3.609e-06 This started 4 processes, with ranks 0, 1, 2, 3, and then summed all the ranks (`0+1+2+3=6`) on the process with rank 0, which finally printed all this output. The whole reduction operation is performed `n_iter` times, so that we get a more reproducible timing. -### Legacy: former step 3: implementing as a portable ReFrame test { #as-portable-reframe-test-legacy } +### Step 3: implementing as a portable ReFrame test without using EESSI_Mixin { #as-portable-reframe-test-legacy } -The approach using inheritance from the `eessi_mixin` class, described above, is strongly preferred and recommended. There might be certain tests that do not fit the standardized approach of `eessi_mixin`, but usually that will be solvable by overwriting hooks set by `eessi_mixin` in the inheriting class. In the rare case that your test is so exotic that even this doesn't provide a sensible solution, you can still invoke the hooks used by `eessi_mixin` manually. Note that this used to be the default way of writing tests for the EESSI test suite. +The approach using inheritance from the `EESS_Mixin` class, described above, is strongly preferred and recommended. There might be certain tests that do not fit the standardized approach of `EESSI_Mixin`, but usually that will be solvable by overwriting hooks set by `EESSI_Mixin` in the inheriting class. In the rare case that your test is so exotic that even this doesn't provide a sensible solution, you can still invoke the hooks used by `EESSI_Mixin` manually. Note that this used to be the default way of writing tests for the EESSI test suite. In step 2, there were several system-specific items in the test. In this section, we will show how we use the EESSI hooks to avoid hard-coding system specific information. We do this by replacing the system-specific parts of the test from Step 2 bit by bit. The full final test can be found under `tutorials/mpi4py/mpi4py_portable_legacy.py` in the [EESSI test suite](https://github.com/EESSI/test-suite/) repository. From 9f2331db5cda1c4b45318f9ed5279d2b4275f3c5 Mon Sep 17 00:00:00 2001 From: Caspar van Leeuwen Date: Tue, 12 Nov 2024 17:05:02 +0100 Subject: [PATCH 06/15] Final comment in the summary --- docs/test-suite/writing-portable-tests.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md index 2f73ce7f3..b5ebeea77 100644 --- a/docs/test-suite/writing-portable-tests.md +++ b/docs/test-suite/writing-portable-tests.md @@ -14,7 +14,7 @@ In this tutorial, you will learn how to write a test for the [EESSI test suite]( The test suite contains a combination of real-life use cases for end-user scientific software (e.g. 
tests for GROMACS, TensorFlow, CP2K, OpenFOAM, etc) and low level tests (e.g. OSU Microbenchmarks). -The tests in the EESSI test suite are developed using the [ReFrame HPC testing framework](https://reframe-hpc.readthedocs.io/en/stable/). Typically, ReFrame tests hardcode system specific information (core counts, performance references, etc) in the test definition. The EESSI test suite aims to be portable, and implements a series of standard [hooks](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/hooks.py) to replace information that is typically hardcoded. All system-specific information is then limited to the ReFrame configuration file. As an example: rather than hardcoding that a test should run with 128 tasks (i.e. because a system has 128 core nodes), the EESSI test suite has a hook that can define a test should be run on a "single, full node". The hook queries the ReFrame configuration file for the amount of cores per node, and specifies this number as the corresponding amount of tasks. Thus, on a 64-core node, this test would run with 64 tasks, while on a 128-core node, it would run 128 tasks. +The tests in the EESSI test suite are developed using the [ReFrame HPC testing framework](https://reframe-hpc.readthedocs.io/en/stable/). Typically, ReFrame tests hardcode system specific information (core counts, performance references, etc) in the test definition. The EESSI test suite aims to be portable, and implements a [mixin class](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/eessi_mixin.py) that invokes a series of standard [hooks](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/hooks.py) to replace information that is typically hardcoded. All system-specific information is then limited to the ReFrame configuration file. As an example: rather than hardcoding that a test should run with 128 tasks (i.e. because a system has 128 core nodes), the EESSI test suite has a hook that can define a test should be run on a "single, full node". The hook queries the ReFrame configuration file for the amount of cores per node, and specifies this number as the corresponding amount of tasks. Thus, on a 64-core node, this test would run with 64 tasks, while on a 128-core node, it would run 128 tasks. ## Test requirements @@ -472,9 +472,10 @@ class EESSI_MPI4PY(rfm.RunOnlyRegressionTest, EESSI_Mixin): @performance_function('s') def time(self): return sn.extractsingle(r'^Time elapsed:\s+(?P\S+)', self.stdout, 'perf', float) - ``` +Note that with only 34 lines of code, this is now very quick and easy to write, because of the default behaviour from the `EESSI_Mixin` class. + ### Background of the mpi4py test { #background-of-mpi4py-test } To understand what this test does, you need to know some basics of MPI. If you know about MPI, you can skip this section. 
From 6a927c37f262d6f0554829fd87f6e538d60b3d72 Mon Sep 17 00:00:00 2001
From: Caspar van Leeuwen
Date: Tue, 12 Nov 2024 17:39:15 +0100
Subject: [PATCH 07/15] Added section on picking the time_limit

---
 docs/test-suite/writing-portable-tests.md | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md
index b5ebeea77..3c7c232b6 100644
--- a/docs/test-suite/writing-portable-tests.md
+++ b/docs/test-suite/writing-portable-tests.md
@@ -334,7 +334,7 @@ def set_binding(self):
    hooks.set_compact_thread_binding(self)
 ```

-#### Skipping test instances when required (optional)
+#### Skipping test instances when required (optional) [#skipping-test-instances]
Preferably, we prevent test instances from being generated (i.e. before ReFrame's `setup` phase) if we know that they cannot run on a certain system. However, sometimes we need information on the nodes that will run it, which is only available _after_ the `setup` phase. That is the case for anything where we need information from e.g. the [reframe.core.pipeline.RegressionTest.current_partition](https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.current_partition).

@@ -374,6 +374,27 @@ The `mpi4py` scales almost indefinitely, but if we were to set it for the sake o
 ```
on a system with 192 cores per node. I.e. any test of 2 nodes (384 cores) or above would be skipped because it exceeds our max task count.

#### Setting a time limit (optional)
By default, the `EESSI_Mixin` class sets a time limit for jobs of 1 hour. You can overwrite this in your child class:
```
time_limit = '5m00s'
```
For the appropriate string formatting, please check the [ReFrame documentation on time_limit](https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.time_limit). We already had this in the non-portable version of our `mpi4py` test and will keep it in the portable version: since this is a very quick test, specifying a lower time limit will help in getting the jobs scheduled more quickly.

Note that for the test to be portable, the time limit should be set such that it is sufficient _regardless of node architecture_. It is pretty hard to guarantee this with a single, fixed time limit for all test scales, without knowing upfront what architecture the test will be run on, and thus how many tasks will be launched. For strong scaling tests, you might want a higher time limit for low task counts, whereas for weak scaling tests you might want a higher time limit for higher task counts. You can consider setting the time limit after setup, and making it dependend on the task count.

Suppose we have a weak scaling test that takes 5 minutes for a single task, and 60 minutes for 10k tasks.
We can set a time limit based on linear interpolation between those task counts:
```
import math
...

@run_after('setup')
def set_time_limit(self):
    # linearly interpolate between the single and 10k task count,
    # rounding up so that the duration string contains a whole number of minutes
    minutes = math.ceil(5 + self.num_tasks * ((60-5) / 10000))
    self.time_limit = '%sm00s' % minutes
```
Note that this is typically an overestimate of how long the test will take for intermediate task counts, but that's ok: we'd rather overestimate than underestimate the runtime.

To be even safer, one could consider combining this with logic to [skip tests](#skipping-test-instances) if the 10k task count is exceeded.

From 230428bc8bc712ca1a81ae72397f717d5b9d8945 Mon Sep 17 00:00:00 2001
From: Caspar van Leeuwen
Date: Tue, 12 Nov 2024 17:40:01 +0100
Subject: [PATCH 08/15] Use correct brackets for anchor

---
 docs/test-suite/writing-portable-tests.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md
index 3c7c232b6..9ad32ce06 100644
--- a/docs/test-suite/writing-portable-tests.md
+++ b/docs/test-suite/writing-portable-tests.md
@@ -334,7 +334,7 @@ def set_binding(self):
    hooks.set_compact_thread_binding(self)
 ```

-#### Skipping test instances when required (optional) [#skipping-test-instances]
+#### Skipping test instances when required (optional) { #skipping-test-instances }
Preferably, we prevent test instances from being generated (i.e. before ReFrame's `setup` phase) if we know that they cannot run on a certain system. However, sometimes we need information on the nodes that will run it, which is only available _after_ the `setup` phase. That is the case for anything where we need information from e.g. the [reframe.core.pipeline.RegressionTest.current_partition](https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.current_partition).

From db0270a5198458219c479028d37ccd29381edb31 Mon Sep 17 00:00:00 2001
From: Caspar van Leeuwen
Date: Tue, 12 Nov 2024 17:40:42 +0100
Subject: [PATCH 09/15] Fixed typo

---
 docs/test-suite/writing-portable-tests.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md
index 70c8a659a..bcdcece1b 100644
--- a/docs/test-suite/writing-portable-tests.md
+++ b/docs/test-suite/writing-portable-tests.md
@@ -566,7 +566,7 @@ This started 4 processes, with ranks 0, 1, 2, 3, and then summed all the ranks (

### Step 3: implementing as a portable ReFrame test without using EESSI_Mixin { #as-portable-reframe-test-legacy }

-The approach using inheritance from the `EESS_Mixin` class, described above, is strongly preferred and recommended. There might be certain tests that do not fit the standardized approach of `EESSI_Mixin`, but usually that will be solvable by overwriting hooks set by `EESSI_Mixin` in the inheriting class.
In the rare case that your test is so exotic that even this doesn't provide a sensible solution, you can still invoke the hooks used by `EESSI_Mixin` manually. Note that this used to be the default way of writing tests for the EESSI test suite. +The approach using inheritance from the `EESSI_Mixin` class, described above, is strongly preferred and recommended. There might be certain tests that do not fit the standardized approach of `EESSI_Mixin`, but usually that will be solvable by overwriting hooks set by `EESSI_Mixin` in the inheriting class. In the rare case that your test is so exotic that even this doesn't provide a sensible solution, you can still invoke the hooks used by `EESSI_Mixin` manually. Note that this used to be the default way of writing tests for the EESSI test suite. In step 2, there were several system-specific items in the test. In this section, we will show how we use the EESSI hooks to avoid hard-coding system specific information. We do this by replacing the system-specific parts of the test from Step 2 bit by bit. The full final test can be found under `tutorials/mpi4py/mpi4py_portable_legacy.py` in the [EESSI test suite](https://github.com/EESSI/test-suite/) repository. From 5bcbc343ceb8e1a3d7c53f95e4c2cf0849970480 Mon Sep 17 00:00:00 2001 From: Caspar van Leeuwen Date: Tue, 12 Nov 2024 17:42:12 +0100 Subject: [PATCH 10/15] Fix spell error --- docs/test-suite/writing-portable-tests.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md index 70c8a659a..bcdcece1b 100644 --- a/docs/test-suite/writing-portable-tests.md +++ b/docs/test-suite/writing-portable-tests.md @@ -381,7 +381,7 @@ time_limit = '5m00s' ``` For the appropriate string formatting, please check the [ReFrame documentation on time_limit](https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.time_limit). We already had this in the non-portable version of our `mpi4py` test and will keep it in the portable version: since this is a very quick test, specifying a lower time limit will help in getting the jobs scheduled more quickly. -Note that for the test to be portable, the time limit should be set such that it is sufficient _regardless of node architecture_. It is pretty hard to guarantee this with a single, fixed time limit for all test scales, without knowing upfront what architecture the test will be run on, and thus how many tasks will be launched. For strong scaling tests, you might want a higher time limit for low task counts, whereas for weak scaling tests you might want a higher time limit for higher task counts. You can consider setting the time limit after setup, and making it dependend on the task count. +Note that for the test to be portable, the time limit should be set such that it is sufficient _regardless of node architecture_. It is pretty hard to guarantee this with a single, fixed time limit for all test scales, without knowing upfront what architecture the test will be run on, and thus how many tasks will be launched. For strong scaling tests, you might want a higher time limit for low task counts, whereas for weak scaling tests you might want a higher time limit for higher task counts. You can consider setting the time limit after setup, and making it dependent on the task count. Suppose we have a weak scaling test that takes 5 minutes for a single task, and 60 minutes for 10k tasks. 
We can set a time limit based on linear interpolation between those task counts: ``` From d70c9f01f06d1c1f9d6ae5fc7f1dbab09ca6c940 Mon Sep 17 00:00:00 2001 From: Caspar van Leeuwen <33718780+casparvl@users.noreply.github.com> Date: Tue, 19 Nov 2024 12:41:20 +0100 Subject: [PATCH 11/15] Apply suggestions from code review Implemented many good textual improvements from the code review. Co-authored-by: Sam Moors --- docs/test-suite/writing-portable-tests.md | 42 +++++++++++------------ 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md index bcdcece1b..8e2b5e771 100644 --- a/docs/test-suite/writing-portable-tests.md +++ b/docs/test-suite/writing-portable-tests.md @@ -182,11 +182,11 @@ In step 2, there were several system-specific items in the test. In this section #### How EESSI_Mixin works -The `EESSI_Mixin` class provides standardized functionality that should be useful to all tests in the EESSI test-suite. One of it's key functions is to make sure tests dynamically try to determine sensible values for the things that were system specific in Step 2. For example, instead of hard coding a task count, the test inheriting from `EESSI_Mixin` will determine this dynamically based on the amount of available cores per node, and a declaration from the inheriting test class about how you want to instantiated tasks. +The `EESSI_Mixin` class provides standardized functionality that should be useful to all tests in the EESSI test-suite. One of its key functions is to make sure tests dynamically try to determine sensible values for the things that were system specific in Step 2. For example, instead of hard coding a task count, the test inheriting from `EESSI_Mixin` will determine this dynamically based on the amount of available cores per node, and a declaration from the inheriting test class about how you want to instantiate tasks. -To illustrate this, suppose want to launch one task for each core for your test. In that case, your test (that inherits from `EESSI_Mixin`) would only have to declare +To illustrate this, suppose you want to launch your test with one task per CPU core. In that case, your test (that inherits from `EESSI_Mixin`) only has to declare -``` +```python compute_unit = COMPUTE_UNIT[CPU] ``` @@ -211,7 +211,7 @@ class EESSI_MPI4PY(rfm.RunOnlyRegressionTest, EESSI_Mixin): First, we remove -``` +```python # ReFrame will generate a test for each scale scale = parameter([2, 128, 256]) ``` @@ -222,7 +222,7 @@ from eessi.testsuite.constants import SCALES scale = parameter(SCALES.keys()) ``` -This will ensure the test will run at all of the default scales, as defined by the `SCALES` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py). +This ensures the test will run a test case for each of the default scales, as defined by the `SCALES` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py). If, and only if, your test can not run on all of those scales should you overwrite this parameter in your child class. 
For example, if you have a test that does not support running on multiple nodes, you could define a filtering function outside of the class
 ```
 def filter_scales():
     return [
         k for (k,v) in SCALES.items()
         if v['num_nodes'] == 1
     ]
 ```
-and then in the class body overwrite the scales with a subset of items from the `SCALES` constant:
+and then in the class body overwrite the scale parameter with a subset of items from the `SCALES` constant:
 ```
     scale = parameter(filter_scales())
 ```
 
@@ -248,11 +248,11 @@ Next, we also remove
 
 as `num_tasks` and `num_tasks_per_node` will be set by the `assign_tasks_per_compute_unit` [hook](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/hooks.py), which is invoked by the `EESSI_Mixin` class.
 
-Instead, we only set the `compute_unit`. A task will be launched for each compute unit. E.g.
+Instead, we only set the `compute_unit`. The number of launched tasks will be equal to the number of compute units. E.g.
 ```
 compute_unit = COMPUTE_UNIT[CPU]
 ```
-will launch one task per (physical) CPU core. Other options are `COMPUTE_UNIT[HWTHREAD]` (one task per hardware thread), `COMPUTE_UNIT[NUMA_NODE]` (one task per numa node), `COMPUTE_UNIT[CPU_SOCKET]` (one task per CPU socket), `COMPUTE_UNIT[GPU]` (one task per GPU) and `COMPUTE_UNIT[NODE]` (one task per node). Check the `COMPUTE_UNIT` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py) for the full list of valid compute units. The number of cores per task will automatically be set based on this as the ratio of the number of cores in a node to the number of tasks per node (rounded down). The `EESSI_Mixin` class will also set the `OMP_NUM_THREADS` environment variable equal to the number of cores per task.
+will launch one task per (physical) CPU core. Other options are `COMPUTE_UNIT[HWTHREAD]` (one task per hardware thread), `COMPUTE_UNIT[NUMA_NODE]` (one task per numa node), `COMPUTE_UNIT[CPU_SOCKET]` (one task per CPU socket), `COMPUTE_UNIT[GPU]` (one task per GPU) and `COMPUTE_UNIT[NODE]` (one task per node). Check the `COMPUTE_UNIT` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py) for the full list of valid compute units. The number of cores per task will automatically be set based on this as the ratio of the number of cores in a node to the number of tasks per node (rounded down). Aditionally, the `EESSI_Mixin` class will set the `OMP_NUM_THREADS` environment variable equal to the number of cores per task.
 
 !!! note
     `compute_unit` needs to be set in ReFrame's `setup` phase (or earlier). For the different phases of the pipeline, please see the [documentation on how ReFrame executes tests](https://reframe-hpc.readthedocs.io/en/stable/pipeline.html).
@@ -266,7 +266,7 @@ from eessi.testsuite.utils import find_modules
 module_name = parameter(find_modules('mpi4py'))
 ```
 
-This parameter will be a list of all module names matching the expression, and each test instance will load the respective module before running the test.
+This parameter generates all module names available on the current system matching the expression, and each test instance will load the respective module before running the test.
 
 Furthermore, we remove the hook that sets `self.module`:
 ```
@@ -285,7 +285,7 @@ First, we remove the hard-coded system name and programming environment. I.e. 
we remove
 valid_prog_environs = ['default']
 valid_systems = ['snellius']
 ```
-The `EESSI_Mixin` class sets `valid_prog_environs = ['default']` by default, so that is no longer needed in the child class (but it can be overwritten if needed). The `valid_systems` is instead replaced by a declaration of which device type is needed. We'll create an `mpi4py` test that runs on CPU only:
+The `EESSI_Mixin` class sets `valid_prog_environs = ['default']` by default, so that is no longer needed in the child class (but it can be overwritten if needed). The `valid_systems` is instead replaced by a declaration of which device type is needed. We'll create an `mpi4py` test that runs on CPUs only:
 ```
 device_type = DEVICE_TYPES[CPU]
 ```
@@ -301,10 +301,10 @@ The device type that is set will be used by the `filter_valid_systems_by_device_
 !!! note
     `device_type` needs to be set in ReFrame's `init` phase (or earlier)
 
-#### Requesting sufficient memory
-To make sure you get an allocation with sufficient memory, your test should declare how much memory it needs per node by defining a `required_mem_per_node` function in your test class that returns the required memory per node (in MiB). Note that the amount of required memory generally depends on the amount of tasks that are launched per node (`self.num_tasks_per_node`).
+#### Requesting sufficient RAM memory
+To make sure you get an allocation with sufficient memory, your test should declare how much memory per node it needs by defining a `required_mem_per_node` function in your test class that returns the required memory per node (in MiB). Note that the amount of required memory generally depends on the amount of tasks that are launched per node (`self.num_tasks_per_node`).
 
-Our `mpi4py` test takes around 200 MB per task, plus about 70 MB for every additional task. We round this up a little so that we can be sure the test won't run out of memory if memory consumption is slightly different on a different system. Thus, we define:
+Our `mpi4py` test takes around 200 MB when running with a single task, plus about 70 MB for every additional task. We round this up a little so that we can be sure the test won't run out of memory if memory consumption is slightly different on a different system. Thus, we define:
 
 ```
 def required_mem_per_node(self):
     return self.num_tasks_per_node * 100 + 250
 ```
@@ -323,7 +323,7 @@ The `EESSI_Mixin` class binds processes to their respective number of cores auto
 def assign_tasks_per_compute_unit(self):
     ...
 ```
-function. Note that this also calls other hooks (such as `hooks.assign_task_per_compute_unit`) that you probably still want to invoke. Check the `EESSI_Mixin` [class definition](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/eessi_mixin.py) to see which hooks you still want to call.
+function. Note that this function also calls other hooks (such as `hooks.assign_task_per_compute_unit`) that you probably still want to invoke. Check the `EESSI_Mixin` [class definition](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/eessi_mixin.py) to see which hooks you still want to call.
 
 #### Thread binding (optional)
 
 Thread binding is not done by default, but can be done by invoking the `hooks.set_compact_thread_binding` hook:
 ```
 @run_after('setup')
 def set_binding(self):
     hooks.set_compact_thread_binding(self)
 ```
 
-#### Skipping tests instances when required (optional) { #skipping-test-instances }
+#### Skipping test instances when required (optional) { #skipping-test-instances }
 
 Preferably, we prevent test instances from being generated (i.e. before ReFrame's `setup` phase) if we know that they cannot run on a certain system. 
However, sometimes we need information on the nodes that will run it, which is only available _after_ the `setup` phase. That is the case for anything where we need information from e.g. the [reframe.core.pipeline.RegressionTest.current_partition](https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.current_partition). For example, we might know that a test only scales to around 300 tasks, and above that, execution time increases rapidly. In that case, we'd want to skip any test instance that results in a larger number of tasks, but we only know this after `assign_tasks_per_compute_unit` has been called (which is done by `EESSI_Mixin` after the `setup` stage). For instance, the `2_nodes` scale would run fine on systems with 128 cores per node, but would exceed the task limit of 300 on systems with `192` cores per node.
 
@@ -350,7 +350,7 @@ We can skip any generated test cases using the `skip_if` function. For example,
     f'Skipping test: more than {max_tasks} tasks are requested ({self.num_tasks})')
 ```
 
-The `mpi4py` scales almost indefinitely, but if we were to set it for the sake of this example, one would see:
+The `mpi4py` test scales up to a very high core count, but if we were to set it for the sake of this example, one would see:
 ```bash
 [ RUN      ] EESSI_MPI4PY %module_name=mpi4py/3.1.5-gompi-2023b %scale=16_nodes /38aea144 @snellius:genoa+default
 [ SKIP     ] ( 1/13) Skipping test: more than 300 tasks are requested (3072)
@@ -381,15 +381,15 @@ time_limit = '5m00s'
 ```
 For the appropriate string formatting, please check the [ReFrame documentation on time_limit](https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.time_limit). We already had this in the non-portable version of our `mpi4py` test and will keep it in the portable version: since this is a very quick test, specifying a lower time limit will help in getting the jobs scheduled more quickly.
 
-Note that for the test to be portable, the time limit should be set such that it is sufficient _regardless of node architecture_. It is pretty hard to guarantee this with a single, fixed time limit for all test scales, without knowing upfront what architecture the test will be run on, and thus how many tasks will be launched. For strong scaling tests, you might want a higher time limit for low task counts, whereas for weak scaling tests you might want a higher time limit for higher task counts. You can consider setting the time limit after setup, and making it dependent on the task count.
+Note that for the test to be portable, the time limit should be set such that it is sufficient _regardless of node architecture and scale_. It is pretty hard to guarantee this with a single, fixed time limit, without knowing upfront what architecture the test will be run on, and thus how many tasks will be launched. For strong scaling tests, you might want a higher time limit for low task counts, whereas for weak scaling tests you might want a higher time limit for higher task counts. To do so, you can consider setting the time limit after setup, and making it dependent on the task count.
 
-Suppose we have a weak scaling test that takes 5 minutes for a single task, and 60 minutes for 10k tasks. We can set a time limit based on linear interpolation between those task counts:
+Suppose we have a weak scaling test that takes 5 minutes with a single task, and 60 minutes with 10k tasks. 
We can set a time limit based on linear interpolation between those task counts:
 ```
 @run_after('setup')
 def set_time_limit(self):
     # linearly interpolate between the single and 10k task count
     minutes = 5 + self.num_tasks * ((60-5) / 10000)
-    self.time_limit = '%sm00s' % minutes
+    # round up to whole minutes, since fractional values are not valid
+    # in the time limit string (this requires 'import math')
+    self.time_limit = f'{math.ceil(minutes)}m00s'
 ```
 Note that this is typically an overestimate of how long the test will take for intermediate task counts, but that's ok: we'd rather overestimate than underestimate the runtime.
 
@@ -432,14 +432,14 @@ def required_mem_per_node(self):
     return self.num_tasks_per_node * 100 + 250
 ```
 
-Removed the ReFrame pipeline hook that set the `self.modules`:
+Removed the ReFrame pipeline hook that sets `self.modules`:
 ```
 @run_after('init')
 def set_modules(self):
     self.modules = [self.module_name]
 ```
 
-Removed the ReFrame pipeline hook that set the number of tasks and number of tasks per node:
+Removed the ReFrame pipeline hook that sets the number of tasks and number of tasks per node:
 ```
 @run_after('init')
 def define_task_count(self):

From 9935fb196ab36a0c75a0bc6bdb60516bb5f02805 Mon Sep 17 00:00:00 2001
From: Caspar van Leeuwen
Date: Tue, 19 Nov 2024 12:46:54 +0100
Subject: [PATCH 12/15] Add use of bench_name and bench_name_ci to tag tests
 with the CI tag

---
 docs/test-suite/writing-portable-tests.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md
index 8e2b5e771..d51f8db01 100644
--- a/docs/test-suite/writing-portable-tests.md
+++ b/docs/test-suite/writing-portable-tests.md
@@ -325,6 +325,27 @@ def assign_tasks_per_compute_unit(self):
 ```
 function. Note that this function also calls other hooks (such as `hooks.assign_task_per_compute_unit`) that you probably still want to invoke. Check the `EESSI_Mixin` [class definition](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/eessi_mixin.py) to see which hooks you still want to call.
 
+#### CI Tag
+As mentioned in the [Test requirements](#test-requirements), there should be at least one light-weight (short, low-core, low-memory) test case, which should be marked with the `CI` tag. The `EESSI_Mixin` class will automatically add the `CI` tag if both `bench_name` (the current variant) and `bench_name_ci` (the CI variant) are defined. The `mpi4py` test contains only one variant (which is very light-weight). In this case, it is sufficient to set both to the same name in the class body:
+```python
+bench_name = 'mpi4py'
+bench_name_ci = 'mpi4py'
+```
+
+Suppose that our test has 2 variants, of which only `'variant1'` should be marked `CI`. 
In that case, we can define `bench_name` as a parameter: +```python + bench_name = parameter(['variant1', 'variant2']) + bench_name_ci = 'variant1' +``` +Next, we can define a hook that does different things depending on the variant, for example: +```python +@run_after('init') +def do_something(self): + if self.bench_name == 'variant1': + do_this() + elif self.bench_name == 'variant2': + do_that() +``` #### Thread binding (optional) Thread binding is not done by default, but can be done by invoking the `hooks.set_compact_thread_binding` hook: From ba8bb1aa803a2e8905b462540ccc2805c8612b3d Mon Sep 17 00:00:00 2001 From: Caspar van Leeuwen Date: Tue, 19 Nov 2024 12:48:09 +0100 Subject: [PATCH 13/15] Fix codespell error --- docs/test-suite/writing-portable-tests.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md index d51f8db01..6ea5be5ac 100644 --- a/docs/test-suite/writing-portable-tests.md +++ b/docs/test-suite/writing-portable-tests.md @@ -252,7 +252,7 @@ Instead, we only set the `compute_unit`. The number of launched tasks will be eq ``` compute_unit = COMPUTE_UNIT[CPU] ``` -will launch one task per (physical) CPU core. Other options are `COMPUTE_UNIT[HWTHREAD]` (one task per hardware thread), `COMPUTE_UNIT[NUMA_NODE]` (one task per numa node), `COMPUTE_UNIT[CPU_SOCKET]` (one task per CPU socket), `COMPUTE_UNIT[GPU]` (one task per GPU) and `COMPUTE_UNIT[NODE]` (one task per node). Check the `COMPUTE_UNIT` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py) for the full list of valid compute units. The number of cores per task will automatically be set based on this as the ratio of the number of cores in a node to the number of tasks per node (rounded down). Aditionally, the `EESSI_Mixin` class will set the `OMP_NUM_THREADS` environment variable equal to the number of cores per task. +will launch one task per (physical) CPU core. Other options are `COMPUTE_UNIT[HWTHREAD]` (one task per hardware thread), `COMPUTE_UNIT[NUMA_NODE]` (one task per numa node), `COMPUTE_UNIT[CPU_SOCKET]` (one task per CPU socket), `COMPUTE_UNIT[GPU]` (one task per GPU) and `COMPUTE_UNIT[NODE]` (one task per node). Check the `COMPUTE_UNIT` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py) for the full list of valid compute units. The number of cores per task will automatically be set based on this as the ratio of the number of cores in a node to the number of tasks per node (rounded down). Additionally, the `EESSI_Mixin` class will set the `OMP_NUM_THREADS` environment variable equal to the number of cores per task. !!! note `compute_unit` needs to be set in ReFrame's `setup` phase (or earlier). For the different phases of the pipeline, please see the [documentation on how ReFrame executes tests](https://reframe-hpc.readthedocs.io/en/stable/pipeline.html). 
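To see how the attributes from the preceding sections fit together, here is a minimal, illustrative sketch of a child class (it is not the full tutorial test: the executable, sanity checks, and performance functions from `tutorials/mpi4py/mpi4py_portable.py` are omitted):

```python
import reframe as rfm
from reframe.core.builtins import parameter

from eessi.testsuite.eessi_mixin import EESSI_Mixin
from eessi.testsuite.constants import COMPUTE_UNIT, DEVICE_TYPES, CPU
from eessi.testsuite.utils import find_modules


@rfm.simple_test
class EESSI_MPI4PY(rfm.RunOnlyRegressionTest, EESSI_Mixin):
    # needs to be known in (or before) ReFrame's 'init' phase
    device_type = DEVICE_TYPES[CPU]
    module_name = parameter(find_modules('mpi4py'))

    # needs to be set in (or before) ReFrame's 'setup' phase
    compute_unit = COMPUTE_UNIT[CPU]

    # single variant: marking it as the CI variant adds the CI tag
    bench_name = 'mpi4py'
    bench_name_ci = 'mpi4py'

    def required_mem_per_node(self):
        # ~200 MB for one task plus ~70 MB per additional task, rounded up
        return self.num_tasks_per_node * 100 + 250
```

Everything else (task counts, memory requests, process binding, and `valid_systems` filtering) is then derived by `EESSI_Mixin` from these declarations.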
From c640da2065dbc7a6bca5e538b47301ead25774ca Mon Sep 17 00:00:00 2001 From: Caspar van Leeuwen Date: Wed, 20 Nov 2024 15:35:40 +0100 Subject: [PATCH 14/15] Slight rephrasing to mention that parameter X needs to be set before a certain phase, and put the 'in' option between brackets --- docs/test-suite/writing-portable-tests.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md index 6ea5be5ac..786b60f60 100644 --- a/docs/test-suite/writing-portable-tests.md +++ b/docs/test-suite/writing-portable-tests.md @@ -255,7 +255,7 @@ Instead, we only set the `compute_unit`. The number of launched tasks will be eq will launch one task per (physical) CPU core. Other options are `COMPUTE_UNIT[HWTHREAD]` (one task per hardware thread), `COMPUTE_UNIT[NUMA_NODE]` (one task per numa node), `COMPUTE_UNIT[CPU_SOCKET]` (one task per CPU socket), `COMPUTE_UNIT[GPU]` (one task per GPU) and `COMPUTE_UNIT[NODE]` (one task per node). Check the `COMPUTE_UNIT` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py) for the full list of valid compute units. The number of cores per task will automatically be set based on this as the ratio of the number of cores in a node to the number of tasks per node (rounded down). Additionally, the `EESSI_Mixin` class will set the `OMP_NUM_THREADS` environment variable equal to the number of cores per task. !!! note - `compute_unit` needs to be set in ReFrame's `setup` phase (or earlier). For the different phases of the pipeline, please see the [documentation on how ReFrame executes tests](https://reframe-hpc.readthedocs.io/en/stable/pipeline.html). + `compute_unit` needs to be set before (or in) ReFrame's `setup` phase. For the different phases of the pipeline, please see the [documentation on how ReFrame executes tests](https://reframe-hpc.readthedocs.io/en/stable/pipeline.html). #### Replacing hard-coded module names Instead of hard-coding a module name, we parameterize over all module names that match a certain regular expression. @@ -277,7 +277,7 @@ def set_modules(self): This is now taken care of by the `EESSI_Mixin` class. !!! note - `module_name` needs to be set in ReFrame's `init` phase (or earlier) + `module_name` needs to be set before (or in) ReFrame's `init` phase #### Replacing hard-coded system names and programming environments First, we remove the hard-coded system name and programming environment. I.e. we remove @@ -299,7 +299,7 @@ The device type that is set will be used by the `filter_valid_systems_by_device_ `EESSI_Mixin` also filters based on the supported scales, which can again be configured per partition in the ReFrame configuration file. This can e.g. be used to avoid running large-scale tests on partitions that don't have enough nodes to run them. !!! note - `device_type` needs to be set in ReFrame's `init` phase (or earlier) + `device_type` needs to be set before (or in) ReFrame's `init` phase #### Requesting sufficient RAM memory To make sure you get an allocation with sufficient memory, your test should declare how much memory per node it needs by defining a `required_mem_per_node` function in your test class that returns the required memory per node (in MiB). Note that the amount of required memory generally depends on the amount of tasks that are launched per node (`self.num_tasks_per_node`). 
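To make the phase requirements above concrete, here is a hypothetical sketch (the class name and hook name are made up for illustration, and only the two relevant attributes are shown): `device_type` is declared as a class-level parameter so it is known in the `init` phase, while `compute_unit` is derived from it in a post-`init` hook, which is early enough because it only needs to be set before (or in) the `setup` phase:

```python
import reframe as rfm
from reframe.core.builtins import parameter, run_after

from eessi.testsuite.eessi_mixin import EESSI_Mixin
from eessi.testsuite.constants import COMPUTE_UNIT, DEVICE_TYPES, CPU, GPU


@rfm.simple_test
class EESSI_DeviceTypeExample(rfm.RunOnlyRegressionTest, EESSI_Mixin):
    # 'device_type' must be set before (or in) the 'init' phase,
    # so it is declared as a class-level parameter
    device_type = parameter([DEVICE_TYPES[CPU], DEVICE_TYPES[GPU]])

    @run_after('init')
    def set_compute_unit(self):
        # 'compute_unit' only needs to be set before (or in) the 'setup'
        # phase, so a hook that runs after 'init' is early enough
        if self.device_type == DEVICE_TYPES[CPU]:
            self.compute_unit = COMPUTE_UNIT[CPU]
        elif self.device_type == DEVICE_TYPES[GPU]:
            self.compute_unit = COMPUTE_UNIT[GPU]
```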
From 470f0e0ca52f9b83246b56f23311e2f1757128af Mon Sep 17 00:00:00 2001
From: Caspar van Leeuwen
Date: Wed, 20 Nov 2024 17:08:25 +0100
Subject: [PATCH 15/15] Marked codeblocks with the appropriate languages

---
 docs/test-suite/writing-portable-tests.md | 50 +++++++++++------------
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/docs/test-suite/writing-portable-tests.md b/docs/test-suite/writing-portable-tests.md
index 786b60f60..c19e9be78 100644
--- a/docs/test-suite/writing-portable-tests.md
+++ b/docs/test-suite/writing-portable-tests.md
@@ -200,7 +200,7 @@ Most of the functionality in the `EESSI_Mixin` class require certain class attri
 
 The first step is to actually inherit from the `EESSI_Mixin` class:
 
-```
+```python
 from eessi.testsuite.eessi_mixin import EESSI_Mixin
 ...
 @rfm.simple_test
 class EESSI_MPI4PY(rfm.RunOnlyRegressionTest, EESSI_Mixin):
@@ -216,7 +216,7 @@ First, we remove
 
-```
+```python
     # ReFrame will generate a test for each scale
     scale = parameter([2, 128, 256])
 ```
 from the test. The `EESSI_Mixin` class will define the default set of scales on which this test will be run as
-```
+```python
 from eessi.testsuite.constants import SCALES
 ...
     scale = parameter(SCALES.keys())
@@ -225,7 +225,7 @@ from eessi.testsuite.constants import SCALES
 This ensures the test will run a test case for each of the default scales, as defined by the `SCALES` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py).
 
 If, and only if, your test can not run on all of those scales should you overwrite this parameter in your child class. For example, if you have a test that does not support running on multiple nodes, you could define a filtering function outside of the class
-```
+```python
 def filter_scales():
     return [
         k for (k,v) in SCALES.items()
         if v['num_nodes'] == 1
     ]
 ```
 and then in the class body overwrite the scale parameter with a subset of items from the `SCALES` constant:
-```
+```python
     scale = parameter(filter_scales())
 ```
 
 Next, we also remove
 
-```
+```python
     @run_after('init')
     def define_task_count(self):
         self.num_tasks = self.scale
         self.num_tasks_per_node = min(self.num_tasks, 128)
 ```
 
 as `num_tasks` and `num_tasks_per_node` will be set by the `assign_tasks_per_compute_unit` [hook](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/hooks.py), which is invoked by the `EESSI_Mixin` class.
 
 Instead, we only set the `compute_unit`. The number of launched tasks will be equal to the number of compute units. E.g.
-```
+```python
 compute_unit = COMPUTE_UNIT[CPU]
 ```
 will launch one task per (physical) CPU core. Other options are `COMPUTE_UNIT[HWTHREAD]` (one task per hardware thread), `COMPUTE_UNIT[NUMA_NODE]` (one task per numa node), `COMPUTE_UNIT[CPU_SOCKET]` (one task per CPU socket), `COMPUTE_UNIT[GPU]` (one task per GPU) and `COMPUTE_UNIT[NODE]` (one task per node). Check the `COMPUTE_UNIT` [constant](https://github.com/EESSI/test-suite/blob/main/eessi/testsuite/constants.py) for the full list of valid compute units. The number of cores per task will automatically be set based on this as the ratio of the number of cores in a node to the number of tasks per node (rounded down). Additionally, the `EESSI_Mixin` class will set the `OMP_NUM_THREADS` environment variable equal to the number of cores per task.
 
@@ -260,7 +260,7 @@ will launch one task per (physical) CPU core. Other options are `COMPUTE_UNIT[HW
 
 #### Replacing hard-coded module names
 Instead of hard-coding a module name, we parameterize over all module names that match a certain regular expression.
-```
+```python
 from eessi.testsuite.utils import find_modules
 ... 
module_name = parameter(find_modules('mpi4py'))
 ```
 
 This parameter generates all module names available on the current system matching the expression, and each test instance will load the respective module before running the test.
 
 Furthermore, we remove the hook that sets `self.module`:
-```
+```python
 @run_after('init')
 def set_modules(self):
     self.modules = [self.module_name]
@@ -281,16 +281,16 @@ This is now taken care of by the `EESSI_Mixin` class.
 
 #### Replacing hard-coded system names and programming environments
 First, we remove the hard-coded system name and programming environment. I.e. we remove
-```
+```python
 valid_prog_environs = ['default']
 valid_systems = ['snellius']
 ```
 The `EESSI_Mixin` class sets `valid_prog_environs = ['default']` by default, so that is no longer needed in the child class (but it can be overwritten if needed). The `valid_systems` is instead replaced by a declaration of which device type is needed. We'll create an `mpi4py` test that runs on CPUs only:
-```
+```python
 device_type = DEVICE_TYPES[CPU]
 ```
 but note that if we also wanted to generate test instances to test GPU <=> GPU communication, we could have defined this as a parameter:
-```
+```python
 device_type = parameter([DEVICE_TYPES[CPU], DEVICE_TYPES[GPU]])
 ```
 
@@ -306,7 +306,7 @@ To make sure you get an allocation with sufficient memory, your test should decl
 
 Our `mpi4py` test takes around 200 MB when running with a single task, plus about 70 MB for every additional task. We round this up a little so that we can be sure the test won't run out of memory if memory consumption is slightly different on a different system. Thus, we define:
 
-```
+```python
 def required_mem_per_node(self):
     return self.num_tasks_per_node * 100 + 250
 ```
 
@@ -318,7 +318,7 @@ While rounding up is advisable, do keep your estimate realistic. Too high a memo
 
 #### Process binding
 The `EESSI_Mixin` class binds processes to their respective number of cores automatically using the `hooks.set_compact_process_binding` hook. E.g. for a pure MPI test like `mpi4py`, each task will be bound to a single core. For hybrid tests that do both multiprocessing and multithreading, tasks are bound to a sequential number of cores. E.g. on a node with 128 cores and a hybrid test with 64 tasks and 2 threads per task, the first task will be bound to core 0 and 1, second task to core 2 and 3, etc. To override this behaviour, one would have to overwrite the
-```
+```python
 @run_after('setup')
 def assign_tasks_per_compute_unit(self):
     ...
@@ -349,7 +349,7 @@ def do_something(self):
 
 #### Thread binding (optional)
 Thread binding is not done by default, but can be done by invoking the `hooks.set_compact_thread_binding` hook:
-```
+```python
 @run_after('setup')
 def set_binding(self):
     hooks.set_compact_thread_binding(self)
 ```
@@ -397,7 +397,7 @@ on a system with 192 cores per node. I.e. any test of 2 nodes (384 cores) or abo
 
 #### Setting a time limit (optional)
 By default, the `EESSI_Mixin` class sets a time limit for jobs of 1 hour. You can overwrite this in your child class:
-```
+```python
 time_limit = '5m00s'
 ```
 For the appropriate string formatting, please check the [ReFrame documentation on time_limit](https://reframe-hpc.readthedocs.io/en/stable/regression_test_api.html#reframe.core.pipeline.RegressionTest.time_limit). 
We already had this in the non-portable version of our `mpi4py` test and will keep it in the portable version: since this is a very quick test, specifying a lower time limit will help in getting the jobs scheduled more quickly.
 
@@ -405,7 +405,7 @@ For the appropriate string formatting, please check the [ReFrame documentation o
 Note that for the test to be portable, the time limit should be set such that it is sufficient _regardless of node architecture and scale_. It is pretty hard to guarantee this with a single, fixed time limit, without knowing upfront what architecture the test will be run on, and thus how many tasks will be launched. For strong scaling tests, you might want a higher time limit for low task counts, whereas for weak scaling tests you might want a higher time limit for higher task counts. To do so, you can consider setting the time limit after setup, and making it dependent on the task count.
 
 Suppose we have a weak scaling test that takes 5 minutes with a single task, and 60 minutes with 10k tasks. We can set a time limit based on linear interpolation between those task counts:
-```
+```python
 @run_after('setup')
 def set_time_limit(self):
     # linearly interpolate between the single and 10k task count
     minutes = 5 + self.num_tasks * ((60-5) / 10000)
     # round up to whole minutes, since fractional values are not valid
     # in the time limit string (this requires 'import math')
     self.time_limit = f'{math.ceil(minutes)}m00s'
@@ -418,20 +418,20 @@
 
 #### Summary
 To make the test portable, we added additional imports:
-```
+```python
 from eessi.testsuite.eessi_mixin import EESSI_Mixin
 from eessi.testsuite.constants import COMPUTE_UNIT, DEVICE_TYPES, CPU
 from eessi.testsuite.utils import find_modules
 ```
 
 Made sure the test inherits from `EESSI_Mixin`:
-```
+```python
 @rfm.simple_test
 class EESSI_MPI4PY(rfm.RunOnlyRegressionTest, EESSI_Mixin):
 ```
 
 Removed the following from the class body:
-```
+```python
 valid_prog_environs = ['default']
 valid_systems = ['snellius']
 
 # ReFrame will generate a test for each scale
 scale = parameter([2, 128, 256])
 ```
 
 Added the following to the class body:
-```
+```python
 device_type = DEVICE_TYPES[CPU]
 compute_unit = COMPUTE_UNIT[CPU]
 
 module_name = parameter(find_modules('mpi4py'))
 ```
 
 Defined the class method:
-```
+```python
 def required_mem_per_node(self):
     return self.num_tasks_per_node * 100 + 250
 ```
 
 Removed the ReFrame pipeline hook that sets `self.modules`:
-```
+```python
 @run_after('init')
 def set_modules(self):
     self.modules = [self.module_name]
 ```
 
 Removed the ReFrame pipeline hook that sets the number of tasks and number of tasks per node:
-```
+```python
 @run_after('init')
 def define_task_count(self):
     # Set the number of tasks; self.scale is now a single number from the parameter list
     self.num_tasks = self.scale
     # Set the number of tasks per node equal to the number of tasks, but at most 128
     self.num_tasks_per_node = min(self.num_tasks, 128)
 ```
 
 The final test is thus:
-```
+```python
 """
 This module tests mpi4py's MPI_Reduce call
 """
@@ -576,7 +576,7 @@ if rank == 0:
 ```
 
 Assuming we have `mpi4py` available, we could run this manually using
-```
+```bash
 $ mpirun -np 4 python3 mpi4py_reduce.py
 Total ranks: 4
 Sum of all ranks: 6
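 # (illustrative sanity check, not part of the test suite output)
 # for N ranks, the sum of the ranks 0..N-1 should equal N*(N-1)/2:
 $ python3 -c 'n = 4; print(n * (n - 1) // 2)'
 6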