Create compute build option (#3186)
This creates scripts to run compute-node builds and also refactors the
build_all.sh script to make it easier to build all executables.

In place of the various options that previously controlled which components
are built, `build_all.sh` now takes a list of one or more systems to build:

- `gfs` builds everything needed for forecast-only GFS (UFS model with
unstructured wave grid, gfs_utils, ufs_utils, upp, ww3 pre/post for
unstructured wave grid)
- `gefs` builds everything needed for GEFS (UFS model with structured
wave grid, gfs_utils, ufs_utils, upp, ww3 pre/post for structured wave
grid)
- `sfs` builds everything needed for SFS (UFS model in hydrostatic mode with
structured wave grid, gfs_utils, ufs_utils, upp, ww3 pre/post for
structured wave grid)
- `gsi` builds GSI-based DA components (gsi_enkf, gsi_monitor,
gsi_utils)
- `gdas` builds JEDI-based DA components (gdas app, gsi_monitor,
gsi_utils)

`all` will build all of the above (mostly for testing)

Examples:
- Build for forecast-only GFS: `./build_all.sh gfs`
- Build cycled GFS including coupled DA: `./build_all.sh gfs gsi gdas`
- Build GEFS: `./build_all.sh gefs`
- Build everything (for testing purposes): `./build_all.sh all`
Other options, such as `-d` to build in debug mode, remain unchanged.

The full script signature is now:
```
./build_all.sh [-a UFS_app][-c build_config][-d][-f][-h][-v] [gfs] [gefs] [sfs] [gsi] [gdas] [all]
```

Additionally, there is a new script to build components on the compute
nodes using the job scheduler instead of the login nodes. This method takes
load off the login nodes and may be faster in some cases.
Compute builds are invoked using the `build_compute.sh` script, which
behaves similarly to the new `build_all.sh`:
```
./build_compute.sh [-h][-v][-A <hpc-account>] [gfs] [gefs] [sfs] [gsi] [gdas] [all]
```
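For example, a compute build of the cycled GFS components under a specific
scheduler account might look like the following sketch (the account name
here is a placeholder, not something defined in this PR):
```
# Submit the build jobs through the scheduler; -A selects the HPC account to charge.
./build_compute.sh -A my_hpc_account gfs gsi gdas
```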
The compute build generates a rocoto workflow and then calls `rocotorun`
itself repeatedly until either a build fails or all builds succeed, at
which point the script exits. Since the script calls `rocotorun` itself,
you don't need to set up your own cron job to drive it, but advanced users
can still use all of the regular rocoto tools on `build.xml` and `build.db`
if they wish.
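
Conceptually, the driver behaves like the shell loop sketched below. This is
only an illustration of the polling behavior described above, assuming the
standard `rocotorun`/`rocotostat` command-line interface; the sleep interval
and the `rocotostat` column layout are assumptions, and the real logic lives
in `workflow/build_compute.py`.
```
# Illustrative polling loop only -- not the actual implementation
# (the real driver is workflow/build_compute.py). Run from sorc/.
while true; do
    rocotorun -w build.xml -d build.db
    # Assumes the 4th rocotostat column holds the job state
    # (e.g. QUEUED, RUNNING, SUCCEEDED, DEAD).
    states=$(rocotostat -w build.xml -d build.db | tail -n +2 | awk '{print $4}')
    if echo "${states}" | grep -q "DEAD"; then
        echo "A build job failed; stop polling (other jobs keep running)." >&2
        exit 1
    fi
    if [[ -n "${states}" ]] && ! echo "${states}" | grep -qv "SUCCEEDED"; then
        echo "All build jobs succeeded."
        exit 0
    fi
    sleep 60   # polling interval is an assumption
done
```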

Some things to note with the compute build:
- When a build fails, other build jobs are not cancelled and will
continue to run.
- Since the script stops running `rocotorun` once one build fails, the
rocoto database will no longer update with the status of the remaining
jobs after that point.
- Similarly, if the terminal running `build_compute.sh` gets
disconnected, the rocoto database will no longer update.
- In either of the above cases, you can run `rocotorun` manually to update
the database as long as the job information hasn't aged off the scheduler
yet (see the example after this list).
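
For instance, assuming the workflow files live in `sorc/` (the new
`.gitignore` entries below suggest `sorc/build.xml` and `sorc/build.db`),
a manual refresh would use the standard rocoto commands:
```
cd sorc
rocotorun -w build.xml -d build.db    # refresh the database from the scheduler
rocotostat -w build.xml -d build.db   # report the state of each build job
```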

Resolves #3131

---------

Co-authored-by: Rahul Mahajan <[email protected]>
DavidHuber-NOAA and aerorahul authored Dec 24, 2024
1 parent 290f1d2 commit d85214d
Showing 10 changed files with 548 additions and 186 deletions.
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -211,3 +211,4 @@ ush/python/pygfs/utils/marine_da_utils.py @guillaumevernieres @AndrewEichmann-NO

# Specific workflow scripts
workflow/generate_workflows.sh @DavidHuber-NOAA
workflow/build_compute.py @DavidHuber-NOAA @aerorahul
3 changes: 3 additions & 0 deletions .gitignore
@@ -85,6 +85,9 @@ parm/wafs

# Ignore sorc and logs folders from externals
#--------------------------------------------
sorc/build.xml
sorc/build.db
sorc/build_lock.db
sorc/*log
sorc/logs
sorc/calc_analysis.fd
4 changes: 1 addition & 3 deletions ci/Jenkinsfile
@@ -120,9 +120,7 @@ pipeline {
def error_logs_message = ""
dir("${HOMEgfs}/sorc") {
try {
sh(script: './build_all.sh -kgu') // build the global-workflow executables for GFS variant (UFS-wx-model, WW3 pre/post executables)
sh(script: './build_ww3prepost.sh -w > ./logs/build_ww3prepost_gefs.log 2>&1') // build the WW3 pre/post processing executables for GEFS variant
sh(script: './build_ufs.sh -w -e gefs_model.x > ./logs/build_ufs_gefs.log 2>&1') // build the UFS-wx-model executable for GEFS variant
sh(script: './build_compute.sh all') // build the global-workflow executables
} catch (Exception error_build) {
echo "Failed to build global-workflow: ${error_build.getMessage()}"
if ( fileExists("logs/error.logs") ) {
56 changes: 19 additions & 37 deletions docs/source/clone.rst
@@ -18,35 +18,39 @@ Clone the `global-workflow` and `cd` into the `sorc` directory:
git clone --recursive https://github.com/NOAA-EMC/global-workflow
cd global-workflow/sorc

For forecast-only (coupled or uncoupled) build of the components:
.. _build_examples:

The build_all.sh script can be used to build all required components of the global workflow. It accepts as arguments a list of systems to build. This includes builds for GFS and GEFS forecast-only experiments and for GSI- and GDASApp-based DA in cycled GFS experiments. See `feature availability <hpc.html#feature-availability-by-hpc>`__ to check which system(s) are available on each supported HPC.

::

./build_all.sh
./build_all.sh [gfs] [gefs] [sfs] [gsi] [gdas] [all]

For cycled (w/ data assimilation) use the `-g` option during build:
For example, to run GFS experiments with GSI DA, execute:

::

./build_all.sh -g
./build_all.sh gfs gsi

For coupled cycling (include new UFSDA) use the `-gu` options during build:
This builds the GFS, UFS-utils, GFS-utils, WW3 with PDLIB (unstructured wave grids), UPP, GSI, GSI-monitor, and GSI-utils executables.

[Currently only available on Hera, Orion, and Hercules]
For coupled cycling (including the new UFSDA), execute:

::

./build_all.sh -gu
./build_all.sh gfs gdas

This builds all of the same executables, except it builds the GDASApp instead of the GSI.

For building without PDLIB (unstructured grid) for the wave model, use the `-w` options during build:
To run GEFS (forecast-only), execute:

::

./build_all.sh -w
./build_all.sh gefs

This builds the GEFS, UFS-utils, GFS-utils, WW3 *without* PDLIB (structured wave grids), and UPP executables.

Build workflow components and link workflow artifacts such as executables, etc.
Once the building is complete, link workflow artifacts such as executables, configuration files, and scripts via

::

@@ -107,40 +111,19 @@ Under the ``/sorc`` folder is a script to build all components called ``build_al

::

./build_all.sh [-a UFS_app][-g][-h][-u][-v]
./build_all.sh [-a UFS_app][-k][-h][-v] [list of system(s) to build]
-a UFS_app:
Build a specific UFS app instead of the default
-g:
Build GSI
-k:
Kill all builds immediately if one fails
-h:
Print this help message and exit
-j:
Specify maximum number of build jobs (n)
-u:
Build UFS-DA
-v:
Execute all build scripts with -v option to turn on verbose where supported

For forecast-only (coupled or uncoupled) build of the components:

::

./build_all.sh

For cycled (w/ data assimilation) use the `-g` option during build:

::

./build_all.sh -g

For coupled cycling (include new UFSDA) use the `-gu` options during build:

[Currently only available on Hera, Orion, and Hercules]

::

./build_all.sh -gu
Lastly, pass `build_all.sh` a list of the systems to build. Valid values are `gfs`, `gefs`, `sfs` (not fully supported), `gsi`, `gdas`, and `all`.

For examples of how to use this script, see :ref:`build examples <build_examples>`.

^^^^^^^^^^^^^^^
Link components
@@ -156,4 +139,3 @@ After running the checkout and build scripts run the link script:

Where:
``-o``: Run in operations (NCO) mode. This creates copies instead of using symlinks and is generally only used by NCO during installation into production.

