Release 5: Simplify file and directory management #32

Open
seb-wahl opened this issue Sep 3, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@seb-wahl
Contributor

seb-wahl commented Sep 3, 2020

I'd like to put the following up for discussion (it has bugged me ever since I got to know the tools):

File and directory management should be simplified. At the moment, tons of files are copied back and forth, which makes errors very tricky to track down for anyone who didn't code the core parts (compute.py, jobclass.py, ...). In addition, copying large (restart, forcing) files several times may significantly slow down job throughput. Having worked with the MPI-ESM runtime manager mkexp (Python with Jinja2-style .config files), I find its file and directory management simpler and more efficient (while other things in mkexp are horrible), so here comes my suggestion:

  1. Upon start, esm_runscripts creates the directory structure expid/restart/, expid/outdata/, expid/forcing/ ..., as it does at the moment.
  2. Copy/link the forcing files required for the current run into expid/forcing/. On a cold start, optionally create a copy of esm_tools there as well.
  3. Create a work folder expid/work/run_XXXX-YYYY/.
  4. Copy/link all files (forcing, restart, namelists) required for the current run into expid/work/run_XXXX-YYYY/.
  5. cd expid/work/run_XXXX-YYYY/, sbatch .....
  6. Once done, copy only the restart files into expid/restart/.
  7. Trigger a subjob (like the post jobs at the moment) that does the cleanup of expid/work/run_XXXX-YYYY/ (i.e. copying outdata, logs etc. into place), following the bullet-proof method used in mkexp (details later).
  8. Increment the date, go back to step 2, and continue until the run is done.
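
To make the file handling in steps 1, 2, 4 and 6 concrete, here is a rough Python sketch using only the standard library. It is an illustration under assumptions, not how esm_runscripts or mkexp actually work: the function names (`create_layout`, `stage`, `prepare_run`, `save_restarts`), the `restart*` file pattern and the choice to link large files but copy namelists are all hypothetical.

```python
import os
import shutil
from pathlib import Path

# Assumed subdirectory names; only restart, outdata, forcing and work
# are mentioned in the proposal.
SUBDIRS = ("restart", "outdata", "forcing", "work")


def create_layout(expid_dir):
    """Step 1: create expid/restart/, expid/outdata/, expid/forcing/, ..."""
    expid_dir = Path(expid_dir)
    for name in SUBDIRS:
        (expid_dir / name).mkdir(parents=True, exist_ok=True)
    return expid_dir


def stage(sources, target_dir, link=True):
    """Steps 2 and 4: link (or copy) each source file into target_dir."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in map(Path, sources):
        dst = target_dir / src.name
        if dst.exists() and dst.resolve() == src.resolve():
            # Already in place (same file, or a link pointing to it).
            staged.append(dst)
            continue
        if dst.exists() or dst.is_symlink():
            dst.unlink()  # replace a stale entry from an earlier run
        if link:
            os.symlink(src.resolve(), dst)  # no data is duplicated
        else:
            shutil.copy2(src, dst)
        staged.append(dst)
    return staged


def prepare_run(expid_dir, start, end, forcing, restarts, namelists):
    """Steps 2-4: fill expid/forcing/, then expid/work/run_XXXX-YYYY/."""
    expid_dir = Path(expid_dir)
    work = expid_dir / "work" / f"run_{start}-{end}"
    forcing_staged = stage(forcing, expid_dir / "forcing", link=True)
    stage(forcing_staged, work, link=True)  # large files: link only
    stage(restarts, work, link=True)
    stage(namelists, work, link=False)      # small files: private copy
    return work


def save_restarts(work, expid_dir, pattern="restart*"):
    """Step 6: copy only newly written restart files to expid/restart/."""
    restart_dir = Path(expid_dir) / "restart"
    restart_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for f in sorted(Path(work).glob(pattern)):
        # Linked input restarts are skipped; only real files are copied.
        if f.is_file() and not f.is_symlink():
            saved.append(Path(shutil.copy2(f, restart_dir / f.name)))
    return saved
```

Job submission (step 5) and the cleanup subjob (step 7) are left out on purpose, since their details are not specified here. Linking the input restart files assumes the model writes its new restart files under different names; if it overwrites its inputs in place, they would have to be copied instead.
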

And last: Have all logs (model logs, esm_runscript logs, filelist, *finished.yaml, ...) in one place.
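
As a rough illustration of the "one place" idea, the sketch below moves everything that looks like a log out of a work folder into a single per-run log directory. The function name `collect_logs`, the target layout and the glob patterns are assumptions made for this example; the real file names written by the models and by esm_runscripts may differ.

```python
import shutil
from pathlib import Path

# Assumed naming conventions; adjust to what is actually written.
LOG_PATTERNS = ("*.log", "*finished.yaml", "filelist*")


def collect_logs(work, log_dir, run_id, patterns=LOG_PATTERNS):
    """Move all files matching the log patterns into log_dir/run_id/."""
    dest = Path(log_dir) / run_id
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for pattern in patterns:
        for f in sorted(Path(work).glob(pattern)):
            if f.is_file():
                target = shutil.move(str(f), str(dest / f.name))
                moved.append(Path(target))
    return moved
```

A call such as `collect_logs("expid/work/run_XXXX-YYYY", "expid/log", "run_XXXX-YYYY")` would then leave model output untouched and gather only the log-like files.
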

I know this goes against the current philosophy that everything related to the current run should be in expid/run_XXXX-YYYY/, but it would certainly simplify the complete config dict and hence make error tracking easier.

seb-wahl added the enhancement (New feature or request) label Sep 3, 2020

@dbarbi
Member

dbarbi commented Dec 1, 2020

I think with the changes we made we are sufficiently close to what you wanted, right?

@pgierz
Member

pgierz commented Oct 16, 2021

Going through some old issues to start cleaning up before the next release: is this still relevant? If not, @seb-wahl, please close it; otherwise, please re-specify the problem so we can make a plan.

3 participants