Hist fixed Merge (For Discussion Only!!!) #190
base: prep_release
Hotfix/model environment cases
Colorful diffs
Exception to add_<env_vars> added to the environment case checker
Hotfix/trace log into log dir
Hotfix/copy namelists
…ists in the experiment tree If a run crashes, and the user has been working in the virtual environment, they needed to remember to reactivate the environment to continue the run. Now, this check happens for you automatically and the virtual environment under ``$BASE_DIR/$EXP_ID/.venv_esmtools`` is reused if it exists.
feat(virtual_env_builder): recycles the virtual env if one already exists in the experiment tree
allows to exit right away from venv question
issue_268 merge
…t_models Hotfix/coupling fields different models
Hotfix/subdirs on targets
Hotfix/subdirs on targets followup
… section of a simulation. By default the only reusable files are 'bin' and 'src'. With this change it is possible to also reuse, for example, 'input'. Reused subfolders can now be copied correctly into the work directory (previously this made a mess).
Reusable_filetypes in config files
overcommit feature: possibility to use less CPUs than the number of processes
Fix/venv editable install
…efined in the namelist
…hat is defined in the namelist" This reverts commit 5389c6d. accidentally pushed
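The virtual-environment reuse described in the commit list above (reusing ``$BASE_DIR/$EXP_ID/.venv_esmtools`` if it exists) could look roughly like the following. This is only a minimal sketch: the function name `find_or_create_venv` and the `pyvenv.cfg` check are assumptions for illustration, not the actual esm-tools implementation.

```python
import venv
from pathlib import Path


def find_or_create_venv(base_dir, exp_id, create=True):
    """Return (venv_path, reused) for an experiment.

    Hypothetical helper: the directory layout follows the commit message,
    everything else is an assumption made for this sketch.
    """
    venv_dir = Path(base_dir) / exp_id / ".venv_esmtools"
    # The stdlib venv module writes pyvenv.cfg into every environment it
    # creates, so its presence is used here as the "already exists" marker.
    if (venv_dir / "pyvenv.cfg").is_file():
        return venv_dir, True
    if create:
        # with_pip=False keeps the sketch fast and offline; a real setup
        # would most likely want pip available inside the environment.
        venv.EnvBuilder(with_pip=False).create(venv_dir)
    return venv_dir, False
```

Calling the helper a second time for the same experiment returns `reused=True`, which is the point at which a crashed run could pick up the existing environment without asking the user again.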
Hi all I am currently using this branch at Can I do anything? Cheers, |
Hi @chrisdane, the boys are at the CliDyn retreat today and tomorrow, so I'm the only guy running support. Can you try merging and see how many of the conflicts are just dumb stuff like version numbers and whatnot? That would cut down on the conflicts. |
I've just realized that this is |
So, I've checked the history of this branch and it consists of 2 commits: one made by @pgierz to add functionality to the echam namelists, and one by @chrisdane to revert all the changes of the previous commit. So this branch in its current state is not different at all from what is included in Here a4080c5, you can find the actual merge into |
The work for that commit is halfway done. @chrisdane Found some inconsistencies, it's on my list. |
From email discussion earlier:

Hi Christian,

From my last information, Fernanda has a running Historical setup and is just waiting for the results. Fernanda, fingers crossed it all works :-)

I’m currently in the process of setting up automatic tests. These will run on any commit made to esm-tools. I would say we test that all scenarios at least find all the correct files and are able to run the first and second year (so, once a “cold start” and then the next restart). I am slightly grumbly about wasting electricity on things that should work, but I would rather we waste a little bit than spend weeks debugging problems for students and production runs for the senior scientists. What do you think?

Miguel, I may need to talk with you and Deniz briefly. We might need something that tells the sad runscript file to “wait” and give back the model exit code for the auto-tests to work. And we should also really consider renaming this sad file to a run file. It makes me sad whenever I look at it ;-)

Christian, can you post any other info directly on GitHub? Then it is available for anyone looking into the discussion. Sorry if the system did not set that up for you; perhaps I misconfigured something.

All the best |
There are currently various conflicts - is this something that can be quickly solved?
|
I've just made an issue in Jira for that. I'd say we do this when we have the monorepo up and running. |
I will have a look at the various merge conflicts. Christian, I will put a note here once I am done, could you then please have your student give it a try? |
Dear @pgierz @denizural @chrisdane. Our bachelor student Jule is under some time pressure to start a historical simulation with AWI-ESM2.1. Do you have an idea how severe the problems with hist_fixed are, so that we can estimate whether it makes sense for her to run historical simulations based on esm-tools or whether we have to look for an alternative? Please let me know in case the problem is on her side and not on the side of the branch itself. So far I have the impression that the problem lies in the branch. If I can help to either fix the problem or identify the glitches, please also let me know. Thanks a lot! |
Hi @christian-stepanek, is your student currently having any problems? If you encountered a runtime problem could you please inform us here? |
Dear @denizural, yes, we are currently still having problems getting the hist_fixed branch to work together with Paul's awiesm-2.1-wiso version that includes PICO-FESOM coupling. Thanks to a lot of help from @pgierz, I have managed to get to the point where a historical simulation is successfully submitted. During the startup phase all the necessary forcing files appear to arrive where they should. Yet, unfortunately, ECHAM6 crashes at startup due to a missing library (libpnetcdf).
Interestingly, the simulation does not halt at that point, but resubmits itself only to crash again with the same problem. We have had problems of that kind before; likely the error code is not properly caught or interpreted. To overcome the problem of the missing library, @pgierz suggested recompiling awiesm-2.1-wiso after switching to the hist_fixed branches of esm-runscripts and esm-tools. Yet that does not work, as apparently on that branch esm_master is not able to compile awiesm-2.1-wiso due to conflicting model sources (it expects other sources than those available in that version of the model). I have tried to overcome the crash due to the missing NetCDF library by adding the path where the model appears to expect the library to my PATH variable.
Indeed, the model then no longer crashes and produces NetCDF output. Yet that output is corrupted. I am at the end of my ideas and would really appreciate some help in getting this solved. Here is some information on where to find the model code, the versions of the tools components, as well as a list of the most relevant changes that I made to get the hist_fixed branches into the workflow. Path to simulation:
Path to model code:
esm_tools versions
Key steps taken to set up model, infrastructure, and simulation:
|
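The resubmit-after-crash behaviour described in the comment above suggests that the model's exit code is not checked before the next job is queued. A minimal sketch of such a check follows, assuming the model is started as a subprocess; `run_model` and `next_action` are hypothetical names for illustration and are not part of esm-tools or esm-runscripts.

```python
import subprocess
import sys


def run_model(command):
    """Run the model command, wait for it to finish, return its exit code."""
    # check=False: a non-zero exit code is returned instead of raising,
    # so the caller can decide explicitly what to do with it.
    completed = subprocess.run(command, check=False)
    return completed.returncode


def next_action(returncode, chunk, last_chunk):
    """Decide what to do after one run chunk (e.g. one model year)."""
    if returncode != 0:
        return "abort"      # never resubmit after a failed run
    if chunk < last_chunk:
        return "resubmit"   # successful, more chunks remain
    return "done"           # successful and this was the final chunk


if __name__ == "__main__":
    # Stand-in for the model executable: a Python process exiting with code 3.
    rc = run_model([sys.executable, "-c", "raise SystemExit(3)"])
    print(next_action(rc, chunk=1, last_chunk=2))
```

Returning the exit code to the caller is also what an automated "cold start plus first restart" test would need in order to report a failed run.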
@mandresm I heard from Paul that you are working on getting the wiso flavor of awiesm-2.1 into the pre-release branch, and that you are also testing historical. Could you please give a remark when historical is successfully working on the pre-release branch, and when we could employ that branch for our work? Thanks a lot. |
Dear @christian-stepanek, As far as I know awiesm-2.1-wiso is not supported in Currently, we are working on preparing version 6. The branch for version 6 has both integrated (standalone testing for historical runs is WIP, to be finished today), but I have not tested that they play well together. If you want, we can meet on Monday to give it a try in the future release branch. |
Dear @mandresm, Actually, at this point I do not really care anymore whether wiso is included in the model version or not. If you could point me to any awiesm-2.1 and esm-tools capable of running a historical simulation, that would be marvelous. It may be something that is not released yet, as long as it runs for this one particular simulation. We have a bachelor student here who does not have much time to finish her research. My intention was to let her use an up-to-date model version to generate a historical simulation. Hence I have spent the last days trying to figure out how to get the hist_fixed branch into the awiesm-2.1-wiso version by @pgierz. Yet maybe that plan was too ambitious. In the end I am happy if our student has any awiesm-2.1 model and an esm-tools at hand that can successfully run a historical simulation. We would also run a PALEO simulation thereafter, but as far as I know that setup type is generally stable in esm-tools. |
@mandresm I think we got very close to a running simulation, as outlined above. The hist_fixed branch was successfully implemented and the forcing files are also distributed to work. The last problem is the echam6 error with missing libpnetcdf.so. I think thereafter the simulation would actually run. In case you have any quick idea how this could be solved that would be marvelous as well. |
Can you provide a runscript? |
Of course. Here: /mnt/lustre02/work/ab0246/a270061/awiesm-2.1-wiso_NetCDF_error/pico-fesom/run_configs/run_hist_working.yaml The whole simulation runs in /mnt/lustre02/work/ab0246/a270061/awiesm-2.1-wiso_NetCDF_error/pico-fesom/experiments/run_hist/ |
Can you also provide the esm_tools path? As the error message says the problem is on echam missing |
Of course, also in that project directory:
|
I've compared your echam compilation scripts to a working compilation script in our newest version and there are no differences that would affect the libnetcdf. However, I see quite a few changes in the echam source code. I'd suggest you do a clean installation of awiesm-2.1 first and check if there are libraries missing with |
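The command this comment points to did not survive in the text above. As one possibility (an assumption, not necessarily what was meant), the output of `ldd` on Linux can be scanned for unresolved shared libraries. The helper names and the sample library name in the comment below are illustrative only.

```python
import subprocess


def parse_missing_libraries(ldd_output):
    """Return library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "\tlibfoo.so.1 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_libraries(binary_path):
    """Run ldd on a binary (Linux only) and list unresolved libraries."""
    result = subprocess.run(
        ["ldd", str(binary_path)],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_missing_libraries(result.stdout)
```

Running `missing_libraries` on the compiled echam binary inside the same environment the job uses (same loaded modules, same `LD_LIBRARY_PATH`) would show whether libpnetcdf is resolved at all, and printing the full `ldd` output would show from which path.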
Hi all,
I'd like to at some point get rid of this hist_fixed branch. It seems to be "long lived", and that's not really the point of a branch that only fixes something.
So, what actually needed to be changed to get the historical run to work? What of that is not yet in prep release?
@dbarbi @mandresm @chrisdane @denizural @christian-stepanek @fernandadialzira
Please ping anyone I forgot.