need reduced echam and jsbach output for spinup #384

Open
chrisdane opened this issue Jul 6, 2021 · 38 comments

@chrisdane
Contributor

Hi

Is your feature request related to a problem? Please describe.
The default namelist.echam for a PI-CTRL experiment, which is (almost always?) also used for spinups, writes a lot of data at sub-monthly output intervals because of the following namelist blocks:

# cd to esm_tools
git checkout develop
git fetch
# namelists/echam/6.3.04p1/PI-CTRL/namelist.echam:
&mvstreamctl
    interval = 1, 'days', 'last', 0
    target = 'glday'
    source = 'gl'
    variables = 'q:mean'
/
&mvstreamctl
    target = 'g3bday'
    interval = 1, 'days', 'last', 0
    source = 'g3b'
    meannam = 'tslm1', 'tsi'
/
&mvstreamctl
    interval = 1, 'days', 'last', 0
    target = 'g3bid'
    source = 'g3b'
    variables = 'u10:mean', 'v10:mean', 'temp2:mean', 'relhum:mean', 'albedo:mean',
                'dew2:mean', 'ws:mean', 'sn:mean', 'wimax:max', 't2min:min', 't2max:max'
/
&mvstreamctl
    interval = 1, 'days', 'last', 0
    target = 'jsbid'
    source = 'jsbach'
    variables = 'layer_moisture:mean'
/
&mvstreamctl
    interval = 6, 'hours', 'last', 0
    target = 'sp6h'
    source = 'sp'
    variables = 'st:mean', 'svo:mean', 'lsp:mean', 'sd:mean'
/
&mvstreamctl
    interval = 1, 'hours', 'last', 0
    target = 'g3b1hi'
    source = 'g3b'
    variables = 'u10:inst', 'v10:inst', 'wimax:max'
/

For a spinup, this is unnecessary and bad practice: nobody will ever need this data, yet it fills up the disks.

Describe the solution you'd like
As far as I understand the esm tools, I would like to have an esm_tools/namelists/echam/<version>/PI-CTRL-SPINUP/namelist.echam which is the same as esm_tools/namelists/echam/<version>/PI-CTRL/namelist.echam but with only a few important variables at monthly output frequency, or similar.

I tried to achieve this. I ran a default echam-only PI-CTRL experiment on ollie:

esm_tools branch: develop
runscript: /home/ollie/cdanek/esm/runscripts/echam-ollie-initial-monthly.yaml 
work: /work/ollie/cdanek/out/echam-6.3.04p1/pictrl-grb

The resulting echam output after 1 month is

no	name	interval	time	lon	lat	lev	nsp	nc2	file
1	u10      	hr 	744	192	96	  	    	  	pictrl-grb_200001.01_g3b1hi
2	v10      	hr 	744	192	96	  	    	  	pictrl-grb_200001.01_g3b1hi
3	wimax    	hr 	744	192	96	  	    	  	pictrl-grb_200001.01_g3b1hi
4	lsp      	6hr	124	   	  	  	2080	2	pictrl-grb_200001.01_sp6h  
5	sd       	6hr	124	   	  	47	2080	2	pictrl-grb_200001.01_sp6h  
6	svo      	6hr	124	   	  	47	2080	2	pictrl-grb_200001.01_sp6h  
7	t        	6hr	124	   	  	47	2080	2	pictrl-grb_200001.01_sp6h  
8	tsi      	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bday
9	tslm1    	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bday
10	albedo   	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bid 
11	dew2     	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bid 
12	rhumidity	day	31	192	96	47	    	  	pictrl-grb_200001.01_g3bid 
13	sn       	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bid 
14	t2max    	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bid 
15	t2min    	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bid 
16	temp2    	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bid 
17	u10      	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bid 
18	v10      	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bid 
19	wimax    	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bid 
20	ws       	day	31	192	96	  	    	  	pictrl-grb_200001.01_g3bid 
21	q        	day	31	192	96	47	    	  	pictrl-grb_200001.01_glday 
22	apmegl   	mon	1	192	96	  	    	  	pictrl-grb_200001.01_accw  
23	drain    	mon	1	192	96	  	    	  	pictrl-grb_200001.01_accw  
24	grndflux 	mon	1	192	96	  	    	  	pictrl-grb_200001.01_accw  
25	rogl     	mon	1	192	96	  	    	  	pictrl-grb_200001.01_accw  
26	runoff   	mon	1	192	96	  	    	  	pictrl-grb_200001.01_accw  
27	snacl    	mon	1	192	96	  	    	  	pictrl-grb_200001.01_accw  
28	snmel    	mon	1	192	96	  	    	  	pictrl-grb_200001.01_accw  
29	sodif    	mon	1	192	96	  	    	  	pictrl-grb_200001.01_accw  
30	aclc     	mon	1	192	96	47	    	  	pictrl-grb_200001.01_aclcim
31	t2max    	mon	1	192	96	  	    	  	pictrl-grb_200001.01_g3bim 
32	t2min    	mon	1	192	96	  	    	  	pictrl-grb_200001.01_g3bim 
33	topmax   	mon	1	192	96	  	    	  	pictrl-grb_200001.01_g3bim 
34	xi       	mon	1	192	96	47	    	  	pictrl-grb_200001.01_glim  
35	xl       	mon	1	192	96	47	    	  	pictrl-grb_200001.01_glim  
36	lsp      	mon	1	   	  	  	2080	2	pictrl-grb_200001.01_spim  
37	sd       	mon	1	   	  	47	2080	2	pictrl-grb_200001.01_spim  
38	svo      	mon	1	   	  	47	2080	2	pictrl-grb_200001.01_spim  
39	t        	mon	1	   	  	47	2080	2	pictrl-grb_200001.01_spim  

and for jsbach

no	name	interval	time	lon	lat	depth	lev	file
1	layer_moisture	day	31	192	96	5	5	pictrl-grb_200001.01_jsbid

To test for reduced echam output, I ran the experiment

esm_tools branch: echam_scenario_PI-CTRL-SPINUP
runscript: /home/ollie/cdanek/esm/runscripts/echam-ollie-initial-monthly-spinup.yaml 
work: /work/ollie/cdanek/out/echam-6.3.04p1/pictrl-spinup-grb

The resulting echam output after 1 month is

no	name	interval	time	lon	lat	lev	nsp	nc2	file
1	aclcov	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
2	ahfl  	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
3	ahfs  	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
4	albedo	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
5	aprc  	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
6	aprl  	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
7	aprs  	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
8	aps   	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
9	dew2  	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
10	evap  	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
11	friac 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
12	geosp 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
13	qvi   	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
14	srad0d	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
15	srad0u	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
16	srads 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
17	sradsu	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
18	sraf0 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
19	t2max 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
20	t2min 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
21	temp2 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
22	topmax	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
23	trad0 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
24	trads 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
25	tradsu	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
26	traf0 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
27	trafs 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
28	tsurf 	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
29	u10   	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
30	ustr  	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
31	v10   	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
32	vstr  	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
33	wind10	mon	1	192	96	  	    	  	pictrl-spinup-grb_200001.01_g3bmon
34	q     	mon	1	192	96	47	    	  	pictrl-spinup-grb_200001.01_glmon 
35	lsp   	mon	1	   	  	  	2080	2	pictrl-spinup-grb_200001.01_spmon 
36	sd    	mon	1	   	  	47	2080	2	pictrl-spinup-grb_200001.01_spmon 
37	svo   	mon	1	   	  	47	2080	2	pictrl-spinup-grb_200001.01_spmon 
38	t     	mon	1	   	  	47	2080	2	pictrl-spinup-grb_200001.01_spmon 

and for jsbach

no	name	interval	time	lon	lat	depth	lev	file
1	cover_fract       	mon	1	192	96	  	11	pictrl-spinup-grb_200001.01_jsbachmon
2	veg_ratio_max_mean	mon	1	192	96	  	1	pictrl-spinup-grb_200001.01_jsbachmon
3	layer_moisture    	mon	1	192	96	5	5	pictrl-spinup-grb_200001.01_jsbachmon
4	lai               	mon	1	192	96	  	1	pictrl-spinup-grb_200001.01_landmon  
5	snow_fract        	mon	1	192	96	  	1	pictrl-spinup-grb_200001.01_landmon  
6	soil_temperature  	mon	1	192	96	5	5	pictrl-spinup-grb_200001.01_landmon  

The total size of outdata of 1 month reduces from

cd /work/ollie/cdanek/out/echam-6.3.04p1/pictrl-grb/outdata
find -name "*200001*" | sort | xargs du -hcs
135K    ./echam/pictrl-grb_200001.01_accw
1.5K    ./echam/pictrl-grb_200001.01_accw.codes
1.1M    ./echam/pictrl-grb_200001.01_aclcim
512     ./echam/pictrl-grb_200001.01_aclcim.codes
79M     ./echam/pictrl-grb_200001.01_g3b1hi
512     ./echam/pictrl-grb_200001.01_g3b1hi.codes
2.2M    ./echam/pictrl-grb_200001.01_g3bday
512     ./echam/pictrl-grb_200001.01_g3bday.codes
63M     ./echam/pictrl-grb_200001.01_g3bid
1.5K    ./echam/pictrl-grb_200001.01_g3bid.codes
109K    ./echam/pictrl-grb_200001.01_g3bim
512     ./echam/pictrl-grb_200001.01_g3bim.codes
52M     ./echam/pictrl-grb_200001.01_glday
512     ./echam/pictrl-grb_200001.01_glday.codes
2.9M    ./echam/pictrl-grb_200001.01_glim
512     ./echam/pictrl-grb_200001.01_glim.codes
164M    ./echam/pictrl-grb_200001.01_sp6h
512     ./echam/pictrl-grb_200001.01_sp6h.codes
1.4M    ./echam/pictrl-grb_200001.01_spim
512     ./echam/pictrl-grb_200001.01_spim.codes
2.2M    ./jsbach/pictrl-grb_200001.01_jsbid
512     ./jsbach/pictrl-grb_200001.01_jsbid.codes
367M    total

to

cd /work/ollie/cdanek/out/echam-6.3.04p1/pictrl-spinup-grb/outdata
find -name "*200001*" | sort | xargs du -hcs
1.3M    ./echam/pictrl-spinup-grb_200001.01_g3bmon
4.0K    ./echam/pictrl-spinup-grb_200001.01_g3bmon.codes
1.7M    ./echam/pictrl-spinup-grb_200001.01_glmon
512     ./echam/pictrl-spinup-grb_200001.01_glmon.codes
1.4M    ./echam/pictrl-spinup-grb_200001.01_spmon
512     ./echam/pictrl-spinup-grb_200001.01_spmon.codes
247K    ./jsbach/pictrl-spinup-grb_200001.01_jsbachmon
512     ./jsbach/pictrl-spinup-grb_200001.01_jsbachmon.codes
102K    ./jsbach/pictrl-spinup-grb_200001.01_landmon
512     ./jsbach/pictrl-spinup-grb_200001.01_landmon.codes
4.6M    total

This is a reduction of roughly 100 - 4.6 MB / 367 MB * 100 ~ 99%, even though many of the most important variables are still included. Of course, one could argue whether e.g. the 3D variable q (specific humidity) needs to be included, or whether other variables should be added, and so on ...
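The arithmetic above can be checked with a few lines of Python. This is only a sketch: the two sizes are the `du` totals quoted in this post, and `reduction_percent` is a name made up for illustration.

```python
def reduction_percent(before_mb: float, after_mb: float) -> float:
    """Relative size reduction in percent."""
    return 100.0 * (1.0 - after_mb / before_mb)


default_mb = 367.0  # du total for one month of the default PI-CTRL run
spinup_mb = 4.6     # du total for one month of the reduced spinup run

print(f"reduction:  {reduction_percent(default_mb, spinup_mb):.1f} %")
print(f"size ratio: {default_mb / spinup_mb:.0f}x")
```

Note that `du -h` rounds its totals, so the result is only as precise as those two numbers.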

I would be very happy if you could implement this in some way as a standard for echam since I am really sick of this huge unnecessary spinup output everywhere.

Thanks a lot for consideration,
Chris

PS: I couldn't get the jsbach stream veg to work (vegmon in the new namelist.echam on the echam_scenario_PI-CTRL-SPINUP branch).

@pgierz
Member

pgierz commented Jul 6, 2021

Can someone PLEASE PLEASE put this in? It's been on my list for months.

@pgierz
Member

pgierz commented Jul 6, 2021

How would this be best solved, Christopher?

echam:
    scenario: PI-CTRL-SPINUP

Or rather:

echam:
    namelist_variant: spinup # Or reduced, minimal, or something

I think using a new "scenario" may be confusing to some people

@chrisdane
Contributor Author

I don't know. One of the esm tools leads needs to decide that.

Another point I cannot decide is how to distinguish different stream definitions in echam.yaml and jsbach.yaml in a clean, esm tools way.

@chrisdane
Contributor Author

chrisdane commented Jul 8, 2021

There is one problem with defining scenario-dependent output streams for echam/jsbach. The echam.yaml entries

streams:
        - accw
        - co2
        - echam
        - g3bid
        - g3bim
        - g3bday
        - g3b1hi
        - glday
        - aclcim
        - sp6h
        - spim
streamsnc:
        - aclcim
        - g3b1hi
        - g3bday
        - g3bid
        - g3bim
        - glday
        - glim
        - jsbid
        - sp6h
        - spim

define which output files will be moved from work to outdata/<model>/. This way, some files are ignored if I change the stream definitions in the namelist, or, the other way around, the esm tools complain that there are missing files if the namelist does not define some of these streams.

In my view the solution would be to consider all work/<expname>_<current_date>.01_* files and drop the streams and streamsnc lists completely (<current_date> may be YYYYMM or something similar, I don't know all cases). Would this be in accordance with how the esm tools work?
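The idea of deriving the streams from the files found in work/ could be sketched as follows. This is not esm_tools code: the function names are hypothetical, and the file name pattern only covers the `<expname>_<YYYYMM>.<DD>_<stream>` layout visible in the listings in this thread (with an optional `.codes` or `.nc` suffix). Other date formats would need their own pattern.

```python
import re
from pathlib import Path

# Assumed layout, read off the listings in this thread:
#   <expid>_<YYYYMM>.<DD>_<stream>[.codes|.nc]
OUTPUT_FILE = re.compile(
    r"^(?P<expid>.+)_(?P<date>\d{6})\.(?P<day>\d{2})_"
    r"(?P<stream>[A-Za-z0-9]+)(?P<ext>\.codes|\.nc)?$"
)


def streams_from_names(names, expid):
    """Return the sorted stream tags among `names` that belong to `expid`."""
    streams = set()
    for name in names:
        match = OUTPUT_FILE.match(name)
        if match and match.group("expid") == expid:
            streams.add(match.group("stream"))
    return sorted(streams)


def discover_streams(work_dir, expid):
    """Hypothetical helper: scan a work directory instead of using a fixed list."""
    return streams_from_names((p.name for p in Path(work_dir).iterdir()), expid)
```

Because the stream tag is taken from the file name itself, a stream added to the namelist would be picked up without touching a yaml list, and `.codes` files map to the same tag as their data file.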

@mandresm @dbarbi could you please comment on that? Thanks a lot!

@pgierz
Member

pgierz commented Jul 9, 2021

Distinguishing that cleanly isn't really so easy. The Namelist for echam also influences jsbach output, but then we have two different yaml files and also two different namelists even just controlling model behaviour. Confusion lurks around every single corner there.

Ultimately, we need to find some way of getting a single source of truth for the streams, and define that only once and have it propagate everywhere.

Chris, did you make a branch? Given that we have space issues, I'd like to put that in and update the handbook

@mandresm
Contributor

mandresm commented Jul 9, 2021

Thanks @chrisdane for this suggestion, I think it's a very good point. About the streams, I would need to have a look again at the echam files. However, even if scenario is an extra variable, I think it is a very intuitive one and could help us in the setup of the strings and on the selection of the namelist_dir. @pgierz , why do you say scenario might be a confusing variable for some people?

@pgierz
Member

pgierz commented Jul 9, 2021

"Scenario" controls a lot more than just output, also for example which model features should be active. PI shouldn't have human land use change, a future run should. I would therefore recommend separating output control from the scenario.

@chrisdane
Contributor Author

Thanks Miguel. I agree that having another scenario is not confusing.

@pgierz1: Having those streams in the echam.yaml and jsbach.yaml is in my view the source of the confusion. That's why I suggest dropping them completely. I suggested an alternative, and my question was whether this is possible.

@pgierz2: Yes, I made an esm_tools branch and mentioned it several times in my initial post: echam_scenario_PI-CTRL-SPINUP.

@pgierz3: Separating output control and scenario is a good idea. But for echam/jsbach, the output control is defined in the namelist.echam. This namelist, in turn, enters the esm tools via scenario. You would have to change this workflow if you want to separate output control and scenario. It would certainly be possible, but I'm afraid it would mean much more work than just adding another scenario.

@pgierz
Member

pgierz commented Jul 9, 2021

Chris is correct, that would need a bit of a stronger rewrite.

We could, for example, add one more layer of folders? Or maybe better, merge in the output part of the name list, separate from whatever defines physical behaviour? Do we have a merge feature for namelists yet? They are just dictionaries on the python side.

I would again strongly suggest against making the output dependent on the scenarios. That (gut feeling) seems too easy to mess up accidentally. We will fix one, but forget the other.
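A minimal sketch of the namelist merge idea raised in this comment, assuming the namelists really are plain nested dictionaries on the Python side. `merge_namelists` is a hypothetical name, not an existing esm_tools function, and the example values are illustrative (the `runctl` entries are taken from the HIST namelist quoted in this thread, the monthly `mvstreamctl` entry is made up).

```python
import copy


def merge_namelists(physics, output):
    """Return a new namelist dict: `physics` updated group by group with `output`."""
    merged = copy.deepcopy(physics)
    for group, values in output.items():
        if isinstance(values, dict) and isinstance(merged.get(group), dict):
            # Same group in both parts: output settings win key by key.
            merged[group].update(copy.deepcopy(values))
        else:
            # Repeated groups such as mvstreamctl are modelled as a list of
            # dicts here and are replaced wholesale, not merged element-wise.
            merged[group] = copy.deepcopy(values)
    return merged


physics = {"runctl": {"putdata": [3, "hours", "last", 0], "default_output": True}}
output = {
    "runctl": {"default_output": False},
    "mvstreamctl": [
        {
            "interval": [1, "months", "last", 0],
            "target": "g3bmon",
            "source": "g3b",
            "variables": ["temp2:mean"],
        }
    ],
}
merged = merge_namelists(physics, output)
print(merged["runctl"])
```

The `runctl` case shows why a clean split is awkward: `default_output` lives in a group that otherwise controls the run, so the "output part" has to reach into it.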

@pgierz
Member

pgierz commented Jul 9, 2021

Chris, sorry I didn't see your branch in the initial post. Info is obviously there, I just need to learn how to read...

@chrisdane
Contributor Author

chrisdane commented Jul 9, 2021

Paul, the output definitions for echam/jsbach are not as simple as for fesom. Separating the stream definitions from everything else in the namelist.echam would yield more confusion in my view. For example, &runctl:default_output, i.e. a namelist parameter from another chapter within the namelist, affects the output as well.

I don't see a problem in making the output dependent on the scenario. In fact, that was my initial motivation.

@mandresm
Contributor

mandresm commented Jul 9, 2021

I think this discussion is very interesting and valuable, but I also think we need to include the main advanced users of ECHAM in it if we are going to make a major change that affects the streams. What do you think?

@chrisdane
Contributor Author

Sure please do so =)

@dbarbi
Member

dbarbi commented Jul 9, 2021

The original idea was to use the streams and streamsnc arrays to automatically GENERATE the output stream part in the namelist.echam, but we never got there.

@chrisdane
Contributor Author

Ok. I don't think this is feasible. How would you specify that e.g. 1) only certain variables from a specific stream should be saved 2) at a specific temporal interval? If the yaml lists streams and streamsnc are extended to support those and more features, I feel you end up with entries similar to those in the actual namelist.echam. Why reinvent the wheel?

@dbarbi
Member

dbarbi commented Jul 9, 2021

Initially, because you could use the same style of writing to set the output of fesom, openifs, nemo, etc. without having to do it differently for each model.
But it is some amount of work, and no one really called for it, so...

@chrisdane
Contributor Author

chrisdane commented Jul 21, 2021

There is a related problem with the historical echam namelist, i.e. namelists/echam/6.3.04p1/HIST/namelist.echam. The entries

&runctl
  PUTDATA        = 3,'hours','last',0
  default_output = .true.

and

&mvstreamctl
    interval = 1, 'days', 'last', 0
    target = 'g3bid'
    source = 'g3b'
    variables = ... , 'temp2:mean', ...
/

yield, for annual runs, the variable temp2 in the annual files

<expid>_<YYYY>01.01_echam.nc # ntime = 2928 --> 3hourly output
<expid>_<YYYY>01.01_g3bid.nc # ntime = 366 --> daily output (this test year is a leap year)

The annual (cdo yearmean) temp2 anomaly between the _echam and the _g3bid files looks like this:
[figure: anomaly_yearmean]
or, in numbers:

cdo info anomaly_yearmean.nc
Minimum        Mean     Maximum
 -0.064362  1.9698e-05    0.061951 # Kelvin

The monthly (cdo -seltimestep,1 -monmean) temp2 anomaly between the _echam and the _g3bid files is larger:
[figure: anomaly_monmean_mon1]
or, in numbers:

cdo info anomaly_monmean_mon1.nc
Minimum        Mean     Maximum
-0.12109  -0.0027946     0.11356 # Kelvin

The daily (cdo -seltimestep,1 -daymean) temp2 anomaly between the _echam and the _g3bid files is even larger:
[figure: anomaly_daymean_day1]
or, in numbers:

cdo info anomaly_daymean_day1.nc
Minimum        Mean     Maximum
 -1.0143  -0.0027390      1.2628 # Kelvin

Since temp2 from the _g3bid file is explicitly set to be the mean ('temp2:mean'), temp2 from the _echam file seems to represent something else, maybe a snapshot; it's not clear. I think this is another argument for setting default_output to false, since it's not clear what the output is.

@pgierz
Member

pgierz commented Jul 21, 2021

I can briefly comment on this one: temp2 in the default_output = true is misleading. In fact the entire "default output" is...well, let's go with the description "weird". The file in that case is a mixture of snapshots and means. Christian is currently working on a table to definitively say which is which, but if I remember correctly temp2 was snapshots. I would therefore recommend caution using that particular variable in that file for any "sensible" analysis....

If you are after monthly means, I have a few template namelists that we are ironing out here. Check out the "production" version, that might be what you need. You actually even made the spinup ones :-) https://gitlab.awi.de/paleodyn/Models/namelists

Chris, to understand your screenshots (maybe I just need to read): those are anomalies between two files of the same run? We saw similar patterns comparing tsurf and temp2, but that was in one case a snapshot and in the second a monthly mean. You could rather clearly see where the sun was for the snapshot case (of course, there are also other differences -- tsurf shows whatever the actual surface is. SST, soil, plant canopy....)

@chrisdane
Contributor Author

those are anomalies between two files of the same run?

Yes. Same variable, same run, different output files.

@pgierz
Member

pgierz commented Jul 21, 2021

Alright, wrong place to complain, but: "ugh....echam....why"

Consider the following more to be "public note taking" (or whatever thinking out loud is for a forum):

For your screenshots, what you have is -- if I understand it correctly -- a yearly (or monthly, or daily) average of snapshots vs a yearly average of daily means. That would (maybe) explain the vertical bands you see there (day/night difference??) It's quite a small difference though, I would have expected something clearer if you're always capturing European midnight/3am/6am/noon.

Long story short, depending on the analysis you want to do, I would instinctively prefer the data in the g3bid files.

Yet another point on my ever-growing list of why we need to fix the echam namelist for sensible output....
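A toy illustration of "average of snapshots vs. average of means", using a synthetic series rather than model data. Everything here is an assumption made for the sketch: the 600 s time step, the sinusoidal diurnal cycle, and a one-hour warm spike placed between two 3-hourly sampling times. It also assumes that a `:mean` stream averages over every time step of the output interval.

```python
import math

DT = 600                          # assumed model time step in seconds
STEPS_PER_DAY = 86400 // DT       # 144 steps per day
SNAPSHOT_EVERY = 3 * 3600 // DT   # one snapshot every 3 hours = every 18 steps


def temp2(step):
    """Synthetic 2 m temperature in K: diurnal cycle plus a one-hour warm spike."""
    seconds = step * DT
    diurnal = 5.0 * math.sin(2.0 * math.pi * seconds / 86400)
    spike = 3.0 if 13 * 3600 <= seconds < 14 * 3600 else 0.0
    return 280.0 + diurnal + spike


series = [temp2(step) for step in range(STEPS_PER_DAY)]
mean_all_steps = sum(series) / len(series)         # what a daily mean would see
snapshots = series[::SNAPSHOT_EVERY]               # 00, 03, ..., 21 h
mean_snapshots = sum(snapshots) / len(snapshots)   # daymean of 3-hourly snapshots

print(f"daily mean over all steps:  {mean_all_steps:.3f} K")
print(f"mean of 3-hourly snapshots: {mean_snapshots:.3f} K")
print(f"difference:                 {mean_all_steps - mean_snapshots:+.3f} K")
```

The smooth diurnal cycle averages out in both cases. The difference comes entirely from variability shorter than the snapshot interval, which the snapshots can miss or over-represent.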

@chrisdane
Contributor Author

It's quite a small difference though

For someone working with daily data the difference is on the order of 1 K, i.e. super large :p

@pgierz
Member

pgierz commented Jul 21, 2021

Yes, I was more referring to the top two figures. Funny how it is so localized over North America. I would have guessed that if you have 3 hour output and average noon/3pm/6pm/9pm/midnight/3am/6am/9am that there isn't such a clear spatial pattern. Plus you can see waves over Eurasia. Odd....but, as I said, I'd just use the data in the other file, that will be at least clearer.

And still, one for the list: we need to tame echam output. Urgently.

@chrisdane
Contributor Author

The default stream definitions are printed in every atmout. The echam:temp2 variable is defined as

name               : g3b
output  file suffix: _echam
name                            units rank  ke alloc.  grid prt acc mis rst tbl cde bit lev_type
temp2                           K        2   1 T GAUSSIAN    T   F   F   T  128 167  16 SURFACE 

whereas the g3bid:temp2 variable is defined as

name               : g3bid
output  file suffix: _g3bid
name                            units rank  ke alloc.  grid prt acc mis rst tbl cde bit lev_type
temp2                           K        2   1 T GAUSSIAN    T   T   F   T  128 167  16 SURFACE

that means

echam:temp2:acc=laccu=F # see mo_linked_list.f90 for how the lines above are printed in atmout
g3bid:temp2:acc=laccu=T

The echam6 documentation says (p. 119):

In order to write a field to an output file, lpost=.true. must be specified. Generally
the actual values of the field are written. However, if laccu=.true. is specified, the
values are divided by the number of seconds of the output interval before output and set
to the value of the variable reset afterwards.

And indeed, the default of laccu is false (p. 118), which also explains the differences between those two temp2 variables, since laccu=true yields:

“Accumulation” flag: Does no accumulation but divides variable by the number of seconds of the output interval and resets it to reset after output.

That would mean that for any variable, that is not explicitly defined via

&mvstreamctl
    interval = 1, '<interval>', 'last', 0
    target = '<filetag_where_varname_should_be_saved>'
    source = '<echam_streamname_from_which_to_take_varname>'
    variables = ... , '<varname>:mean', ...
/

, one must check the respective atmout-entry to figure out if it represents accumulated (laccu=true), i.e. averaged values, or non-accumulated (laccu=false), i.e. snapshots?
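Checking the acc column by hand can be scripted. The sketch below splits the header and the variable row of one atmout stream table on whitespace and looks up the `acc` flag. It assumes every column holds exactly one token, as in the two rows quoted above; a row with an empty units field would not line up and is rejected. The function names are hypothetical.

```python
HEADER = (
    "name                            units rank  ke alloc.  grid prt acc "
    "mis rst tbl cde bit lev_type"
)


def parse_stream_row(header, row):
    """Map the column names of an atmout stream table to the tokens of one row."""
    keys = header.split()
    values = row.split()
    if len(keys) != len(values):
        raise ValueError(f"expected {len(keys)} columns, got {len(values)}: {row!r}")
    return dict(zip(keys, values))


def is_accumulated(header, row):
    """True if the acc (laccu) flag of this variable is T."""
    return parse_stream_row(header, row)["acc"] == "T"


echam_row = "temp2                           K        2   1 T GAUSSIAN    T   F   F   T  128 167  16 SURFACE"
g3bid_row = "temp2                           K        2   1 T GAUSSIAN    T   T   F   T  128 167  16 SURFACE"

print("echam temp2 accumulated:", is_accumulated(HEADER, echam_row))
print("g3bid temp2 accumulated:", is_accumulated(HEADER, g3bid_row))
```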

@pgierz
Member

pgierz commented Jul 23, 2021

One note: the output seems to be recorded for any changed stream. I'm not sure if the default is also written to the log. If it is, that'd be great to know; please also forward that info to Christian.

At some point, we need to ask Hamburg to clarify. Perhaps that point has been reached. At least on my end, I'm out of expertise. Sorry.

@chrisdane
Contributor Author

@mandresm @denizural @dbarbi could you please implement some kind of switch so that switching namelist.echam and/or namelist.jsbach becomes possible? The current workflow via streams (and streamsnc) makes this impossible as files from user-specific streams will be ignored if they are not set in those echam.yaml or jsbach.yaml lists.

@pgierz
Member

pgierz commented Jul 27, 2021

You could try to implement that yourself. I can take 20 minutes tomorrow and show you the relevant code that would need to be modified.

Open source works best if anyone who wants a feature tries to build it. That also would be good to increase overall knowledge of the code base by more than just the core team.

So, if you want to try, pull develop, and start a new branch from there, open a draft PR, and via comments and whatever, we will talk you through it :-)

@chrisdane
Contributor Author

Dropping echam.yaml:streams would be a large modification of the esm tools. I have neither enough knowledge nor enough time for this =)

@pgierz pgierz mentioned this issue Aug 5, 2021
@pgierz
Member

pgierz commented Aug 24, 2021

There is a relevant PR that allows you to use the echam namelist to automatically define the streams. You must set it up in your run config. Please see: esm-tools/esm_runscripts#165

Closing for now, re-open if more discussion is needed.

@pgierz pgierz closed this as completed Aug 24, 2021
@chrisdane
Contributor Author

The PR does not work.

The following workaround quasi-works:

  1. specify an individual namelist.echam and further reading files via
echam:
    namelist_dir: "/path/to/my/special/namelist.echam/"
    further_reading: "echam_myoutput.yaml"
jsbach:
    further_reading: "jsbach_myoutput.yaml"
  2. As for FESOM, list the echam and jsbach streams that the special namelist.echam creates:
cat echam_myoutput.yaml
streams:
        - echamstream1
        - echamstream2
streamsnc: ${streams}

and

cat jsbach_myoutput.yaml
streams:
        - jsbachstream1
  3. Run the model

The following does not work yet:

  1. Only January outdata files are moved to the respective outdata/<model> dirs. The output of all other months is moved to unknown/
  2. .codes files are not considered at all and are moved to unknown/

@chrisdane chrisdane reopened this Oct 14, 2021
@pgierz
Member

pgierz commented Oct 15, 2021

I would not say the PR does not work. I would say it does not work yet ;-)

Can you send me the path to your experiment? I will have a look.

@chrisdane
Contributor Author

chrisdane commented Oct 15, 2021

/work/ba1103/a270073/out/awicm-1.0-recom/awi-esm-1-1-lr_kh800/historical2

Correction: the 2 problems affect jsbach files only.

@JanStreffing
Contributor

Catching up on this: Does the reduced output look realistic in release 6?

@chrisdane
Contributor Author

chrisdane commented Apr 5, 2022

Due to the totally different output strategies of the individual models, I think a general out-of-the-box implementation in the esm_tools is difficult.

For fesom it works quite well via fesom's output scheduler and a yaml file listing all wanted variables at the wanted output frequencies.

For echam/jsbach it's more complicated. I found a workflow that works for me, using

  1. namelists of echam and jsbach
  2. specific stream lists in the echam and jsbach yamls

for a set of variables that I want to save at a desired frequency. I don't know how to properly implement this in the esm_tools so that it's useful. Work in progress, I would say.

For other models I dont know.

@JanStreffing: yes of course the output is realistic.

@mandresm
Contributor

Hi all,

I've been looking at this issue for a bit and also to the solution @pgierz offered, that I believe never got merged (esm-tools/esm_runscripts#165). Is that correct?

I see the solutions given to, and used by, @chrisdane as provisional, because in my eyes there is a "major" issue, as @chrisdane points out: the information is duplicated. Not only that, but if you add a stream to a namelist and then forget to update the stream list, you won't notice unless you read the log files in great detail or monitor the output to make sure it doesn't get dumped into the unknown folder.

I think the streams for echam and jsbach should be built based on the final namelist as @pgierz tried in his PR.

I think it would also be great to add control of the streams through the runscript (still providing a good namelist.echam template which the user modifies through the runscript). That would mean new syntax specific to ECHAM, but it could be made as close as possible to the namelist syntax, or to the existing namelist_changes syntax, so that it is intuitive to use.

Please let me know what you think about these two ideas (getting a version of what Paul did into release 6, and stream-namelist modification through the runscript). Alternatively, if you are okay with your workflow now and you don't want any changes, you can go ahead and close the issue.

@chrisdane
Contributor Author

chrisdane commented Jun 17, 2022

Hi

The PR esm-tools/esm_runscripts#165 does not work.

I think it's close to impossible to build an algorithm that deals with all possible namelist.echam stream definitions. It's a black box.

I think the better solution is to put a list of streams (and streamsnc) in a separate file next to the used namelist.echam file, overwriting the default streams. E.g. streams1=stream1,stream2,... to namelist.echam1 in which model output from stream1 and stream2 is wanted and streams2=stream3,stream4,... to namelist.echam2 in which model output from stream3 and stream4 is wanted. Sure, that's not beautiful, but at least it would enable some sort of modularity.
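The sidecar file idea could look roughly like this. It is a sketch under assumptions, not esm_tools behaviour: the file name `streams.txt`, the one-tag-per-line format with `#` comments, and both function names are invented for illustration.

```python
from pathlib import Path


def parse_stream_list(text):
    """Parse one stream tag per line; blank lines and '#' comments are skipped."""
    streams = []
    for line in text.splitlines():
        tag = line.split("#", 1)[0].strip()
        if tag:
            streams.append(tag)
    return streams


def streams_for_namelist(namelist_path, default_streams, sidecar_name="streams.txt"):
    """Use the stream list stored next to the namelist, else fall back to the defaults."""
    sidecar = Path(namelist_path).parent / sidecar_name
    if sidecar.is_file():
        return parse_stream_list(sidecar.read_text())
    return list(default_streams)
```

With this layout each namelist directory carries its own stream list, so switching `namelist_dir` switches both together. The two files still have to be kept in sync by hand, which is the duplication discussed elsewhere in this thread.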

So I am ok with closing, but this issue remains a dead end for the modularity approach of the esm_tools :(

@mandresm
Contributor

My idea was not to build a full syntax for the streams, but instead let the user include their modifications to the namelist through the runscript with the same structure of the namelists, with sections and variables, the same way that was done for namelist_changes but special to the streams in that repetition of sections would be allowed (which is not the case in namelist_changes). That's one point and it might be an overkill.

The other point is the PR esm-tools/esm_runscripts#165. That one is broken, but it is fixable, as the problem it is trying to solve is relatively simple (at least as I understand it, maybe I am missing something important): read the namelist and extract the stream names. This might be something we want to pursue in the future. Anyway, let's see if someone reopens this one or something similar in the future.
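"Read the namelist and extract the stream names" could be sketched with a regular expression, as below. This is not the code from the PR. It only collects explicit `target = '...'` entries of `&mvstreamctl` groups, as in the blocks quoted at the top of this issue. Groups without a `target` entry are skipped, since the rule echam uses to name their files is not covered here, and streams defined by other means (for example the default output) are not detected at all.

```python
import re

# One &mvstreamctl group: everything up to a line that contains only "/".
GROUP = re.compile(
    r"&mvstreamctl\b(.*?)^\s*/\s*$",
    re.IGNORECASE | re.DOTALL | re.MULTILINE,
)
# An explicit, uncommented target entry inside a group.
TARGET = re.compile(r"^\s*target\s*=\s*'([^']*)'", re.IGNORECASE | re.MULTILINE)


def mvstream_targets(namelist_text):
    """Return the target tags of all &mvstreamctl groups, in file order."""
    targets = []
    for body in GROUP.findall(namelist_text):
        match = TARGET.search(body)
        if match:
            targets.append(match.group(1))
    return targets


EXAMPLE = """\
&mvstreamctl
    interval = 1, 'days', 'last', 0
    target = 'glday'
    source = 'gl'
    variables = 'q:mean'
/
&mvstreamctl
    target = 'g3bday'
    interval = 1, 'days', 'last', 0
    source = 'g3b'
    meannam = 'tslm1', 'tsi'
/
"""

print(mvstream_targets(EXAMPLE))
```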

@pgierz
Member

pgierz commented Jun 17, 2022

I would not say close to impossible. After all, there are rules inside of echam for how it produces output and what those files are called. Rules which we could replicate. Yes, it is very echam specific, but we already have model-specific things in the tools. See for example Oasis. Plus, Echam being sketchy about its documentation should be a lesson to anyone who thinks about writing climate code. Our own project is also slowly working on improving that.

To me this boils down at the end to a design question. Duplicate information is by definition error prone. You are bound to forget one of your multiple places. I'd like to keep this issue open, as a place for discussion if nothing else.

@pgierz pgierz reopened this Jun 17, 2022
@pgierz pgierz added Hackathon PRs and Issues for hackaton enhancement New feature or request echam labels Jun 21, 2022
@pgierz pgierz self-assigned this Jun 21, 2022
@github-actions

This issue has been inactive for the last 365 days. It will now be marked as stale and closed after 30 days of further inactivity. Please add a comment to reset this automatic closing of this issue or close it if solved.

@github-actions github-actions bot added Stale and removed Stale labels Aug 22, 2023