Add AIRCRAFT prepbufr to SPOC #5

Open
2 of 4 tasks
nicholasesposito opened this issue Oct 17, 2024 · 5 comments

nicholasesposito commented Oct 17, 2024

Add acft_profiles prepbufr scripts to the SPOC repository.

These are the high-level scripts.

  • Build the backend
  • Write scripts
  • Compare to GSI diag and GSI IODA
  • Compare to low level outputs, generated by global workflow using gdasprepatmiodaobs

All of the comparisons are done here:
https://docs.google.com/presentation/d/1CQCtbzntg2hXruLKEqqEVuKsMizE8h2DjPmAcka29eg/edit?pli=1#slide=id.g2fc02b94591_0_31

Notes:

  • There are slightly fewer obs in the high-level backend outputs than in GSI IODA and the low-level API output. This is because the scripts I wrote use a time window that is inclusive at the start but exclusive at the end (-3 hours to +2.999999 hours from cycle time). GSI and the low-level API are inclusive at both -3 hours and +3 hours, though this may be a bug in GSI, since the window is supposed to end at +2.999999 rather than include +3. EDIT: This was indeed a bug. I changed my GSI so that obs at exactly +3 hours are excluded.
  • For instantaneousAltitudeRate, GSI makes all "missing" values zero, so that is what I did as well.
  • There are no virtualTemperature readings from AIRCFT, though the option to add them using the TVO mnemonic is there.
  • ObsTypes 134 and 234 are filtered out by GDASApp scripts between the GSI diag and GSI IODA steps. The converter leaves 134 and 234 in, so its output must not be sent out for external use; these types are for internal use ONLY. All plots filter out these data types, although some of the "hidden plots" in the slides contain this data.
  • The low-level API output is somewhat different because it is inclusive at +3 hours; all of that data is also in the prepbufr files.
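
The window difference described in the notes can be illustrated with a minimal sketch. The function name `in_window` and the use of `datetime` objects are assumptions for illustration only; this is not the code in the SPOC scripts.

```python
from datetime import datetime, timedelta

# Hypothetical helper: half-open window [cycle - 3 h, cycle + 3 h).
# An ob at exactly -3 h is kept; an ob at exactly +3 h is dropped.
def in_window(obs_time, cycle_time, half_width=timedelta(hours=3)):
    return cycle_time - half_width <= obs_time < cycle_time + half_width

cycle = datetime(2024, 10, 17, 12)
print(in_window(cycle - timedelta(hours=3), cycle))  # start bound is inclusive
print(in_window(cycle + timedelta(hours=3), cycle))  # end bound is exclusive
```

With a closed window (`<=` at the end), the obs at exactly +3 hours would be counted too, which would account for the slightly higher obs counts on the GSI and low-level API side.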
@nicholasesposito nicholasesposito self-assigned this Oct 23, 2024
@nicholasesposito

It should be mentioned that there are definitely differences, but once most of the data with bad QM, too-large values, or missing values is filtered out by the plotting scripts, the outputs are all remarkably close.

In the UFO filters, the following will need to be done:

  • Set "missing" IALR values to 0
  • Filter out dateTime data at +3 hours (though the backend should take care of this)
  • Maybe get rid of missing data
  • Filter out data that is too large (I arbitrarily chose 20000 since the vast majority of data in any unit is below this, at least for conventional data)
  • Only keep obstypes 130, 131, 133, 135, 230, 231, 233, 235. 132/232 are dropsondes and 134/234 are TAMDAR, which are filtered out by gsi_ncdiag.py.
  • stationElevation and height can be somewhat "off" at times, but that is because all available data is being grabbed. When using only data with an affirmative UseFlag or with QM <= 3, the differences become very small.
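
As a rough illustration of the filters listed above, here is a NumPy sketch. It is not UFO filter YAML and not the plotting scripts themselves. The function name `apply_filters`, the use of NaN to mark missing values, and the choice to test the magnitude of the value against 20000 are all assumptions.

```python
import numpy as np

KEEP_OBSTYPES = [130, 131, 133, 135, 230, 231, 233, 235]
MAX_VALID = 20000.0  # arbitrary threshold quoted in the comment above

def apply_filters(obs_type, obs_value, ialr):
    """Hypothetical helper; assumes missing values are stored as NaN."""
    obs_type = np.asarray(obs_type)
    obs_value = np.asarray(obs_value, dtype=float)
    ialr = np.asarray(ialr, dtype=float)

    # "Missing" IALR becomes 0, mirroring what GSI does.
    ialr = np.where(np.isnan(ialr), 0.0, ialr)

    keep = (
        np.isin(obs_type, KEEP_OBSTYPES)      # drop 132/232 and 134/234
        & ~np.isnan(obs_value)                # drop missing data
        & (np.abs(obs_value) < MAX_VALID)     # drop too-large data
    )
    return obs_type[keep], obs_value[keep], ialr[keep]

types, values, rates = apply_filters(
    [130, 134, 231, 132, 235, 233],
    [250.0, 260.0, 1.0e11, 240.0, np.nan, 5.5],
    [np.nan, 1.0, 2.0, 3.0, 4.0, -1.5],
)
print(types, values, rates)
```

The QM and UseFlag screening mentioned in the last bullet is left out here; it would be one more boolean term in `keep`.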


nicholasesposito commented Oct 24, 2024

The following are exactly equivalent when using the above filters:

  • ObsValue/airTemperature
  • ObsValue/windNorthward
  • ObsValue/windEastward
  • ObsValue/specificHumidity

Small differences in:

  • MetaData/instantaneousAltitudeRate (21 datapoints of 120600; 0.001 m/s in mean, 0.003 m/s in standard deviation)
  • MetaData/stationElevation (188 datapoints of 114500; ~0.9 m in mean, ~0.75 m in standard deviation)
  • MetaData/height (188 datapoints of 114500; ~1.0 m in mean, ~0.25 m in standard deviation)
  • MetaData/pressure (188 datapoints of 114500; ~6 Pa in mean, ~1 Pa in standard deviation)
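
For reference, numbers of this kind can be produced with something like the sketch below. Whether the mean and standard deviation quoted above were taken over only the differing points or over all points is not stated, so this sketch assumes the differing points only, using absolute differences. The name `diff_stats` is hypothetical, and the two arrays are assumed to be already matched ob-for-ob.

```python
import numpy as np

def diff_stats(a, b):
    """Return (n_differing, n_total, mean_abs_diff, std_abs_diff).

    Assumption: the statistics are computed over the points that differ.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a - b
    differing = diff != 0
    n = int(differing.sum())
    if n == 0:
        return 0, a.size, 0.0, 0.0
    d = np.abs(diff[differing])
    return n, a.size, float(d.mean()), float(d.std())

# Toy data: two of four points differ, by 0.5 and 1.0.
print(diff_stats([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.5, 3.0]))
```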

NOTE:
Some updates may be needed in the future once it's decided whether height, stationElevation and pressure should be in MetaData or ObsValue. There is no stationPressure in the GSI files, but in the backend files both stationPressure and pressure are POB. None of these variables are in ObsValue for GSI and none are assimilated, so in my opinion they should stay in MetaData. Of course, that means we may have to change some of the scripts and source code, but it may be worth it for consistency's sake.


nicholasesposito commented Oct 29, 2024

stationPressure is now pressure. pressure has been moved to MetaData because it is not simulated.
ZOB as heightOfStation and ELV as stationElevation are now also MetaData.

GSI applies some longitude QC (for longitude == 180 or > 180) that should be in UFO.
GSI also applies some "missing POB" QC that should be placed in UFO.

MetaData/sequenceNumber as an array of 0s has been added, similar to GSI.


nicholasesposito commented Jan 2, 2025

Added dateTime and cycle_time to the Python scripts. dateTime is computed in the mapping for the backend unless a script is used, in which case the script re-computes it. cycle_time comes from the yaml and is used to compute dateTime. minDateTime is now calculated. All scripts have been edited to follow Emily's suggestions, input and output names are finalized, and the attributes are good. The scripts now use the original prepbufr file (gdas.t??z.prepbufr) and the subsets from the yaml to determine which data to grab, rather than splitting into a separate file. Deleted the yamls that use 'mpi0', since we don't want it in the output filename if MPI is not used.
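
A minimal sketch of how dateTime could be derived from cycle_time follows. The `YYYYMMDDHH` string format, the per-ob offset in hours (like the prepbufr DHR mnemonic), and seconds since 1970-01-01 UTC as the output unit are all assumptions, and the function names are hypothetical rather than taken from the scripts.

```python
from datetime import datetime, timezone

def cycle_to_epoch(cycle_time):
    """Convert an assumed 'YYYYMMDDHH' cycle string to epoch seconds (UTC)."""
    dt = datetime.strptime(cycle_time, "%Y%m%d%H").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())

def compute_datetime(cycle_time, offset_hours):
    """Add per-ob offsets (hours relative to the cycle) to the cycle time."""
    base = cycle_to_epoch(cycle_time)
    return [base + int(round(h * 3600)) for h in offset_hours]

date_times = compute_datetime("2024101712", [-3.0, -0.25, 2.5])
min_date_time = min(date_times)  # analogous to the minDateTime mentioned above
print(date_times, min_date_time)
```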

Added obstype to the mapping file because the data is needed later. Note for users of just the mapping: the values are still off (300-600 instead of 100-300). Check the Python script for how to update the ObsType values.

@nicholasesposito

For acft_profiles, the dateTime min is a large negative number for script2netcdf and script4backend. This doesn't matter, as those values are treated as missing and are then filled in because of the group_by.
