
Analog_practice


PRACTICE: Downscaling lead-month 2 System4 hindcast predictions of maximum summer temperature at a set of GSN stations in the Iberian Peninsula

Learning goals:

The goal of this practice is to get familiar with the classical data handling steps involved in typical downscaling exercises, in particular:

  • Loading data of various types from local and remote locations using the downscaleR and ecomsUDG.Raccess tools
  • Application of a perfect-prog downscaling method (analogs)
  • Manipulation of climate data classes in downscaleR: creation of multifields, pre-processing of predictors, EOF analysis, re-gridding...
  • Visualization of the downscaled forecast skill and basic assessment of the quality of the predictions

Aim of the practice

During the practice we will perform analog downscaling of the System4 seasonal forecast of summer (JJA) maximum surface temperature at six weather stations of the GSN network in Spain. NCEP reanalysis data will be used to train the model, which will then be tested on the System4 data. To this aim, we will access the ECOMS User Data Gateway to load both the predictors and the test data, while the predictand (weather station data) is a built-in dataset included in the downscaleR package. To undertake the practice, you will need to obtain some data from this paper.
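The examples below assume that the downscaleR package is already installed and loaded. A minimal setup sketch (the GitHub repository path shown in the comment is the usual SantanderMetGroup location, included here as an assumption):

# Install from GitHub if needed (repository path assumed):
# devtools::install_github("SantanderMetGroup/downscaleR")
library(downscaleR)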

Data needed and methods:

  • Predictands: Maximum daily surface temperature. GSN station data of Spanish stations (built-in data in the downscaleR library).
  • Predictors: Atmospheric pattern P5 in the reference paper (see Tables 1 and 2). Load data from the ECOMS-UDG. Use the first 10 principal components.
  • Spatial domain: Domain Z6 in the reference article (approximated)
  • Test data: All members from System4 seasonal forecast (15 members). Forecast lead month = 2 (i.e., April run)
  • Target season: Boreal summer (JJA)
  • Training period: 1991-2000
  • Test period: 2001-2010
  • Downscaling method: M1c (Table 3)
  • Validation: multimember skill assessment by means of a plot of terciles (type help("tercileValidation"))
  • Extra: If the practice is successfully accomplished and you still have some extra time, repeat the exercise using as predictand the gridded dataset WFDEI instead of the GSN stations, for the same spatial domain.

Practice steps:

1- Load all the GSN stations within the domain of analysis for the calibration period and the target season using the appropriate arguments of the loadStationData function. Explore the structure of the returned object and retrieve some key information, such as the station names, their codes, their altitudes... Also, plot the annual time series for all stations (Note: you don't need to replicate exactly the example plot below).

# Load station data
gsn <- file.path(find.package("downscaleR"), "datasets/observations/GSN_Iberia")
obs <- loadStationData(dataset = gsn, var = "tmax", lonLim = c(-11,5), latLim = c(35,46), season = 6:8, years = 1991:2000, tz = "GMT")
# Plotting of annual time series
yrs <- getYearsAsINDEX(obs)
plot.data <- sapply(1:ncol(obs$Data), function(x) {tapply(obs$Data[ ,x], INDEX = yrs, FUN = mean, na.rm = TRUE)})
colnames(plot.data)  <- obs$Metadata$station_id
plot(unique(yrs), plot.data[ ,1], ylim = c(floor(min(plot.data)) - 5, ceiling(max(plot.data))), ty = "b", ylab = "tasmax", xlab = "year")
for (i in 2:ncol(plot.data)) {
      lines(unique(yrs), plot.data[ ,i], col = i, ty = 'b')
}
legend("bottom", colnames(plot.data), col = 1:6, pch = 21, ncol = 2)
title("Mean maximum temperature summer (JJA)")

[Figure 1: annual time series of mean summer (JJA) maximum temperature at the GSN stations]
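Besides plotting, the step asks you to explore the structure of the returned object. A minimal exploratory sketch (station_id is used in the code above; the remaining Metadata fields, e.g. station names and altitudes, depend on the dataset, so check names(obs$Metadata)):

# Overview of the object structure
str(obs, max.level = 2)
# Station codes (used above to label the legend)
obs$Metadata$station_id
# Remaining metadata fields (names, altitudes, ...) vary with the dataset
names(obs$Metadata)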

2- Load the predictors from the ECOMS User Data Gateway, considering again the domain of analysis, the target season and the calibration period. Explore the returned objects and make a visualization of each field.

library(ecomsUDG.Raccess) # Load the ecomsUDG.Raccess package
loginECOMS_UDG("user", "password") # Login with your personal username and password
psl.ncep <- loadECOMS(dataset = "NCEP", var = "psl", lonLim = c(-11,5), latLim = c(35,46), season = 6:8, years = 1991:2000, time = "DD")
tas.ncep <- loadECOMS(dataset = "NCEP", var = "tas", lonLim = c(-11,5), latLim = c(35,46), season = 6:8, years = 1991:2000, time = "DD")
# Plotting fields:
par(mfrow = c(1,2))
plotMeanField(psl.ncep)
plotMeanField(tas.ncep)
par(mfrow = c(1,1))
# Note the different grids of temperature and pressure!

[Figure 2: mean fields of the NCEP sea-level pressure and surface temperature predictors]
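The grid mismatch pointed out in the last comment can be checked directly with getGrid (the same helper used for the re-gridding in the next step):

# Compare the native grids of both fields
getGrid(psl.ncep)
getGrid(tas.ncep)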

3- Join the loaded predictors to create a multifield. Note that their grids have different resolutions, so prior to the multifield creation you will need to perform a re-gridding...

psl.ncep2 <- interpGridData(psl.ncep, getGrid(tas.ncep))
mf <- makeMultiField(psl.ncep2, tas.ncep)
plotMeanField(mf)

[Figure 3: mean fields of the predictor multifield]

4- Perform the PCA on the calibration data (remember: retain just the first 10 PCs).

pred.pca <- prinComp(mf, n.eofs = 10)
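It may be worth inspecting what prinComp returns before moving on. A quick exploratory sketch (plotEOF is part of downscaleR, but its arguments may vary across versions, so check help("plotEOF"); the variable name "tas" is taken from the multifield):

# The object stores the PCs and EOFs of each variable in the multifield
str(pred.pca, max.level = 1)
# Spatial patterns of the leading EOFs of temperature
plotEOF(pred.pca, var = "tas")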

5- Load the test data. Make a visualization of the predictors and create a multifield using the same grid as the predictors.

psl.s4 <- loadECOMS(dataset = "System4_seasonal_15", var = "psl", lonLim = c(-11,5), latLim = c(35,46), season = 6:8, years = 2001:2010, time = "DD", leadMonth = 2)
tas.s4 <- loadECOMS(dataset = "System4_seasonal_15", var = "tas", lonLim = c(-11,5), latLim = c(35,46), season = 6:8, years = 2001:2010, time = "DD", leadMonth = 2)
# Example: visualization of a multimember
plotMeanField(psl.s4, multi.member = TRUE)

[Figure 4: multimember visualization of the System4 psl forecast]

# Regridding to match predictors grid
psl.s4.2 <- interpGridData(psl.s4, new.grid = getGrid(mf))
tas.s4.2 <- interpGridData(tas.s4, new.grid = getGrid(mf))
# Multifield creation
mf.test <- makeMultiField(psl.s4.2, tas.s4.2)
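As a quick sanity check, you can verify that the test multifield carries the member dimension in addition to time, lat and lon (the "dimensions" attribute is used in the same way in step 13; a "var" dimension may also be present for multifields, depending on the version):

# The member dimension should appear among the dimension names
attr(mf.test$Data, "dimensions")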

6- Perform the downscaling (remember: method M1c).

an <- analogs(obs = obs, pred = pred.pca, sim = mf.test, n.neigh = 15, sel.fun = "random")
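The downscaled object preserves the multimember structure of the test data. A quick look at its dimensions (the "dimensions" attribute follows the same convention as above):

# Dimensions of the downscaled predictions (one series per member and station)
attr(an$Data, "dimensions")
dim(an$Data)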

7- Load the validation data.

val <- loadStationData(dataset = gsn, var = "tmax", lonLim = c(-11,5), latLim = c(35,46), season = 6:8, years = 2001:2010, tz = "GMT")

8- Visualize the downscaled forecast skill using the tercileValidation function.

# This plot averages the results over all stations
tercileValidation(an, obs = val)
# This example enables the visualization of results for a particular station (e.g. Barcelona):
tercileValidation(an, obs = val, stationId = "SP000008181")
title("Barcelona")

[Figure 5: tercile validation plot for Barcelona]
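To inspect every station rather than just Barcelona, you can loop over the station codes retrieved in step 1 (a sketch reusing the stationId argument shown above):

# One tercile plot per station, titled with its code
for (id in val$Metadata$station_id) {
      tercileValidation(an, obs = val, stationId = id)
      title(id)
}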


Extended practice

The next steps are optional once the previous steps have been accomplished.

9- Load the gridded observational dataset WFDEI and visualize it.

gridded.obs <- loadECOMS(dataset = "WFDEI", var = "tasmax", lonLim = c(-11,5), latLim = c(35,46), season = 6:8, years = 1991:2000)
plotMeanField(gridded.obs)

[Figure 6: mean tasmax field of the WFDEI dataset]

10- Do the downscaling using the same predictors as before, and applying the method M1a of the reference paper (Table 3).

an.wfdei <- analogs(obs = gridded.obs, pred = pred.pca, sim = mf.test, n.neigh = 1)

11- Visualize the skill of the downscaled forecast.

# Load the gridded validation data for the test period (not loaded so far)
gridded.val <- loadECOMS(dataset = "WFDEI", var = "tasmax", lonLim = c(-11,5), latLim = c(35,46), season = 6:8, years = 2001:2010)
tercileValidation(an.wfdei, obs = gridded.val)

[Figure 7: tercile validation plot of the downscaled forecast]

12- Compare the skill of the downscaled forecast with the skill of the raw forecast. What can you say about the added value of downscaling in this example?

# First the raw forecast data of tasmax are loaded
S4.forecast.raw <- loadECOMS(dataset = "System4_seasonal_15", var = "tasmax", lonLim = c(-11,5), latLim = c(35,46), season = 6:8, years = 2001:2010, time = "DD", leadMonth = 2)
tercileValidation(S4.forecast.raw, gridded.val)

[Figure 8: tercile validation plot of the raw System4 forecast]

13- Calculate the mean bias of the downscaled output with regard to the validation data. Do it for each member separately and then plot a map of the mean multimember bias.

require(abind) # provides 'asub' for subsetting arrays along a given dimension
# Temporal mean of the observations at each grid point, used as the reference
# (a new name is used so as not to overwrite the station data object 'obs' from step 1)
attr(gridded.val$Data, "dimensions") # returns the following vector: "time" "lat" "lon"
obs.mean <- apply(gridded.val$Data, MARGIN = c(2,3), FUN = mean, na.rm = TRUE)
# These are the dimensions of the data array
dimNames <- attr(an.wfdei$Data, "dimensions") # returns the following vector: "member" "time" "lat" "lon"
# This is the index position of the dimension "member"
ens.dim.index <- grep("member", dimNames)
# Number of members:
n.mem <- dim(an.wfdei$Data)[ens.dim.index]
# This function computes the bias of each downscaled member w.r.t. the reference
bias.list <- lapply(1:n.mem, function(x) {
      # This step selects the data of member x
      mem.data <- asub(an.wfdei$Data, idx = x, dims = ens.dim.index)
      # This step calculates the temporal mean of the member at each grid point
      pred <- apply(mem.data, MARGIN = c(2,3), FUN = mean, na.rm = TRUE)
      return(pred - obs.mean)
})
names(bias.list) <- paste("Member", 1:n.mem, sep = "_")
# 'bias.list' holds one bias map per member; now average each map spatially
mean.bias <- sapply(bias.list, mean, na.rm = TRUE)
# Plot of mean bias by members
barplot(mean.bias, names.arg = names(bias.list), las = 2, main = "Mean bias of different members", ylab = "deg-C")

[Figure 9: barplot of the mean bias of each member]

# Now the multimember mean bias map is computed by binding all members along a new dimension
bias.array <- do.call("abind", c(bias.list, along = -1)) # dims: member, lat, lon
# ... and averaging over the member dimension at each grid point (lon/lat order for 'image.plot')
mean.spatial.bias <- apply(bias.array, MARGIN = c(3,2), FUN = mean)
require(fields) # used for 'image.plot'
image.plot(gridded.val$xyCoords$x, gridded.val$xyCoords$y, mean.spatial.bias, asp = 1, xlab = "lon", ylab = "lat")
world(add = TRUE)
title("Mean multimember hindcast bias")

[Figure 10: map of the mean multimember hindcast bias]
