
memory usage #15

Open
AlexandreWadoux opened this issue Apr 10, 2019 · 0 comments
It has happened several times that I ran into a memory usage problem with optimUSER, even though my user-defined objective function is very fast to compute and uses hardly any memory. I finally managed to make a reproducible example.

I define my objective function as the spatially averaged kriging prediction error variance.

See the example below:

library(sp)
library(raster)
library(gstat)
library(spsann)
data(meuse.grid, package = "sp")

# create the candidate locations based on a fine resolution grid
candi <- disaggregate(rasterFromXYZ(meuse.grid), fact = 3)
candi <- as.data.frame(candi, xy = TRUE, na.rm = TRUE)

# study area boundaries
data(meuse.area, package = 'sp')
meuse.area <- list(Polygons(list(Polygon(meuse.area)), "meuse.area"))
meuse.area <- SpatialPolygons(meuse.area)

# make a grid
coordinates(meuse.grid) <- ~x+y
gridded(meuse.grid) <- TRUE

# fit a geostatistical model using log(zinc) as variable of interest
data(meuse, package='sp')
coordinates(meuse) <- ~x+y
lzn.vgm <- variogram(log(zinc)~1, meuse)
lzn.fit <- fit.variogram(lzn.vgm, model = vgm(1, "Sph", 900, 1))

FUN <- function(points, meuse.grid, model) {
  
  # convert the matrix of candidate points to a data frame and assign an
  # arbitrary zinc value (required by gstat, not used to compute the error
  # variance)
  points <- as.data.frame(points)
  meuse <- points
  meuse$zinc <- 100
  
  # compute the kriging prediction error variance, using the variogram model
  # passed in via the `model` argument
  coordinates(meuse) <- ~x+y
  lzn.kriged <- krige(log(zinc)~1, meuse, meuse.grid, model = model)
  varpred <- lzn.kriged$var1.var
  
  # guard against possible NA; return the averaged prediction error variance
  if (is.na(mean(varpred))) return(Inf) else return(mean(varpred))
  
}
  
schedule <- scheduleSPSANN(chains = 100, initial.temperature = 10, 
                           temperature.decrease = 0.5, 
                           x.min = 0, y.min = 0, 
                           cellsize = 0)

set.seed(2001)
res <- optimUSER(points = 10, candi = candi[, c('x', 'y')], fun = FUN,
                 schedule = schedule, track = TRUE, plotit = TRUE,
                 boundary = meuse.area,
                 meuse.grid = meuse.grid, model = lzn.fit)

It returns:
Error: cannot allocate vector of size 5.8 Gb
Very surprising!

The problem disappears by either:

1. reducing the number of candidate locations, for example candi = meuse.grid, or
2. setting x.max and y.max explicitly. The documentation says that, by default, x.max and y.max are set to half the maximum distance in the x- and y-coordinates, respectively. In the example above that means x.max = (max(candi$x) - min(candi$x)) / 2 and y.max = (max(candi$y) - min(candi$y)) / 2, which gives x.max = 1553 and y.max = 2073. When I set x.max and y.max explicitly, it works without problems, without even the need to reduce the number of candidate locations.
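For reference, workaround 2 would look something like this (a sketch only; the exact x.max and y.max values depend on the extent of candi, and the other schedule settings are copied from the example above):

```r
# workaround 2: set x.max and y.max explicitly instead of relying on the
# defaults (half the maximum distance in the x- and y-coordinates)
x.max <- (max(candi$x) - min(candi$x)) / 2   # about 1553 for this candi
y.max <- (max(candi$y) - min(candi$y)) / 2   # about 2073 for this candi

schedule <- scheduleSPSANN(chains = 100, initial.temperature = 10,
                           temperature.decrease = 0.5,
                           x.min = 0, y.min = 0,
                           x.max = x.max, y.max = y.max,
                           cellsize = 0)
```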
