
Safe pulls: Implement pull by year instead of one shot #7

Open
davidclarance opened this issue Sep 1, 2019 · 2 comments
Labels
enhancement New feature or request

Comments

@davidclarance
Owner

There's a suggestion to make the default date range the entire period available. Implementing this in the current functions would substantially increase server load, which isn't healthy for the server or the package. We therefore need to break requests up by year, rather than pulling everything in one shot, and combine the results inside the function.
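A minimal sketch of the year-splitting idea in base R. The helper name `split_by_year` is hypothetical and not part of the package; it only shows how a date range could be chopped into calendar-year chunks before any requests are made:

```r
# Hypothetical helper: split an inclusive date range into calendar-year
# chunks, each of which can become one smaller server request.
split_by_year <- function(start_date, end_date) {
  start <- as.Date(start_date)
  end <- as.Date(end_date)
  years <- seq(as.integer(format(start, "%Y")), as.integer(format(end, "%Y")))
  lapply(years, function(y) {
    list(
      start = max(start, as.Date(sprintf("%d-01-01", y))),
      end = min(end, as.Date(sprintf("%d-12-31", y)))
    )
  })
}

# Example: 2007-06-15 to 2009-02-01 yields three chunks:
# 2007-06-15..2007-12-31, 2008-01-01..2008-12-31, 2009-01-01..2009-02-01.
```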

@davidclarance davidclarance added the enhancement New feature or request label Sep 1, 2019
@bluehill

bluehill commented Sep 2, 2019

Internal data breaks are a good idea, but the extract_species function currently stays within safe data-call parameters. The most data-intensive species call, all available records of the Cape Turtle Dove for South Africa, returns about 150,000 records, so the function is still well within the margins of good etiquette (<250,000). Reproducible example:
ctdove_raw_records <- extract_species(
  species_ids = 316,
  start_date = '2007-01-01',
  end_date = '2019-09-01',
  region_type = 'country',
  region_id = 'southafrica'
)

Happy coding :)

@davidclarance
Owner Author

Great point and thanks for the example. I think I'll still do it for three reasons:

  • The operative point is that this works now. We want to avoid accumulating technical debt, so if something is easy to implement, why not do it now, especially as the extract functions are the bedrock of all future functions.
  • Even though 250K is the limit, that doesn't mean multiple queries operating at 250K are healthy. Think of it as many users drawing on one shared CPU: you don't want everyone maxing out their limits.
  • The larger the query, the longer a rollback takes after a timeout or server error, and the longer competing queries are blocked.

I think the addition is simple, can be used across all the extract functions and provides a safety net.
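One way that safety net could look, again as a sketch in base R under my own assumptions: the wrapper name `pull_by_year`, the idea of passing the extract function in as an argument, and the assumption that every extract function returns a data frame are all mine, not the package's:

```r
# Hypothetical wrapper: call any extract function one calendar year at a
# time and row-bind the pieces, so no single request spans the full range.
pull_by_year <- function(extract_fn, start_date, end_date, ...) {
  start <- as.Date(start_date)
  end <- as.Date(end_date)
  years <- seq(as.integer(format(start, "%Y")), as.integer(format(end, "%Y")))
  results <- lapply(years, function(y) {
    extract_fn(
      ...,
      start_date = max(start, as.Date(sprintf("%d-01-01", y))),
      end_date = min(end, as.Date(sprintf("%d-12-31", y)))
    )
  })
  do.call(rbind, results)
}

# Usage, mirroring the reproducible example above:
# ctdove_raw_records <- pull_by_year(
#   extract_species,
#   start_date = '2007-01-01',
#   end_date = '2019-09-01',
#   species_ids = 316,
#   region_type = 'country',
#   region_id = 'southafrica'
# )
```

Because the wrapper only touches `start_date` and `end_date`, the same pattern could sit in front of all the extract functions.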
