forked from ropensci/rfishbase
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
203 lines (112 loc) · 11.5 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
---
output: github_document
---
# rfishbase <img src="man/figures/logo.svg" align="right" alt="" width="120" />
[![Build Status](https://travis-ci.org/ropensci/rfishbase.svg)](https://travis-ci.org/ropensci/rfishbase)
[![Build status](https://ci.appveyor.com/api/projects/status/decpqq5s57b7b0t6/branch/master?svg=true)](https://ci.appveyor.com/project/cboettig/rfishbase/branch/master)
[![cran checks](https://cranchecks.info/badges/worst/rfishbase)](https://cranchecks.info/pkgs/rfishbase)
[![Coverage status](https://codecov.io/gh/ropensci/rfishbase/branch/master/graph/badge.svg)](https://codecov.io/github/ropensci/rfishbase?branch=master)
[![Onboarding](https://badges.ropensci.org/137_status.svg)](https://github.com/ropensci/onboarding/issues/137)
[![CRAN status](https://www.r-pkg.org/badges/version/rfishbase)](https://cran.r-project.org/package=rfishbase)
[![Downloads](http://cranlogs.r-pkg.org/badges/grand-total/rfishbase)](https://github.com/metacran/cranlogs.app)
<details> <summary><strong>FishBase NEEDS YOUR HELP!</strong></summary>
Dear FishBase Users,
FishBase needs help and I am writing to you because you either have at one point or another requested data to be extracted from FishBase for your own research purposes or have contributed your own data to FishBase.
One of the FishBase funders has had to reduce its commitment and as a result, there is a US$200,000 gap in the FishBase 2018/2019 budget, which will result in forced unpaid leave of key staff with direct consequences for the constant updating and growth of FishBase, a resource on which many of us rely.
Many of us use FishBase regularly in our work given it provides important data on distribution, traits etc. Indeed, these data are so valuable that FishBase receives over 700,000 unique visits per month and underpins key scientific breakthroughs such as the Nature paper on rates of evolution [it’s slower in the tropics!] (see Nature (see https://www.nature.com/articles/d41586-018-05575-2 and https://www.facebook.com/FishBase/posts/1885134558216592).
Key about FishBase is that it is free to everyone in the world, regardless of whether their institutions can afford journal subscriptions.
FishBase co-founder Daniel Pauly once said “sending a bibliography is like providing cookbooks in a famine” and it has been the underpinning ethos of FishBase to make information available equally, regardless of where one works.
So for nearly 30 years, FishBase (www.fishbase.org), with its team of biologists and programmers has done just that, while constantly improving and expanding this valued resource.
But FishBase needs our help now. So, when you are next online on FishBase and see the donate request pop up, please donate at least $25. That’s one bottle of good VQA wine in Canada or half a carton of decent beer in Australia, 3 packs of organic Hess avocados from Loblaws or 6 latte grandes at Starbucks. If you drink more beer, use more avocado on your toast or are caffeine dependent, please consider a larger donation. IF EVERY MARINE RESEARCHER GETS ON BOARD, we can make a major contribution to FishBase. Imagine if you had to pay to access this type of information.
It’s time to pay it forward!
Thank you for your consideration and we all look forward to a flood of world-wide support to FishBase.
</details>
<br>
Welcome to `rfishbase 3.0`. This package is the third rewrite of the original `rfishbase` package described in [Boettiger et al. (2012)](http://www.carlboettiger.info/assets/files/pubs/10.1111/j.1095-8649.2012.03464.x.pdf).
`rfishbase` 3.0 queries pre-compressed tables from a static server and employs local caching (through memoization) to provide much greater performance and stability, particularly for dealing with large queries involving 10s of thousands of species. The user is never expected to deal with pagination or curl headers and timeouts.
We welcome any feedback, issues or questions that users may encounter through our issues tracker on GitHub: <https://github.com/ropensci/rfishbase/issues>
```{r include=FALSE}
knitr::opts_chunk$set(warning=FALSE, comment=NA)
```
## Installation
```{r message=FALSE, warning=FALSE, results="hide", eval=FALSE}
remotes::install_github("ropensci/rfishbase")
```
```{r message=FALSE, warning=FALSE, results="hide"}
library("rfishbase")
library("dplyr") # convenient but not required
```
## Getting started
[FishBase](http://fishbase.org) makes it relatively easy to look up a lot of information on most known species of fish. However, looking up a single bit of data, such as the estimated trophic level, for many different species becomes tedious very soon. This is a common reason for using `rfishbase`. As such, our first step is to assemble a good list of species we are interested in.
### Building a species list
Almost all functions in `rfishbase` take a list (character vector) of species scientific names, for example:
```{r}
fish <- c("Oreochromis niloticus", "Salmo trutta")
```
You can also read in a list of names from any existing data you are working with. When providing your own species list, you should always begin by validating the names. Taxonomy is a moving target, and this well help align the scientific names you are using with the names used by FishBase, and alert you to any potential issues:
```{r}
fish <- validate_names(c("Oreochromis niloticus", "Salmo trutta"))
```
Another typical use case is in wanting to collect information about all species in a particular taxonomic group, such as a Genus, Family or Order. The function `species_list` recognizes six taxonomic levels, and can help you generate a list of names of all species in a given group:
```{r}
fish <- species_list(Genus = "Labroides")
fish
```
`rfishbase` also recognizes common names. When a common name refers to multiple species, all matching species are returned:
```{r}
trout <- common_to_sci("trout")
trout
```
Note that there is no need to validate names coming from `common_to_sci` or `species_list`, as these will always return valid names.
### Getting data
With a species list in place, we are ready to query fishbase for data. Note that if you have a very long list of species, it is always a good idea to try out your intended functions with a subset of that list first to make sure everything is working.
The `species()` function returns a table containing much (but not all) of the information found on the summary or homepage for a species on [fishbase.org](http://fishbase.org). `rfishbase` functions always return [tidy](http://www.jstatsoft.org/v59/i10/paper) data tables: rows are observations (e.g. a species, individual samples from a species) and columns are variables (fields).
```{r}
species(trout$Species)
```
Most tables contain many fields. To avoid overly cluttering the screen, `rfishbase` displays tables as "tibbles" from the `dplyr` package. These act just like the familiar `data.frames` of base R except that they print to the screen in a more tidy fashion. Note that columns that cannot fit easily in the display are summarized below the table. This gives us an easy way to see what fields are available in a given table.
Most `rfishbase` functions will let the user subset these fields by listing them in the `fields` argument, for instance:
```{r}
dat <- species(trout$Species, fields=c("Species", "PriceCateg", "Vulnerability"))
dat
```
Alternatively, just subset the table using the standard column selection in base R (`[[`) or `dplyr::select`.
### FishBase Docs: Discovering data
Unfortunately identifying what fields come from which tables is often a challenge. Each summary page on fishbase.org includes a list of additional tables with more information about species ecology, diet, occurrences, and many other things. `rfishbase` provides functions that correspond to most of these tables.
Because `rfishbase` accesses the back end database, it does not always line up with the web display. Frequently `rfishbase` functions will return more information than is available on the web versions of the these tables. Some information found on the summary homepage for a species is not available from the `species` summary function, but must be extracted from a different table. For instance, the species `Resilience` information is not one of the fields in the `species` summary table, despite appearing on the species homepage of fishbase.org. To discover which table this information is in, we can use the special `rfishbase` function `list_fields`, which will list all tables with a field matching the query string:
```{r}
list_fields("Resilience")
```
This shows us that this information appears on the `stocks` table. We can then request this data from the stocks table:
```{r}
stocks(trout$Species, fields=c("Species", "Resilience", "StockDefs"))
```
## Version stability
`rfishbase` relies on periodic cache releases. The current database release is `17.07` (i.e. dating from July 2017). Set the version of FishBase you wish to access by setting the environmental variable:
```{r}
Sys.setenv(FISHBASE_VERSION="17.07")
```
Note that the same version number applies to both the `fishbase` and `sealifebase` data. Stay tuned for new releases.
## SeaLifeBase
SeaLifeBase.org is maintained by the same organization and largely parallels the database structure of Fishbase. As such, almost all `rfishbase` functions can instead be instructed to address the
We can begin by getting the taxa table for sealifebase:
```{r}
sealife <- load_taxa(server="sealifebase")
```
(Note: running `load_taxa()` at the beginning of any session, for either fishbase or sealifebase is a good way to "warm up" rfishbase by loading in taxonomic data it will need. This information is cached throughout your session and will make all subsequent commands run faster. But no worries if you skip this step, `rfishbase` will peform it for you on the first time it is needed, and will cache these results thereafter.)
Let's look at some Gastropods:
```{r}
sealife %>% filter(Class == "Gastropoda")
```
All other tables can also take an argument to `server`:
```{r}
species(server="sealifebase")
```
CAUTION: if switching between `fishbase` and `sealifebase` in a single R session, we strongly advise you always set `server` explicitly in your function calls. Otherwise you may confuse the caching system.
## Backwards compatibility
`rfishbase` 3.0 tries to maintain as much backwards compatibility as possible with rfishbase 2.0. However, there are cases in which the rfishbase 2.0 behavior was not desirable -- such as throwing errors when a introducing simple `NA`s for missing data would be more appropriate, or returning vectors where `data.frame`s were needed to include all the context.
- Argument names have been retained where possible to maximize backwards compatibility. Using previous arguments that are no longer relevant (such as `limit` for the maximum number of records) will not now introduce errors, but nor will they have any effect (they are simply consumed by the `...`). There are no longer any limits in return sizes.
- You can still specify server using the rfishbase `2.x` format of providing a URL argument for server, e.g. `"http://fishbase.ropensci.org/sealifebase"` or `Sys.setenv(FISHBASE_API = "http://fishbase.ropensci.org/sealifebase")`, or simply `Sys.setenv("FISHBASE_API" = "sealifebase")` if you prefer. Also recall that environmental variables can always be set in an `.Renviron` file.
-----------
Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.
[![ropensci_footer](http://ropensci.org/public_images/github_footer.png)](http://ropensci.org)