---
output:
  github_document:
    html_preview: false
# used by altdoc
default-image-extension: ''
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# polars
<!-- badges: start -->
[![R-multiverse status](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fcommunity.r-multiverse.org%2Fapi%2Fpackages%2Fpolars&query=%24.Version&label=r-multiverse)](https://community.r-multiverse.org/polars)
[![R-universe status badge](https://rpolars.r-universe.dev/badges/polars)](https://rpolars.r-universe.dev)
[![CRAN status](https://www.r-pkg.org/badges/version/polars)](https://CRAN.R-project.org/package=polars)
[![Dev R-CMD-check](https://github.com/pola-rs/r-polars/actions/workflows/check.yaml/badge.svg)](https://github.com/pola-rs/r-polars/actions/workflows/check.yaml)
[![Docs dev version](https://img.shields.io/badge/docs-dev-blue.svg)](https://pola-rs.github.io/r-polars)
<!-- badges: end -->
The **polars** package for R gives users access to [a
lightning-fast](https://duckdblabs.github.io/db-benchmark/) Data Frame library written in
Rust. [Polars](https://www.pola.rs/)' embarrassingly parallel execution,
cache-efficient algorithms and expressive API make it perfect for efficient data
wrangling, data pipelines, snappy APIs, and much more besides. Polars also supports
a "streaming mode" for out-of-memory operations, which allows users to analyze
datasets many times larger than RAM.
Examples of common operations:
- read CSV, JSON, Parquet, and other file formats;
- filter rows and select columns;
- modify and create new columns;
- group by and aggregate;
- reshape data;
- join and concatenate different datasets;
- sort data;
- work with dates and times;
- handle missing values;
- use the lazy execution engine for maximum performance and
  memory-efficient operations.
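
As a rough illustration of a few of these operations, the sketch below filters rows, creates a column, and aggregates by group, first eagerly and then through the lazy engine. The data and the column names (`species`, `mass_kg`, `mass_g`, `mean_mass_g`) are made up for this example, and because the API changes between versions the exact calls may need adjusting.

```r
library(polars)

# Hypothetical toy data, for illustration only
df = pl$DataFrame(
  species = c("a", "a", "b", "b", "c"),
  mass_kg = c(1.5, 2.5, 3.0, NA, 4.0)
)

# Eager API: each step runs as soon as it is called
eager = df$filter(
  pl$col("mass_kg") > 1 # rows with a missing mass do not pass the filter
)$with_columns(
  (pl$col("mass_kg") * 1000)$alias("mass_g") # create a new column
)$group_by("species")$agg(
  pl$col("mass_g")$mean()$alias("mean_mass_g") # one row per group
)$sort("species")

# Lazy API: the same query is only planned until `$collect()` is called,
# which gives the engine a chance to optimize the whole pipeline
lazy_result = df$lazy()$filter(
  pl$col("mass_kg") > 1
)$with_columns(
  (pl$col("mass_kg") * 1000)$alias("mass_g")
)$group_by("species")$agg(
  pl$col("mass_g")$mean()$alias("mean_mass_g")
)$sort("species")$collect()
```

Both `eager` and `lazy_result` should hold one row per species. The final `$sort("species")` is there because this sketch does not rely on any particular group order coming out of the aggregation.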
Note that this package is evolving rapidly, and each new version may include
breaking changes. Be sure to check the [changelog](https://pola-rs.github.io/r-polars/NEWS.html)
when updating `polars`.
## Install
The recommended way to install this package is via R-multiverse:
```r
Sys.setenv(NOT_CRAN = "true")
install.packages("polars", repos = "https://community.r-multiverse.org")
```
[The "Install" vignette](https://pola-rs.github.io/r-polars/vignettes/install.html) (`vignette("install", "polars")`)
gives more details on this method and describes other ways to install the package.
## Quickstart example
To avoid conflicts with other packages and base R function names, **polars**'s
top-level functions are hosted in the `pl` namespace and are accessible via the
`pl$` prefix.
This means that `polars` queries written in Python and in R are very similar.
For example, rewriting the Python example from <https://github.com/pola-rs/polars> in R:
```{r}
library(polars)
df = pl$DataFrame(
  A = 1:5,
  fruits = c("banana", "banana", "apple", "apple", "banana"),
  B = 5:1,
  cars = c("beetle", "audi", "beetle", "beetle", "beetle")
)

# embarrassingly parallel execution & very expressive query language
df$sort("fruits")$select(
  "fruits",
  "cars",
  pl$lit("fruits")$alias("literal_string_fruits"),
  pl$col("B")$filter(pl$col("cars") == "beetle")$sum(),
  pl$col("A")$filter(pl$col("B") > 2)$sum()$over("cars")$alias("sum_A_by_cars"),
  pl$col("A")$sum()$over("fruits")$alias("sum_A_by_fruits"),
  pl$col("A")$reverse()$over("fruits")$alias("rev_A_by_fruits"),
  pl$col("A")$sort_by("B")$over("fruits")$alias("sort_A_by_B_by_fruits")
)
```
The [Get Started vignette](https://pola-rs.github.io/r-polars/vignettes/polars.html) (`vignette("polars")`) provides
a more detailed introduction to **polars**.
## Extensions
While one can use **polars** as-is, other packages build on it to
provide different syntaxes:
- [polarssql](https://rpolars.github.io/r-polarssql/) provides a **polars**
backend for [DBI](https://dbi.r-dbi.org/) and [dbplyr](https://dbplyr.tidyverse.org/).
- [tidypolars](https://tidypolars.etiennebacher.com/) allows one to
use the [tidyverse](https://www.tidyverse.org/) syntax while using the power of **polars**.
## Getting help
The online documentation can be found at <https://pola-rs.github.io/r-polars/>.
If you encounter a bug, please file an issue with a minimal reproducible example on
[GitHub](https://github.com/pola-rs/r-polars/issues).
Consider joining our [Discord](https://discord.com/invite/4UfP5cfBE7) subchannel for
additional help and discussion.