Releases: pola-rs/r-polars
v0.11.0
BREAKING CHANGES DUE TO RUST-POLARS UPDATE
- rust-polars is updated to 0.35.0 (2023-11-17) (#515)
  - changes in `$write_csv()` and `sink_csv()`: `has_header` is renamed `include_header` and there's a new argument `include_bom`.
  - `pl$cov()` gains a `ddof` argument.
  - `$cumsum()`, `$cumprod()`, `$cummin()`, `$cummax()`, `$cumcount()` are renamed `$cum_sum()`, `$cum_prod()`, `$cum_min()`, `$cum_max()`, `$cum_count()`.
  - `take()` and `take_every()` are renamed `$gather()` and `gather_every()`.
  - `$shift()` and `$shift_and_fill()` now accept Expr as input.
  - when `reverse = TRUE`, `$arg_sort()` now places null values in the first positions.
  - Removed argument `ambiguous` in `$dt$truncate()` and `$dt$round()`.
  - `$str$concat()` gains an argument `ignore_nulls`.
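As a rough illustration of the renames above, the following sketch calls two of the new cumulative-method names on a toy data frame. The column names and values are invented for the example, and it assumes polars 0.11.0 or later is installed.

```r
library(polars)

# Toy data; the column name `x` is only for illustration.
df = pl$DataFrame(x = c(1, 2, 3, 4))

# `$cum_sum()` and `$cum_max()` replace the old `$cumsum()` and `$cummax()`.
out = as.data.frame(
  df$select(
    pl$col("x")$cum_sum()$alias("running_sum"),
    pl$col("x")$cum_max()$alias("running_max")
  )
)
print(out)
```

Code written against the old names would need the same one-to-one substitution for the other renamed methods in the list.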
Breaking changes and deprecations
- The rowwise computation when several columns are passed to `pl$min()`, `pl$max()`, and `pl$sum()` is deprecated and will be removed in 0.12.0. Passing several columns to these functions will now compute the min/max/sum in each column separately. Use `pl$min_horizontal()`, `pl$max_horizontal()`, and `pl$sum_horizontal()` instead for rowwise computation (#508).
- `$is_not()` is deprecated and will be removed in 0.12.0. Use `$not()` instead (#511, #531).
- `$is_first()` is deprecated and will be removed in 0.12.0. Use `$is_first_distinct()` instead (#531).
- In `pl$concat()`, the argument `to_supertypes` is removed. Use the suffix `"_relaxed"` in the `how` argument to cast columns to their shared supertypes (#523).
- All duration methods (`days()`, `hours()`, `minutes()`, `seconds()`, `milliseconds()`, `microseconds()`, `nanoseconds()`) are renamed, for example from `$dt$days()` to `$dt$total_days()`. The old usage is deprecated and will be removed in 0.12.0.
- The DataFrame method `$as_data_frame()` is removed in favor of `$to_data_frame()` (#533).
- The GroupBy methods `$as_data_frame()` and `$to_data_frame()`, which were used to convert GroupBy objects to R data frames, are removed. Use the `$ungroup()` method and the `as.data.frame()` function instead (#533).
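The difference between rowwise and columnwise aggregation described above can be sketched as follows. The data and the output column names (`row_sum`, `row_max`) are made up for illustration, and the example assumes polars 0.11.0 or later.

```r
library(polars)

# Toy data; column names are illustrative only.
df = pl$DataFrame(a = c(1, 5, 3), b = c(4, 2, 6))

# Rowwise: one value per row, combining columns `a` and `b`.
rowwise = as.data.frame(
  df$select(
    pl$sum_horizontal(pl$col("a"), pl$col("b"))$alias("row_sum"),
    pl$max_horizontal(pl$col("a"), pl$col("b"))$alias("row_max")
  )
)

# Columnwise: one value per column, via the expression method.
colwise = as.data.frame(
  df$select(pl$all()$sum())
)

print(rowwise)
print(colwise)
```

Here the rowwise result has three rows, while the columnwise result has a single row with one sum per column.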
What's changed
- Fix the installation issue on Ubuntu 20.04 (#528, thanks @brownag).
- New methods `$write_json()` and `$write_ndjson()` for DataFrame (#502).
- Removed argument `name` in `pl$date_range()`, which was deprecated for a while (#503).
- New private method `.pr$DataFrame$drop_all_in_place(df)` to drop `DataFrame` in-place, to release memory without invoking gc(). However, if there are other strong references to any of the underlying Series or arrow arrays, that memory will specifically not be released. This method is aimed at r-polars extensions, and will be kept stable as much as possible (#504).
- New functions `pl$min_horizontal()`, `pl$max_horizontal()`, `pl$sum_horizontal()`, `pl$all_horizontal()`, `pl$any_horizontal()` (#508).
- New generic functions `as_polars_df()` and `as_polars_lf()` to create polars DataFrames and LazyFrames (#519).
- New method `$ungroup()` for `GroupBy` and `LazyGroupBy` (#522).
- New method `$rolling()` to apply an Expr over a rolling window based on date/datetime/numeric indices (#470).
- New methods `$name$to_lowercase()` and `$name$to_uppercase()` to transform variable names (#529).
- New method `$is_last_distinct()` (#531).
- New methods of the Expressions class, `$floor_div()`, `$mod()`, `$eq_missing()` and `$neq_missing()`. The base R operators `%/%` and `%%` for Expressions are now translated to `$floor_div()` and `$mod()` (#523).
  - Note that `$mod()` of Polars is different from the R operator `%%`, which is not guaranteed `x == (x %% y) + y * (x %/% y)`. Please check the upstream issue pola-rs/polars#10570.
- The extract function (`[`) for polars objects now behaves more like for base R objects (#543).
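For reference, the identity mentioned in the note on `$mod()` can be checked in base R alone. The sketch below uses only base R operators on made-up integer inputs with mixed signs; it does not call polars and makes no claim about what `$mod()` returns for the same inputs.

```r
# Base R only: `%%` and `%/%` are defined so that
# x == (x %% y) + y * (x %/% y) holds, including for mixed signs.
x = c(7L, -7L, 7L, -7L)
y = c(3L, 3L, -3L, -3L)

r_mod = x %% y   # remainder, takes the sign of the divisor in R
r_div = x %/% y  # floored integer division

identity_holds = all(x == r_mod + y * r_div)
print(identity_holds)
```

When porting code that relies on this identity to Expressions, it may be worth comparing `%%` and `%/%` results on a few negative inputs against the base R values.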
lib-v0.35.0
docs(news): tweak news (#539)
v0.10.1
What's changed
- The argument `quote_style` in `$write_csv()` and `$sink_csv()` can now take the value `"never"` (#483).
- `pl$DataFrame()` now errors if the variables specified in `schema` do not exist in the data (#486).
- S3 methods for base R functions are well documented (#494).
- Fixed a bug where `pl$SQLContext()$register()` failed when the package was not loaded (#496).
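A minimal sketch of the new `quote_style = "never"` value, assuming polars 0.10.1 or later and that the output path is the first argument of `$write_csv()`. The data and file location are invented for the example.

```r
library(polars)

# One field deliberately contains the separator to show what "never" implies.
df = pl$DataFrame(id = 1:2, label = c("a,b", "c"))

path = tempfile(fileext = ".csv")
df$write_csv(path, quote_style = "never")

# With "never", no field is quoted, even when it contains a comma,
# so the resulting file may not round-trip cleanly.
lines = readLines(path)
print(lines)
```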
lib-v0.34.1
docs(NEWS): tweak news about recent updates (#498)
v0.10.0
BREAKING CHANGES DUE TO RUST-POLARS UPDATE
- rust-polars is updated to 2023-10-25 unreleased version (#442)
  - New subnamespace `"name"` that contains methods `$prefix()`, `$suffix()`, `keep()` (renamed from `keep_name()`) and `map()` (renamed from `map_alias()`).
  - `$dt$round()` gains an argument `ambiguous`.
  - The following methods now accept an `Expr` as input: `$top_k()`, `$bottom_k()`, `$list$join()`, `$str$strip_chars()`, `$str$strip_chars_start()`, `$str$strip_chars_end()`, `$str$split_exact()`.
  - The following methods were renamed:
    - `$str$n_chars()` -> `$str$len_chars()`
    - `$str$lengths()` -> `$str$len_bytes()`
    - `$str$ljust()` -> `$str$pad_end()`
    - `$str$rjust()` -> `$str$pad_start()`
  - `$concat()` with `how = "diagonal"` now accepts an argument `to_supertypes` to automatically convert concatenated columns to the same type.
  - `pl$enable_string_cache()` doesn't take any argument anymore. The string cache can now be disabled with `pl$disable_string_cache()`.
  - `$scan_parquet()` gains an argument `hive_partitioning`.
  - `$meta$tree_format()` has a better formatted output.
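The two length-related renames above differ in what they count, which a small sketch may make clearer. The strings are made up, and the example assumes polars 0.10.0 or later and UTF-8 encoded input.

```r
library(polars)

# "caf\u00e9" has 4 characters but 5 bytes in UTF-8.
df = pl$DataFrame(s = c("polars", "caf\u00e9"))

out = as.data.frame(
  df$select(
    pl$col("s")$str$len_chars()$alias("n_chars"),  # formerly $str$n_chars()
    pl$col("s")$str$len_bytes()$alias("n_bytes")   # formerly $str$lengths()
  )
)
print(out)
```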
Breaking changes
- `$scan_csv()` and `$read_csv()` now match more closely the Python-Polars API (#455):
  - `sep` is renamed `separator`, `overwrite_dtypes` is renamed `dtypes`, `parse_dates` is renamed `try_parse_dates`.
  - new arguments `rechunk`, `eol_char`, `raise_if_empty`, `truncate_ragged_lines`
  - `path` can now be a vector of characters indicating several paths to CSV files. This only works if all CSV files have the same schema.
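A short sketch of the renamed `separator` argument, assuming polars 0.10.0 or later and that the file path is the first argument of `pl$read_csv()`. The file contents are invented for the example.

```r
library(polars)

# Write a small semicolon-separated file to a temporary location.
path = tempfile(fileext = ".csv")
writeLines(c("id;label", "1;a", "2;b"), path)

# `separator` replaces the former `sep` argument.
df = pl$read_csv(path, separator = ";")
out = as.data.frame(df)
print(out)
```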
What's changed
- New class `RPolarsSQLContext` and its methods to perform SQL queries on DataFrame-like objects. To use this feature, the Rust library needs to be built with full features (#457).
- New methods `$peak_min()` and `$peak_max()` to find local minima and maxima in an Expr (#462).
- New methods `$read_ndjson()` and `$scan_ndjson()` (#471).
- New method `$with_context()` for `LazyFrame` to have access to columns from other Data/LazyFrames during the computation.
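As an illustration of `$peak_max()`, the sketch below flags interior local maxima in a toy numeric column. The data and the output column name `is_peak_max` are made up, and the example assumes polars 0.10.0 or later; `$peak_min()` is used the same way.

```r
library(polars)

# Toy series with two interior local maxima (3 and 5).
df = pl$DataFrame(x = c(1, 3, 2, 5, 4))

out = as.data.frame(
  df$select(
    pl$col("x"),
    pl$col("x")$peak_max()$alias("is_peak_max")
  )
)
print(out)
```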
lib-v0.34.0
chore: fix typo and revert Rd file incorrect change (#476)
v0.9.0
BREAKING CHANGES DUE TO RUST-POLARS UPDATE
- rust-polars is updated to 0.33.2 (#417)
  - In all date-time related methods, the argument `use_earliest` is replaced by `ambiguous`.
  - In `$sample()` and `$shuffle()`, the argument `fixed_seed` is removed.
  - In `$value_counts()`, the arguments `multithreaded` and `sort` (sometimes called `sorted`) have been swapped and renamed `sort` and `parallel`.
  - `$str$count_match()` gains a `literal` argument.
  - `$arg_min()` doesn't consider `NA` as the minimum anymore (this was already the behavior of `$min()`).
  - Using `$is_in()` with `NA` on both sides now returns `NA` and not `TRUE` anymore.
  - Argument `pattern` of `$str$count_matches()` can now use expressions.
  - Needs Rust toolchain `nightly-2023-08-26` to build with full features.
- Rename R functions to match rust-polars
Breaking changes
- Remove some deprecated methods.
- Setting and getting polars options is now made with `pl$options`, `pl$set_options()` and `pl$reset_options()` (#384).
What's changed
- Bump supported R version to 4.2 or later (#435).
- `pl$concat()` now also supports `Series`, `Expr` and `LazyFrame` (#407).
- New method `$unnest()` for `LazyFrame` (#397).
- New method `$sample()` for `DataFrame` (#399).
- New method `$meta$tree_format()` to display an `Expr` as a tree (#401).
- New argument `schema` in `pl$DataFrame()` and `pl$LazyFrame()` to override the automatic type detection (#385).
- Fix bug when calling R from polars via e.g. `$map()` where query would not complete in one edge case (#409).
- New method `$cat$get_categories()` to list unique values of categorical variables (#412).
- New methods `$fold()` and `$reduce()` to apply an R function rowwise (#403).
- New function `pl$raw_list` and class `rpolars_raw_list`, a list of R raw vectors where missing values are encoded as `NULL`, to aid conversion to polars binary Series. Supports back-and-forth conversion from polars binary literal and Series to R raw (#417).
- New method `$write_csv()` for `DataFrame` (#414).
- New method `$sink_csv()` for `LazyFrame` (#432).
- New method `$dt$time()` to extract the time from a `datetime` variable (#428).
- Method `$profile()` gains optimization arguments and plot-related arguments (#429).
- New method `pl$read_parquet()` that is a shortcut for `pl$scan_parquet()$collect()` (#434).
- Rename `$str$str_explode()` to `$str$explode()` (#436).
- New method `$transpose()` for `DataFrame` (#440).
- New argument `eager` of `LazyFrame$set_optimization_toggle()` (#439).
- `{polars}` can now be installed with "R source package with Rust library binary", by a mechanism copied from the prqlr package.

  ```r
  Sys.setenv(NOT_CRAN = "true")
  install.packages("polars", repos = "https://rpolars.r-universe.dev")
  ```

  The URL and SHA256 hash of the available binaries are recorded in `tools/lib-sums.tsv`. (#435, #448, #450, #451)
lib-v0.33.0
ci: add release-lib workflow to upload binary libraries to GitHub rel…
v0.8.1
- New string method `to_titlecase()` (#371).
- Although stated in news for PR (#334), `strip = true` was not actually set for the "release-optimized" compilation profile. Now it is, but the binary sizes seem unchanged (#377).
- New vignette on best practices to improve `polars` performance (#188).
- Subnamespace name "arr" as in `<Expr>$arr$` & `<Series>$arr$` is deprecated in favor of "list". The "arr" subnamespace will be removed in polars 0.9.0 (#375).
v0.8.0
What's Changed
- docs: can't install from CRAN now by @eitsupi in #333
- Revert "Revert "Implement `with_row_count` for `DataFrame` and `LazyFrame`"" by @eitsupi in #332
- fix local build CMD by @philipp-baumann in #331
- Enable `$explode()` for `Data/LazyFrame` by @etiennebacher in #314
- build: always NOT_CRAN=true by @eitsupi in #340
- Add method `$clone()` for `LazyFrame` by @etiennebacher in #347
- Deprecate `with_column()` by @etiennebacher in #313
- Add robj clone by @sorhawell in #348
- Rework Expr operators and errors by @sorhawell in #346
- Implement `cov`, `corr`, `rolling_cov`, `rolling_corr` by @Sicheng-Pan in #351
- Implement `concat_str()` by @etiennebacher in #349
- Fix `$describe()` bug for column names containing a `:` by @sorhawell in #342
- fix use polars without library(polars) by @sorhawell in #355
- Moderate ** claims in docs + rework title by @sorhawell in #353
- Implement lazyframe profiling and optimization toggles by @Sicheng-Pan in #323
- Background execution and R process pool by @Sicheng-Pan in #311
- fix named_exprs not allowed and any related error by @sorhawell in #357
- Minor rework for `$explode()` by @sorhawell in #358
- fix collect(collect_in_background=TRUE) by @sorhawell in #359
- Minor error by @etiennebacher in #362
- Implement sink stream for LazyFrame by @Sicheng-Pan in #343
- Add rust polars version info by @sorhawell in #363
- Add several methods to use string cache by @sorhawell in #361
- bump rust-polars to 0.32.0 by @sorhawell in #334
- `<LazyFrame>$fetch()` by @sorhawell in #319
- refactor lit, col, DataFrame, Series by @sorhawell in #369
- support extendr_polars by @sorhawell in #326
New Contributors
- @philipp-baumann made their first contribution in #331
Full Changelog: v0.7.0...v0.8.0