Changelog
Source: NEWS.md
tidypolars (development version)
New features
- Add support for several functions:
  - from package base: all(), any(), diff(), ISOdatetime(), length(), rev(), unique().
  - from package dplyr: consecutive_id(), min_rank(), na_if(), n_distinct(), nth().
  - from package lubridate: make_datetime().
  - from package stringr: str_dup(), str_split(), str_split_i().
Bug fixes
- Local variables in custom functions could not be used in tidypolars functions (reported in a blog post by Art Steinmetz). This is now fixed.
- across() now works when .cols contains only one variable and .fns contains only one function.
- In across(), the .cols argument now takes into account variables created in the same mutate() or summarize() call before across().

      as_polars_df(mtcars) |>
        head(n = 3) |>
        mutate(
          foo = 1,
          across(.cols = contains("oo"), \(x) x - 1)
        )

      shape: (3, 12)
      ┌──────┬─────┬───────┬───────┬───┬─────┬──────┬──────┬─────┐
      │ mpg  ┆ cyl ┆ disp  ┆ hp    ┆ … ┆ am  ┆ gear ┆ carb ┆ foo │
      │ ---  ┆ --- ┆ ---   ┆ ---   ┆   ┆ --- ┆ ---  ┆ ---  ┆ --- │
      │ f64  ┆ f64 ┆ f64   ┆ f64   ┆   ┆ f64 ┆ f64  ┆ f64  ┆ f64 │
      ╞══════╪═════╪═══════╪═══════╪═══╪═════╪══════╪══════╪═════╡
      │ 21.0 ┆ 6.0 ┆ 160.0 ┆ 110.0 ┆ … ┆ 1.0 ┆ 4.0  ┆ 4.0  ┆ 0.0 │
      │ 21.0 ┆ 6.0 ┆ 160.0 ┆ 110.0 ┆ … ┆ 1.0 ┆ 4.0  ┆ 4.0  ┆ 0.0 │
      │ 22.8 ┆ 4.0 ┆ 108.0 ┆ 93.0  ┆ … ┆ 1.0 ┆ 4.0  ┆ 1.0  ┆ 0.0 │
      └──────┴─────┴───────┴───────┴───┴─────┴──────┴──────┴─────┘

  Note that the where() function is not supported here. For example:

      as_polars_df(mtcars) |>
        mutate(
          foo = 1,
          across(.cols = where(is.numeric), \(x) x - 1)
        )

  will not return 0 for the variable foo. A warning is emitted about this behavior.
- Better handling of negative values in c() when called in mutate() and summarize().
tidypolars 0.7.0
tidypolars requires polars >= 0.16.0.
Breaking changes and deprecations
- as_polars() is now removed. It was deprecated in 0.6.0. Use as_polars_df() or as_polars_lf() instead.
- to_r() is now removed. It was deprecated in 0.6.0. Use as.data.frame() or as_tibble() instead.
- For consistency with dplyr, the behavior of collect() will change in 0.8.0: it will perform the lazy query and convert the result to a standard data.frame. For now, collect() only throws a warning about this future change. It is recommended to use compute() to only perform the query and get a Polars DataFrame as output (#101).
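The recommendation above can be sketched as follows. This is only an illustration, assuming the polars, dplyr and tidypolars packages are installed; the object names (lazy_query, polars_result, r_result) are made up for the example.

```r
library(polars)
library(dplyr)
library(tidypolars)

# Build a lazy query: nothing is computed at this point.
lazy_query <- as_polars_lf(mtcars) |>
  filter(cyl == 4) |>
  select(mpg, cyl, hp)

# compute() runs the query and keeps the result as a Polars DataFrame.
polars_result <- compute(lazy_query)

# Convert explicitly when a standard R data.frame is needed.
r_result <- as.data.frame(polars_result)
```

Writing the conversion explicitly keeps the code independent of the planned change in what collect() returns.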
New features
- Several improvements and changes for pivot_wider() (#95):
  - names_from can now take several variables;
  - add support for id_cols and names_glue;
  - default value of names_sep is now "_", for consistency with tidyr;
  - fix documentation as pivot_wider() doesn’t work on LazyFrame.
- Add support for stringr::regex(). Note that only the argument ignore_case is supported for now (#97).
- Add support for several lubridate functions: dweeks(), ddays(), dhours(), dminutes(), dseconds(), dmilliseconds(), make_date() (#107).
- When a polars function called internally fails, the original error message is now displayed.
- Add support for group_split() (for DataFrame only).
- Add support for argument relationship in left_join(), right_join(), full_join() and inner_join() (#106).
tidypolars 0.6.0
tidypolars requires polars >= 0.15.0.
Breaking changes and deprecations
- as_polars() is deprecated and will be removed in 0.7.0. Use as_polars_lf() or as_polars_df() instead.
- as_polars() doesn’t have an argument with_string_cache anymore. When set to TRUE, this enabled the string cache globally, which could lead to undesirable side effects.
- to_r() is deprecated and will be removed in 0.7.0. Use as.data.frame() or as_tibble() instead. This used to silently return a LazyFrame if the input was a LazyFrame. It now automatically collects the LazyFrame (#88).
New features
- Add support for group_vars() and group_keys() (#81).
- Experimental support of rowwise(). For now, this is limited to a few functions: mean(), median(), min(), max(), sum(), all(), any(). rowwise() and group_by() cannot be used at the same time (#40).
- All functions that return a polars Data/LazyFrame now add the class "tidypolars" to the output (#86).
- Support which.min(), which.max(), dplyr::n().
- Support .data[[ and .env[[ in addition to .data$ and .env$. Better error messages when the objects specified in .data or .env don’t exist.
Bug fixes
- pull() now errors when var is of length > 1.
tidypolars 0.5.0
tidypolars requires polars >= 0.12.0.
Breaking changes
- across() now errors if the argument .cols is not provided (either named or unnamed). This behavior was deprecated in dplyr 1.1.0.
- It is no longer possible to use ! in arrange() to sort in decreasing order, for compatibility with dplyr::arrange(). Use - or desc() instead.
New features
- summarize() now works on ungrouped data and returns a 1-row output.
- It is now possible to use desc(x1) in arrange() to sort in decreasing order of x1 (this is equivalent to -x1).
- Add support for argument names_prefix in pivot_longer().
- Add support for arguments names_prefix and names_sep in pivot_wider().
- Add support for tidyr::uncount().
- All *_join() functions now work when by is a specification created by dplyr::join_by(). Notice that this is limited to equality joins for now.
- You can now use the “embrace” operator {{ }} to pass unquoted column names (among other things) as arguments of custom functions. See the “Programming with dplyr” vignette for some examples.
- bind_cols_polars() now works with two LazyFrames, but not more.
- Add support for argument .name_repair in bind_cols_polars() (#74).
- Support for .env$ and .data$ pronouns in expressions of filter(), mutate() and summarize().
- Support named vector in the argument pattern of str_replace_all(), where names are patterns and values are replacements.
- Using %in% for factor variables doesn’t require enabling the string cache anymore.
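As an illustration of two of the items above, the sketch below combines the .data$ and .env$ pronouns in filter() with desc() in arrange(). It assumes the polars, dplyr and tidypolars packages are installed; the variable names threshold and efficient_cars are hypothetical.

```r
library(polars)
library(dplyr)
library(tidypolars)

# A local R variable, referenced explicitly through the .env pronoun.
threshold <- 25

efficient_cars <- as_polars_df(mtcars) |>
  # .data$mpg refers to the column, .env$threshold to the R variable.
  filter(.data$mpg > .env$threshold) |>
  # desc(mpg) sorts in decreasing order, like -mpg.
  arrange(desc(mpg))

# Convert to a standard data.frame for inspection.
efficient_cars_r <- as.data.frame(efficient_cars)
```

The pronouns remove any ambiguity when a column and a local variable could share the same name.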
Bug fixes
- summarize() no longer errors when across(everything(), ...) is used with .by.
- All *_join() functions no longer error when a named vector is provided in the argument by.
- Expressions with values only are not named “literal” anymore.
tidypolars 0.4.0
tidypolars requires polars >= 0.11.0.
Breaking changes
- It is no longer possible to pass a list in rename().
New features
- The argument with_string_cache in as_polars() now enables the string cache globally if set to TRUE (#54).
- Better error message in filter() when comparing factors to strings while the string cache is disabled.
- Basic support for strptime(). It is possible to use strptime(*, strict = FALSE) to not error when the parsing of some characters fails.
- New argument .by in filter(), mutate(), and summarize(), and new argument by in the slice_*() functions. This makes it possible to perform operations on groups without using group_by() and ungroup(). See the dplyr vignette for more information (#59).
- rename() now accepts unquoted names for both old and new names.
- Support fixed regexes in str_detect() (using fixed()) and in grepl() (using fixed = TRUE).
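The .by argument mentioned above can be sketched as follows, next to the equivalent group_by() form. This is a minimal example, assuming the polars, dplyr and tidypolars packages are installed; the names with_by and with_group_by are illustrative.

```r
library(polars)
library(dplyr)
library(tidypolars)

# Per-operation grouping: the output is not grouped afterwards.
with_by <- as_polars_df(mtcars) |>
  summarize(mean_mpg = mean(mpg), .by = cyl)

# Same computation with explicit grouping.
with_group_by <- as_polars_df(mtcars) |>
  group_by(cyl) |>
  summarize(mean_mpg = mean(mpg))

# Both return one row per distinct value of cyl.
n_by <- nrow(as.data.frame(with_by))
n_group_by <- nrow(as.data.frame(with_group_by))
```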
Bug fixes
- Improve robustness of sequential expressions in mutate() and summarize() (i.e. expressions that should be run one after the other because they depend on variables created in the same call) (#58).
- relocate() now works correctly when .after = last_col().
- All functions that work on grouped data now correctly restore the groups structure (#62).
Misc
- Error messages coming from mutate(), summarize(), and filter() now give the right function call.
- Faster tidy selection (#61).
tidypolars 0.3.0
tidypolars requires polars >= 0.10.0.
Breaking changes
- All functions starting with pl_ have been removed in favor of the S3 methods. For example, pl_distinct() doesn’t exist anymore, so the only way to use it is to load dplyr and to use distinct() on a Polars DataFrame or LazyFrame. This is to avoid confusion about compatibility with dplyr and tidyr. See #49 for a more detailed explanation.
- pl_bind_rows() and pl_bind_cols() are renamed to bind_rows_polars() and bind_cols_polars() respectively. This is because bind_rows() and bind_cols() are not S3 methods (this might change in future versions of dplyr).
New features
- New function duplicated_rows() that is the opposite of distinct() (#50).
- New argument .id in bind_rows_polars().
- bind_rows_polars() can now bind Data/LazyFrames that don’t have the same schema. Columns will be upcast to common types if necessary. Unknown columns will be filled with NA.
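The schema-tolerant binding described above could look like the sketch below. It assumes the polars and tidypolars packages are installed; the data and the identifier column name "source" are invented for the example.

```r
library(polars)
library(tidypolars)

# Two DataFrames that share column x but not y and z.
df1 <- pl$DataFrame(x = c("a", "b"), y = 1:2)
df2 <- pl$DataFrame(x = "c", z = 3)

# Columns missing from one input are filled with NA;
# .id adds a column identifying the input each row came from.
combined <- bind_rows_polars(df1, df2, .id = "source")

combined_r <- as.data.frame(combined)
```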
Bug fixes
- complete() now works correctly on grouped data.
tidypolars 0.2.0
tidypolars requires polars >= 0.9.0.
New features
- Rename pl_fetch() to fetch().
- New functions supported: describe(), sink_csv(), slice_sample().
- New argument fill in pl_complete().
- Support stringr::str_to_title() and tools::toTitleCase().
- Support stringr::fixed() to use literal strings.
- Support replacements with captured groups like \\1 in stringr::str_replace() and stringr::str_replace_all().
Bug fixes
- sink_parquet() didn’t use the user inputs (apart from the path).
tidypolars 0.1.0
New features
- Support as.numeric(), as.character(), as.logical(), grepl(), and paste() in expressions in pl_filter(), pl_mutate() and pl_summarize().
- Support sink_parquet() (#38).
- Support for additional stringr functions: str_detect(), str_extract_all(), str_pad(), str_squish(), str_trim(), word() (some arguments or corner cases are not supported yet).
- Add all optimization parameters in collect().