Skip to contents

tidypolars (development version)

Breaking changes

  • As announced in tidypolars 0.7.0, the behavior of collect() has changed. It now returns a standard R data.frame and not a Polars DataFrame anymore. Replace collect() by compute() (with the same arguments) to keep the old behavior.

New features

Bug fixes

  • Local variables in custom functions could not be used in tidypolars functions (reported in a blog post of Art Steinmetz). This is now fixed.

  • across() now works when .cols contains only one variable and .fns contains only one function.

  • In across(), the .cols argument now takes into account variables created in the same mutate() or summarize() call before across().

    as_polars_df(mtcars) |> 
      head(n = 3) |> 
      mutate(
        foo = 1, 
        across(.cols = contains("oo"), \(x) x - 1)
      )
    
    shape: (3, 12)
    ┌──────┬─────┬───────┬───────┬───┬─────┬──────┬──────┬─────┐
    │ mpg  ┆ cyl ┆ disp  ┆ hp    ┆ … ┆ am  ┆ gear ┆ carb ┆ foo │
    ------------   ┆   ┆ ------------
    │ f64  ┆ f64 ┆ f64   ┆ f64   ┆   ┆ f64 ┆ f64  ┆ f64  ┆ f64 │
    ╞══════╪═════╪═══════╪═══════╪═══╪═════╪══════╪══════╪═════╡
    21.06.0160.0110.0 ┆ … ┆ 1.04.04.00.0
    21.06.0160.0110.0 ┆ … ┆ 1.04.04.00.0
    22.84.0108.093.0  ┆ … ┆ 1.04.01.00.0
    └──────┴─────┴───────┴───────┴───┴─────┴──────┴──────┴─────┘

    Note that the where() function is not supported here. For example:

    
    as_polars_df(mtcars) |> 
      mutate(
        foo = 1, 
        across(.cols = where(is.numeric), \(x) x - 1)
      )

    will not return 0 for the variable foo. A warning is emitted about this behavior.

  • Better handling of negative values in c() when called in mutate() and summarize().

tidypolars 0.7.0

tidypolars requires polars >= 0.16.0.

Breaking changes and deprecations

  • as_polars() is now removed. It was deprecated in 0.6.0. Use as_polars_df() or as_polars_lf() instead.

  • to_r() is now removed. It was deprecated in 0.6.0. Use as.data.frame() or as_tibble() instead.

  • For consistency with dplyr, the behavior of collect() will change in 0.8.0 as it will perform the lazy query and convert the result to a standard data.frame. For now, collect() only throws a warning about this future change. It is recommended to use compute() to only perform the query and get a Polars DataFrame as output (#101).

New features

  • Several improvements and changes for pivot_wider() (#95):

    • names_from can now takes several variables;
    • add support for id_cols and names_glue;
    • default value of names_sep now is _, for consistency with tidyr;
    • fix documentation as pivot_wider() doesn’t work on LazyFrame.
  • Add support for stringr::regex(). Note that only the argument ignore_case is supported for now (#97).

  • Add support for several lubridate functions: dweeks(), ddays(), dhours(), dminutes(), dseconds(), dmilliseconds(), make_date() (#107).

  • When a polars function called internally fails, the original error message is now displayed.

  • Add support for group_split() (for DataFrame only).

  • Add support for argument relationship in left_join(), right_join(), full_join() and inner_join() (#106).

tidypolars 0.6.0

tidypolars requires polars >= 0.15.0.

Breaking changes and deprecations

  • as_polars() is deprecated and will be removed in 0.7.0. Use as_polars_lf() or as_polars_df() instead.

  • as_polars() doesn’t have an argument with_string_cache anymore. When set to TRUE, this enabled the string cache globally, which could lead to undesirable side effects.

  • to_r() is deprecated and will be removed in 0.7.0. Use as.data.frame() or as_tibble() instead. This used to silently return a LazyFrame if the input was LazyFrame. It now automatically collects the LazyFrame (#88).

  • pull() nows automatically collects input LazyFrame (#89).

New features

Bug fixes

  • pull() now errors when var is of length > 1.

tidypolars 0.5.0

tidypolars requires polars >= 0.12.0.

Breaking changes

  • across() now errors if the argument .cols is not provided (either named or unnamed). This behavior was deprecated in dplyr 1.1.0.

  • It is no longer possible to use ! in arrange() to sort by decreasing order, for compatibility with dplyr::arrange(). Use - or desc() instead.

New features

  • summarize() now works on ungrouped data and returns a 1-row output.

  • It is now possible to use desc(x1) in arrange() to sort in decreasing order of x1 (this is equivalent to -x1).

  • Add support for argument names_prefix in pivot_longer().

  • Add support for arguments names_prefix and names_sep in pivot_wider().

  • Add support for tidyr::uncount().

  • All *_join() functions now work when by is a specification created by dplyr::join_by(). Notice that this is limited to equality joins for now.

  • You can now use the “embrace” operator {{ }} to pass unquoted column names (among other things) as arguments of custom functions. See the “Programming with dplyr” vignette for some examples.

  • bind_cols_polars() now works with two LazyFrames, but not more.

  • Add support for argument .name_repair in bind_cols_polars() (#74).

  • Support for .env$ and .data$ pronouns in expressions of filter(), mutate() and summarize().

  • Support named vector in the argument pattern of str_replace_all(), where names are patterns and values are replacements.

  • Using %in% for factor variables doesn’t require enabling the string cache anymore.

Bug fixes

  • summarize() no longer errors when across(everything(), ...) is used with .by.

  • All *_join() functions no longer error when a named vector is provided in the argument by.

  • Expressions with values only are not named “literal” anymore.

Misc

  • Simplify the procedure to support new functions.

tidypolars 0.4.0

tidypolars requires polars >= 0.11.0.

Breaking changes

  • It is no longer possible to pass a list in rename().

New features

  • The argument with_string_cache in as_polars() now enables the string cache globally if set to TRUE (#54).

  • Better error message in filter() when comparing factors to strings while the string cache is disabled.

  • Basic support for strptime(). It is possible to use strptime(*, strict = FALSE) to not error when the parsing of some characters fails.

  • New argument .by in filter(), mutate(), and summarize(), and new argument by in the slice_*() functions. This allows to do operations on groups without using group_by() and ungroup(). See the dplyr vignette for more information (#59).

  • rename() now accepts unquoted names both old and new names.

  • Support fixed regexes in str_detect() (using fixed()) and in grepl() (using fixed = TRUE).

Bug fixes

  • Improve robustness of sequential expressions in mutate() and summarize() (i.e expressions that should be run one after the other because they depend on variables created in the same call) (#58).

  • relocate() now works correctly when .after = last_col().

  • All functions that work on grouped data now correctly restore the groups structure (#62).

Misc

tidypolars 0.3.0

tidypolars requires polars >= 0.10.0.

Breaking changes

  • All functions starting with pl_ have been removed to the benefit of the S3 methods. For example, pl_distinct() doesn’t exist anymore so the only way to use it is to load dplyr and to use distinct() on a Polars DataFrame or LazyFrame. This is to avoid confusion about compatibility with dplyr and tidyr. See #49 for a more detailed explanation.

  • pl_bind_rows() and pl_bind_cols() are renamed bind_rows_polars() and bind_cols_polars() respectively. This is because bind_rows() and bind_cols() are not S3 methods (this might change in future versions of dplyr).

New features

Bug fixes

Misc

  • relig_income and fish_encounters are not reexported anymore since tidyr is now imported.

tidypolars 0.2.0

tidypolars requires polars >= 0.9.0.

New features

Bug fixes

Misc

  • Clearer error message when an expression contains <pkg>::. This is not supported for now but could potentially be implemented later.

  • pl_colnames() is no longer exported.

tidypolars 0.1.0

New features

Bug fixes

  • Fix pl_mutate() and pl_summarize() when expressions use some variables previously created or modified (#10, #37).

  • Fix bug in pl_filter() when passing a vector in the RHS of %in%.

Misc

  • Improve the backend to translate R expressions into Polars expressions. This also led to a complete rewriting of the vignette “R and Polars expressions” (#27).

  • Error messages should now report the correct function call.

  • Improve CI coverage (#35).

tidypolars 0.0.1

  • First Github release.