Fills missing values in selected columns using the next or previous entry. This is useful for the common output format in which values are not repeated and are recorded only when they change.
Usage
# S3 method for class 'RPolarsDataFrame'
fill(data, ..., .direction = c("down", "up", "downup", "updown"))
Arguments
- data
A Polars Data/LazyFrame
- ...
Any expression accepted by dplyr::select(): variable names, column numbers, select helpers, etc.
- .direction
Direction in which to fill missing values. Either "down" (the default), "up", "downup" (i.e. first down and then up) or "updown" (first up and then down).
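The four directions can be illustrated on a plain vector. The following is a minimal base R sketch of the semantics only, not the way the method is implemented on Polars objects; the helpers fill_down(), fill_up() and fill_vec() are hypothetical names introduced for this illustration.

```r
# Hypothetical helpers for illustration; not part of tidypolars or polars.

# Carry the last non-missing value forward.
fill_down <- function(v) {
  for (i in seq_along(v)[-1]) {
    if (is.na(v[i])) v[i] <- v[i - 1]
  }
  v
}

# Carry the next non-missing value backward (a "down" fill on the reversed vector).
fill_up <- function(v) rev(fill_down(rev(v)))

# Dispatch on the same four direction names as the .direction argument.
fill_vec <- function(v, direction = c("down", "up", "downup", "updown")) {
  direction <- match.arg(direction)
  switch(direction,
    down = fill_down(v),
    up = fill_up(v),
    downup = fill_up(fill_down(v)),
    updown = fill_down(fill_up(v))
  )
}

x <- c(NA, 1, NA, 2, NA)
fill_vec(x, "down")   # NA  1  1  2  2
fill_vec(x, "up")     #  1  1  2  2 NA
fill_vec(x, "downup") #  1  1  1  2  2
fill_vec(x, "updown") #  1  1  2  2  2
```

In this sketch "downup" and "updown" differ only for interior gaps: the first pass decides which neighbour fills them, and the second pass only touches the leading or trailing values that the first pass could not reach.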
Details
With grouped Data/LazyFrames, fill() will be applied within each group, meaning that it won't fill across group boundaries.
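The group-boundary behaviour can be sketched in base R on ordinary vectors. This is only an illustration of the semantics under the assumption that each group is filled independently; fill_down() is a hypothetical helper, and base R's ave() stands in for the grouping.

```r
# Hypothetical helper for illustration; not part of tidypolars or polars.
# Carries the last non-missing value forward within a single vector.
fill_down <- function(v) {
  for (i in seq_along(v)[-1]) {
    if (is.na(v[i])) v[i] <- v[i - 1]
  }
  v
}

grp <- rep(c("A", "B"), each = 3)
x <- c(1, NA, NA, NA, 2, NA)

# Ungrouped: the last value of group A leaks into the first row of group B.
fill_down(x)
# 1 1 1 1 2 2

# Grouped: ave() applies fill_down() to each group separately, so the
# leading NA in group B has nothing to inherit and stays missing.
ave(x, grp, FUN = fill_down)
# 1 1 1 NA 2 2
```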
Examples
pl_test <- polars::pl$DataFrame(x = c(NA, 1), y = c(2, NA))
fill(pl_test, everything(), .direction = "down")
#> shape: (2, 2)
#> ┌──────┬─────┐
#> │ x ┆ y │
#> │ --- ┆ --- │
#> │ f64 ┆ f64 │
#> ╞══════╪═════╡
#> │ null ┆ 2.0 │
#> │ 1.0 ┆ 2.0 │
#> └──────┴─────┘
fill(pl_test, everything(), .direction = "up")
#> shape: (2, 2)
#> ┌─────┬──────┐
#> │ x ┆ y │
#> │ --- ┆ --- │
#> │ f64 ┆ f64 │
#> ╞═════╪══════╡
#> │ 1.0 ┆ 2.0 │
#> │ 1.0 ┆ null │
#> └─────┴──────┘
# with grouped data, it doesn't use values from other groups
pl_grouped <- polars::pl$DataFrame(
  grp = rep(c("A", "B"), each = 3),
  x = c(1, NA, NA, NA, 2, NA),
  y = c(3, NA, 4, NA, 3, 1)
) |>
  group_by(grp)
fill(pl_grouped, x, y, .direction = "down")
#> shape: (6, 3)
#> ┌─────┬──────┬──────┐
#> │ grp ┆ x ┆ y │
#> │ --- ┆ --- ┆ --- │
#> │ str ┆ f64 ┆ f64 │
#> ╞═════╪══════╪══════╡
#> │ A ┆ 1.0 ┆ 3.0 │
#> │ A ┆ 1.0 ┆ 3.0 │
#> │ A ┆ 1.0 ┆ 4.0 │
#> │ B ┆ null ┆ null │
#> │ B ┆ 2.0 ┆ 3.0 │
#> │ B ┆ 2.0 ┆ 1.0 │
#> └─────┴──────┴──────┘
#> Groups [2]: grp
#> Maintain order: FALSE