Fetch is a way to collect only the first n
rows of a LazyFrame. It is
mainly used to test that a query runs as expected on a subset of the data
before using collect()
on the full query. Note that fetching n
rows
doesn't mean that the output will actually contain n
rows; see the
'Details' section for more information.
Usage
fetch(
.data,
n_rows = 500,
type_coercion = TRUE,
predicate_pushdown = TRUE,
projection_pushdown = TRUE,
simplify_expression = TRUE,
slice_pushdown = TRUE,
comm_subplan_elim = TRUE,
comm_subexpr_elim = TRUE,
cluster_with_columns = TRUE,
no_optimization = FALSE,
streaming = FALSE
)
Arguments
- .data
A Polars LazyFrame
- n_rows
Number of rows to fetch.
- type_coercion
Coerce types such that operations succeed and run on minimal required memory (default is TRUE).
- predicate_pushdown
Apply filters as early as possible at scan level (default is TRUE).
- projection_pushdown
Select only the columns that are needed at the scan level (default is TRUE).
- simplify_expression
Various optimizations, such as constant folding and replacing expensive operations with faster alternatives (default is TRUE).
- slice_pushdown
Only load the required slice from the scan level. Don't materialize sliced outputs (default is TRUE).
- comm_subplan_elim
Cache branching subplans that occur on self-joins or unions (default is TRUE).
- comm_subexpr_elim
Cache common subexpressions (default is TRUE).
- cluster_with_columns
Combine sequential independent calls to $with_columns().
- no_optimization
Sets the following optimizations to FALSE: predicate_pushdown, projection_pushdown, slice_pushdown, simplify_expression. Default is FALSE.
- streaming
Run parts of the query in a streaming fashion (this is in an alpha state). Default is FALSE.
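The optimization arguments can be toggled individually or all at once. The following is a minimal sketch, assuming dplyr and tidypolars are attached; the query and the object names (query, with_opt, without_opt, no_pred) are made up for illustration. No output is shown because the rows returned by fetch() may depend on which optimizations are active.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidypolars)

dat_lazy <- polars::pl$DataFrame(iris)$lazy()

# Illustrative query: one filter and one column selection
query <- dat_lazy |>
  filter(Sepal.Length > 5) |>
  select(Sepal.Length, Species)

# Default call: every optimization listed above keeps its default value
with_opt <- fetch(query, n_rows = 20)

# Switch off the optimizations listed under no_optimization in one go
without_opt <- fetch(query, n_rows = 20, no_optimization = TRUE)

# Switch off a single optimization and keep the others
no_pred <- fetch(query, n_rows = 20, predicate_pushdown = FALSE)
```

Comparing these results on a small n_rows is one way to check whether a given optimization affects what a query returns.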
Details
The parameter n_rows
indicates how many rows from the LazyFrame should be
used at the beginning of the query, but it doesn't guarantee that n_rows
will
be returned. For example, if the query contains a filter or a join
with other datasets, then the final number of rows can be lower than n_rows
.
On the other hand, appending some rows during the query can lead to an output
that has more rows than n_rows
.
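The case where rows are appended can be sketched as follows. This assumes that bind_rows_polars() is available in the installed tidypolars version and accepts LazyFrames; the object name appended is made up for illustration. The exact number of rows depends on how the underlying polars version applies the row limit, so no output is shown.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidypolars)

dat_lazy <- polars::pl$DataFrame(iris)$lazy()

# Stack the LazyFrame on top of itself, then fetch with n_rows = 30.
# If the row limit applies to each input separately, the result can
# contain more than 30 rows.
appended <- bind_rows_polars(dat_lazy, dat_lazy) |>
  fetch(n_rows = 30)

nrow(appended)
```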
See also
collect()
for applying a lazy query on the full data.
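A typical workflow is to preview a query with fetch() and then run it in full with collect(). The sketch below assumes dplyr and tidypolars are attached; the query and the object names (query, preview, full) are hypothetical.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidypolars)

dat_lazy <- polars::pl$DataFrame(iris)$lazy()

# Build the lazy query once; nothing is computed at this point
query <- dat_lazy |>
  filter(Sepal.Length > 5) |>
  mutate(ratio = Sepal.Length / Sepal.Width)

# Cheap check on a small subset: does the query run, and do the
# columns look as expected?
preview <- fetch(query, n_rows = 10)

# Once the preview looks right, run the same query on the full data
full <- collect(query)
```

Keeping the query in its own object means the previewed query and the collected query are guaranteed to be the same.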
Examples
dat_lazy <- polars::pl$DataFrame(iris)$lazy()
# this will return 30 rows
fetch(dat_lazy, 30)
#> shape: (30, 5)
#> ┌──────────────┬─────────────┬──────────────┬─────────────┬─────────┐
#> │ Sepal.Length ┆ Sepal.Width ┆ Petal.Length ┆ Petal.Width ┆ Species │
#> │ --- ┆ --- ┆ --- ┆ --- ┆ --- │
#> │ f64 ┆ f64 ┆ f64 ┆ f64 ┆ cat │
#> ╞══════════════╪═════════════╪══════════════╪═════════════╪═════════╡
#> │ 5.1 ┆ 3.5 ┆ 1.4 ┆ 0.2 ┆ setosa │
#> │ 4.9 ┆ 3.0 ┆ 1.4 ┆ 0.2 ┆ setosa │
#> │ 4.7 ┆ 3.2 ┆ 1.3 ┆ 0.2 ┆ setosa │
#> │ 4.6 ┆ 3.1 ┆ 1.5 ┆ 0.2 ┆ setosa │
#> │ 5.0 ┆ 3.6 ┆ 1.4 ┆ 0.2 ┆ setosa │
#> │ … ┆ … ┆ … ┆ … ┆ … │
#> │ 5.0 ┆ 3.0 ┆ 1.6 ┆ 0.2 ┆ setosa │
#> │ 5.0 ┆ 3.4 ┆ 1.6 ┆ 0.4 ┆ setosa │
#> │ 5.2 ┆ 3.5 ┆ 1.5 ┆ 0.2 ┆ setosa │
#> │ 5.2 ┆ 3.4 ┆ 1.4 ┆ 0.2 ┆ setosa │
#> │ 4.7 ┆ 3.2 ┆ 1.6 ┆ 0.2 ┆ setosa │
#> └──────────────┴─────────────┴──────────────┴─────────────┴─────────┘
# this will return fewer than 30 rows because fewer than 30 rows match
# this filter in the whole dataset
dat_lazy |>
filter(Sepal.Length > 7.0) |>
fetch(30)
#> shape: (12, 5)
#> ┌──────────────┬─────────────┬──────────────┬─────────────┬───────────┐
#> │ Sepal.Length ┆ Sepal.Width ┆ Petal.Length ┆ Petal.Width ┆ Species │
#> │ --- ┆ --- ┆ --- ┆ --- ┆ --- │
#> │ f64 ┆ f64 ┆ f64 ┆ f64 ┆ cat │
#> ╞══════════════╪═════════════╪══════════════╪═════════════╪═══════════╡
#> │ 7.1 ┆ 3.0 ┆ 5.9 ┆ 2.1 ┆ virginica │
#> │ 7.6 ┆ 3.0 ┆ 6.6 ┆ 2.1 ┆ virginica │
#> │ 7.3 ┆ 2.9 ┆ 6.3 ┆ 1.8 ┆ virginica │
#> │ 7.2 ┆ 3.6 ┆ 6.1 ┆ 2.5 ┆ virginica │
#> │ 7.7 ┆ 3.8 ┆ 6.7 ┆ 2.2 ┆ virginica │
#> │ … ┆ … ┆ … ┆ … ┆ … │
#> │ 7.2 ┆ 3.2 ┆ 6.0 ┆ 1.8 ┆ virginica │
#> │ 7.2 ┆ 3.0 ┆ 5.8 ┆ 1.6 ┆ virginica │
#> │ 7.4 ┆ 2.8 ┆ 6.1 ┆ 1.9 ┆ virginica │
#> │ 7.9 ┆ 3.8 ┆ 6.4 ┆ 2.0 ┆ virginica │
#> │ 7.7 ┆ 3.0 ┆ 6.1 ┆ 2.3 ┆ virginica │
#> └──────────────┴─────────────┴──────────────┴─────────────┴───────────┘