read_ndjson_polars()
imports the data as a Polars DataFrame.
scan_ndjson_polars()
imports the data as a Polars LazyFrame.
Usage
read_ndjson_polars(
  source,
  ...,
  infer_schema_length = 100,
  batch_size = NULL,
  n_rows = NULL,
  low_memory = FALSE,
  rechunk = FALSE,
  row_index_name = NULL,
  row_index_offset = 0,
  ignore_errors = FALSE,
  reuse_downloaded
)
scan_ndjson_polars(
  source,
  ...,
  infer_schema_length = 100,
  batch_size = NULL,
  n_rows = NULL,
  low_memory = FALSE,
  rechunk = FALSE,
  row_index_name = NULL,
  row_index_offset = 0,
  ignore_errors = FALSE,
  reuse_downloaded
)
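The difference between the two functions can be illustrated with a minimal sketch. It assumes the tidypolars, polars, and dplyr packages are installed; the temporary file, its contents, and the variable names are made up for illustration and are not part of the package.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidypolars)

# Write a tiny NDJSON file (one JSON object per line) to a temporary path.
path <- tempfile(fileext = ".ndjson")
writeLines(
  c(
    '{"id": 1, "value": 1.5}',
    '{"id": 2, "value": 2.5}',
    '{"id": 3, "value": 3.5}'
  ),
  path
)

# Eager: the whole file is parsed immediately into a Polars DataFrame.
df <- read_ndjson_polars(path)

# Stop after the first two rows and add a row index column named "row".
first_two <- read_ndjson_polars(path, n_rows = 2, row_index_name = "row")

# Lazy: nothing is read until the query is executed with compute().
lf <- scan_ndjson_polars(path)
big_values <- lf |>
  filter(value > 2) |>
  compute()
```

The lazy variant is usually the better fit when only part of a large file is needed, because the filter is part of the query before any data is read.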
Arguments
- source
Path(s) to a file or directory. To authenticate when scanning cloud locations, see the storage_options parameter.
- ...
These dots are for future extensions and must be empty.
- infer_schema_length
The maximum number of rows to scan for schema inference. If NULL, the full data may be scanned (this is slow). Set infer_schema = FALSE to read all columns as pl$String.
- batch_size
Number of rows to read in each batch.
- n_rows
Stop reading from the source after n_rows rows have been read.
- low_memory
Reduce memory pressure at the expense of performance.
- rechunk
Reallocate to contiguous memory when all chunks/files are parsed.
- row_index_name
If not NULL, this will insert a row index column with the given name.
- row_index_offset
Offset to start the row index column (only used if the name is set by row_index_name).
- ignore_errors
Keep reading the file even if some lines yield errors. You can also use infer_schema = FALSE to read all columns as strings (pl$String) and check which values might cause an issue.
- reuse_downloaded