read_ndjson_polars() imports the data as a Polars DataFrame. scan_ndjson_polars() imports the data as a Polars LazyFrame.
Usage
read_ndjson_polars(
  source,
  ...,
  infer_schema_length = 100,
  batch_size = NULL,
  n_rows = NULL,
  low_memory = FALSE,
  rechunk = FALSE,
  row_index_name = NULL,
  row_index_offset = 0,
  reuse_downloaded = TRUE,
  ignore_errors = FALSE
)

scan_ndjson_polars(
  source,
  ...,
  infer_schema_length = 100,
  batch_size = NULL,
  n_rows = NULL,
  low_memory = FALSE,
  rechunk = FALSE,
  row_index_name = NULL,
  row_index_offset = 0,
  reuse_downloaded = TRUE,
  ignore_errors = FALSE
)
Arguments
- source
Path to a file or URL. Multiple paths can be provided as long as all the NDJSON files share the same schema. Providing several URLs is not supported.
- ...
Ignored.
- infer_schema_length
Maximum number of rows to read to infer the column types. If set to 0, all columns will be read as UTF-8. If NULL, a full table scan will be done (slow).
- batch_size
Number of rows that will be processed per thread.
- n_rows
Maximum number of rows to read.
- low_memory
Reduce memory usage (at the cost of performance).
- rechunk
Reallocate to contiguous memory when all chunks / files are parsed.
- row_index_name
If not NULL, this will insert a row index column with the given name into the DataFrame.
- row_index_offset
Offset to start the row index column (only used if the name is set). See the example after this list.
- reuse_downloaded
If TRUE (the default) and a URL was provided, cache the downloaded files in the session for easy reuse.
- ignore_errors
Keep reading the file even if some lines yield errors. You can also use infer_schema_length = 0 to read all columns as UTF-8 and check which values might cause an issue (see the example after this list).
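A minimal sketch of the row index arguments (hypothetical file path, same assumed functions as above):

# Insert a row index column named "row_id", starting at 1 instead of 0
df <- read_ndjson_polars(
  "data.ndjson",
  row_index_name = "row_id",
  row_index_offset = 1
)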
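A sketch of the diagnostic workflow suggested for ignore_errors (the file path is a placeholder):

# First pass: skip lines that fail to parse so the import does not abort
df <- read_ndjson_polars("messy.ndjson", ignore_errors = TRUE)

# Second pass: with infer_schema_length = 0 every column is read as
# UTF-8, which makes it easier to spot the values that break the schema
raw <- read_ndjson_polars("messy.ndjson", infer_schema_length = 0)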