R Recipe Streaming
We're working on a project utilizing an R notebook with some very large datasets and are wondering what the recommended approach is for working with a dataset that does not fit into memory.
We are big fans of the streaming API for Python - is there any equivalent for R?
Thanks!
Answers
-
Hello,
The dataiku R library allows you to read your data in by chunks. More information can be found in the docs here : https://doc.dataiku.com/dss/api/8.0/R/dataiku/reference/dkuReadDataset.html
-
rmoore Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Participant, Neuron 2023 Posts: 33 Neuron
Thanks @Triveni
. The ability to read a random sample from the dataset is terrific, but can you tell me if it's possible to read records from a specific offset? For example, I want to iterate through the entire set but only take 10,000 records at a time. -
Tanguy Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2023 Posts: 118 Neuron
+1 for this feature in R