Number of Rows
Hello,
I am using Dataiku DSS to manage data for the organization I currently work for. The plan is to collect more than 230,000 records (rows) using a Kobo form. I use the API to import the data from Kobo into Dataiku. Currently, I have more than 50,000 records in Kobo but am able to get only 30K rows into Dataiku.
Is there any way to get all rows in real time?
Best Answer
-
tgb417, Neuron
Welcome to the Dataiku community.
I was interested in your question here because I do a bunch of work with REST APIs as data sources.
I just wanted to share that I have had good results using the API Connect plugin from Dataiku. It can be set up to do pagination, which sounds like it may be helpful in your use case. With an enterprise license to Dataiku you can also set up Scenarios that wake up at intervals and pull data from your source. Here is the documentation:
https://doc.dataiku.com/dss/latest/scenarios/index.html
Here is some training material on Scenarios:
https://knowledge.dataiku.com/latest/mlops-o16n/automation/concept-scenarios.html
Answers
-
Alexandru, Dataiker
Hi @sefinew,
The 30k row limit is a Kobo API limitation. You will need to use pagination to retrieve your data in batches of up to 30k rows.
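As a rough illustration of the batching described above, here is a minimal Python sketch that keeps requesting pages until a short or empty page comes back. The names `iter_rows` and `make_kobo_fetcher` are made up for this example. The endpoint path, the `start` and `limit` query parameters, the `Token` authorization header, and the `results` key in the response are assumptions about the Kobo API and should be checked against your own server before relying on them.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator, List

PAGE_SIZE = 30000  # per-request cap mentioned in this thread


def iter_rows(fetch_page: Callable[[int, int], List[Dict]],
              page_size: int = PAGE_SIZE) -> Iterator[Dict]:
    """Yield every row by calling fetch_page(start, limit) batch by batch."""
    start = 0
    while True:
        rows = fetch_page(start, page_size)
        if not rows:
            break  # nothing left to read
        yield from rows
        if len(rows) < page_size:
            break  # a short page means this was the last batch
        start += len(rows)


def make_kobo_fetcher(server: str, asset_uid: str, token: str):
    """Hypothetical helper: build a fetch_page function for one Kobo form.

    Assumed, not confirmed in this thread: the URL layout, the start/limit
    parameters, the Token header, and the "results" key.
    """
    def fetch_page(start: int, limit: int) -> List[Dict]:
        query = urllib.parse.urlencode({"start": start, "limit": limit})
        url = f"{server}/api/v2/assets/{asset_uid}/data.json?{query}"
        request = urllib.request.Request(
            url, headers={"Authorization": f"Token {token}"})
        with urllib.request.urlopen(request) as response:
            return json.load(response)["results"]
    return fetch_page
```

In a Dataiku Python recipe, something like `rows = list(iter_rows(make_kobo_fetcher(server, asset_uid, token)))` could then be turned into a dataframe and written to the output dataset. The API Connect plugin's pagination settings aim at the same result without custom code.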
https://mixedanalytics.com/knowledge-base/import-kobotoolbox-data-to-google-sheets/#pagination
-
Thank you so much, this helped a lot.
-
Thank you, this has become so handy for me today.
-
This solution has saved me big time.