Build dataset from Python

Dataiker
Hello,

Is it possible to build a dataset through a Python method?



Cheers,
Clément
6 Replies
Dataiker
Yes, you can define a Python recipe with an output dataset but no input. For example, you could retrieve data from an external API, process it in a pandas DataFrame, and then save it to your output dataset using a dataiku method such as write_with_schema.
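A minimal sketch of such a recipe, assuming an output dataset named `api_output` (a hypothetical name) and a stand-in `fetch_rows()` in place of a real API call. `dataiku.Dataset` and `write_with_schema` come from the DSS Python API; the imports are done inside the function because the `dataiku` package is only available inside DSS.

```python
def fetch_rows():
    """Stand-in for an external API call (hypothetical sample data)."""
    return [
        {"id": 1, "name": "alpha"},
        {"id": 2, "name": "beta"},
    ]


def write_output(rows, dataset_name="api_output"):
    """Write rows to a DSS output dataset. Only works inside DSS."""
    import dataiku          # provided by the DSS Python environment
    import pandas as pd

    df = pd.DataFrame(rows)
    output = dataiku.Dataset(dataset_name)  # "api_output" is an assumed name
    # Writes the rows and sets the dataset schema from the DataFrame columns
    output.write_with_schema(df)
    return len(df)
```

In the recipe itself, the last line would then be `write_output(fetch_rows())`.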
Dataiker
Author
The function I was looking for is scenario.build_dataset(). I didn't ask it the proper way 😉
Thanks
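For readers landing here, a rough sketch of how that call might look in a custom Python step of a scenario. The dataset name `my_dataset` and the helper `build_in_scenario` are hypothetical; `Scenario` is imported from `dataiku.scenario`, which is only importable inside DSS, so the helper also accepts an existing scenario object.

```python
def build_in_scenario(dataset_name, scenario=None):
    """Build one dataset using the default build mode."""
    if scenario is None:
        # Only importable when running inside a DSS scenario
        from dataiku.scenario import Scenario
        scenario = Scenario()
    # With no build_mode given, the default described in this thread applies
    return scenario.build_dataset(dataset_name)
```

Inside a scenario step this would be called as `build_in_scenario("my_dataset")`.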
Dataiker
OK great! We have some examples in our doc that may help you: https://doc.dataiku.com/dss/latest/api/public/client-python/index.html#examples
Level 3
Regarding this particular function (scenario.build_dataset()), is there a list of accepted values for the build_mode keyword argument? I saw that RECURSIVE_BUILD is the default value, but is there an equivalent for non-recursive builds?
Dataiker
Here are the available options:
/** Rebuild what is required for dependencies */
RECURSIVE_BUILD,
/** Only rebuild the dataset directly, ignore the state of the dependencies */
NON_RECURSIVE_FORCED_BUILD,
/** Rebuild all recursively, ignore the state of the dependencies */
RECURSIVE_FORCED_BUILD,
/** Recursive build, but only build "missing" datasets, don't refresh out of date ones */
RECURSIVE_MISSING_ONLY_BUILD
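As an illustration, these values can be passed as strings through the build_mode keyword argument. The sketch below assumes a hypothetical helper `build_with_mode` and a dataset called `my_dataset`; the `scenario` argument is expected to be a `dataiku.scenario.Scenario` instance created inside DSS. The up-front check is only a convenience to catch typos before the job is submitted.

```python
# The four build modes listed above
BUILD_MODES = (
    "RECURSIVE_BUILD",
    "NON_RECURSIVE_FORCED_BUILD",
    "RECURSIVE_FORCED_BUILD",
    "RECURSIVE_MISSING_ONLY_BUILD",
)


def build_with_mode(scenario, dataset_name,
                    build_mode="NON_RECURSIVE_FORCED_BUILD"):
    """Build a dataset with an explicit build mode."""
    if build_mode not in BUILD_MODES:
        raise ValueError("Unknown build_mode: %r" % (build_mode,))
    # Rebuild only this dataset when the non-recursive mode is chosen
    return scenario.build_dataset(dataset_name, build_mode=build_mode)
```

In a scenario step: `build_with_mode(Scenario(), "my_dataset")` rebuilds only that dataset, ignoring the state of its dependencies.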
Level 3
Thanks a lot!