Hello to all Dataiku users,
I am writing to you because I have a problem. I hope you can help me. (a diagram of the problem is attached)
I'm looking for a way to keep a history of some datasets (the ones shown in green in the diagram) as CSV or Excel files on HDFS.
To explain: at each run of the flow zone, I would like the intermediate dataset and the final dataset to be saved to a subfolder in HDFS, for example one subfolder per run.
The goal is to be able to compare the versions produced by each run, because I may change things in my recipes between runs.
I don't know if I am very clear. Thanks for your help
Take a look at the following Dataiku features and see if they can help you.
This thread seems to cover some of it: https://community.dataiku.com/t5/Using-Dataiku/how-can-select-the-append-mode-in-a-dataset/td-p/3367
I know that I've been able to use these two features to achieve something like what I think you want to do.
There is also another discussion about doing something like this through Python.
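To give an idea of what the Python route could look like, here is a minimal sketch of the "one subfolder per run" idea. It is not taken from that discussion and it does not use the Dataiku API: it writes to a local directory with the standard library only. The function names (`snapshot_dir`, `write_snapshot`, `diff_snapshots`) and the `run_<timestamp>` folder naming are my own assumptions. In a Dataiku Python recipe you would presumably read the rows from your dataset and write to a managed folder stored on HDFS instead of a local path.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def snapshot_dir(base_dir, run_time):
    """Return the subfolder for one run, e.g. <base_dir>/run_20240131T101500Z."""
    # Hypothetical naming scheme: one folder per run, named after the UTC timestamp.
    return Path(base_dir) / run_time.strftime("run_%Y%m%dT%H%M%SZ")


def write_snapshot(base_dir, dataset_name, header, rows, run_time=None):
    """Write one dataset as CSV into the run's subfolder and return the file path."""
    run_time = run_time or datetime.now(timezone.utc)
    target = snapshot_dir(base_dir, run_time)
    target.mkdir(parents=True, exist_ok=True)
    path = target / f"{dataset_name}.csv"
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return path


def diff_snapshots(old_path, new_path):
    """Return (rows only in old, rows only in new), ignoring order and duplicates."""
    def load(path):
        with Path(path).open(newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the header line
            return {tuple(row) for row in reader}

    old_rows, new_rows = load(old_path), load(new_path)
    return sorted(old_rows - new_rows), sorted(new_rows - old_rows)
```

You would call `write_snapshot` once per dataset at the end of each run (passing the same `run_time` for the intermediate and the final dataset so they land in the same subfolder), and `diff_snapshots` on two files from different runs to see which rows changed. Note that the values come back as strings when the CSV is read again, and that the comparison assumes both snapshots have the same columns.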