Data historization

Kevin_dataiku8 · April 2023

Hello to all Dataiku users,

I have a problem that I hope you can help me with (a diagram of the problem is attached).

I'm looking for a way to keep a history of some datasets (the ones shown in green) as CSV or Excel files on HDFS. Ideally, each run would save a copy of these datasets to a subfolder in HDFS.

To explain: at each run of the flow zone, I would like the intermediate dataset and the final dataset to be stored in a subfolder in HDFS.
The goal is to be able to compare the versions produced by different runs, since I may change things in my recipes between runs.
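
To make this more concrete, here is a rough sketch of the behaviour I'm after, in plain Python. It does not use the Dataiku API, and it writes to a local path as a stand-in for HDFS. The function names (`snapshot_dataset`, `list_snapshots`), the `data.csv` file name, and the `<dataset>/<run id>/` folder layout are only my own assumptions for illustration.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def snapshot_dataset(rows, header, base_dir, dataset_name, run_id=None):
    """Write one dataset's rows to <base_dir>/<dataset_name>/<run_id>/data.csv."""
    # Assumption: a UTC timestamp is a good enough run id when none is given,
    # so that every run lands in its own subfolder.
    if run_id is None:
        run_id = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target_dir = Path(base_dir) / dataset_name / run_id
    # Fail instead of silently overwriting an earlier snapshot.
    target_dir.mkdir(parents=True, exist_ok=False)
    target_file = target_dir / "data.csv"
    with target_file.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return target_file


def list_snapshots(base_dir, dataset_name):
    """Return the run ids stored for a dataset, sorted by name."""
    dataset_dir = Path(base_dir) / dataset_name
    if not dataset_dir.exists():
        return []
    return sorted(p.name for p in dataset_dir.iterdir() if p.is_dir())
```

With something like this called once for the intermediate dataset and once for the final dataset at the end of each run, I could then open two run subfolders side by side and compare the files.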

I hope this is clear. Thanks for your help.