Added on May 8, 2019 4:45PM
Hello,
Each month, I have to compute a dataset that takes the previous month's dataset (M-1) and adds some new data to it.
I wonder how I could do this in Dataiku, since the recipe would have to take its own last output dataset (M-1) as its input.
I don't think it is currently possible to create a feedback loop in Dataiku: can you confirm?
How could I achieve this computation with Dataiku? The "append-only" feature is not a good answer, because before writing anything, I need to read last month's output dataset to know what will be new in the current month's output.
Best regards.
Hi @Liev,
I'm currently facing a similar issue. I need to "update" some data based on previous results. Could you please explain to me how to do what you suggested?
Thanks!
P.S. I'm quite new to Dataiku
Hi @tomtom,
If you don't mind working in SQL, another option besides the one described by @Liev is to use a SQL Script code recipe (to be clear, not a SQL Query recipe). SQL Script recipes don't need an input dataset, so you could easily do what you describe (albeit entirely in SQL code).
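As a rough illustration of the kind of statement such a script could contain, here is a minimal sketch that appends only the rows not already present in the output table. The table names (`monthly_result`, `current_month_data`) and columns are hypothetical, and the statement is run here against an in-memory SQLite database purely so the example is self-contained; in a real SQL Script recipe only the SQL itself would be used, and the exact syntax may differ on your database.

```python
import sqlite3

# Hypothetical statement for a SQL Script recipe: the output table is read
# and written by the same script, so no input dataset is needed for it.
APPEND_NEW_ROWS = """
INSERT INTO monthly_result (item_id, month, value)
SELECT c.item_id, c.month, c.value
FROM current_month_data AS c
WHERE NOT EXISTS (
    SELECT 1 FROM monthly_result AS r
    WHERE r.item_id = c.item_id AND r.month = c.month
);
"""


def run_demo():
    """Build two small tables, run the statement, return the resulting rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE monthly_result (item_id INTEGER, month TEXT, value REAL);
        CREATE TABLE current_month_data (item_id INTEGER, month TEXT, value REAL);
        INSERT INTO monthly_result VALUES (1, '2019-04', 10.0);
        INSERT INTO current_month_data VALUES (1, '2019-04', 10.0);
        INSERT INTO current_month_data VALUES (2, '2019-05', 20.0);
    """)
    conn.executescript(APPEND_NEW_ROWS)
    rows = conn.execute(
        "SELECT item_id, month, value FROM monthly_result ORDER BY item_id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_demo())
```

Because of the `NOT EXISTS` check, running the statement a second time with the same data adds nothing, which is what makes it safe to rebuild each month.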
Marlan
Because English is not my native language, I wouldn't be able to explain @Liev's solution in words any better. So I did the best thing I could: a recording showing how I solve a similar problem in exactly the way Liev mentioned.
The video doesn't have audio, and what I'm doing is:
1) create a dataset connected to a table called "daily_status_table"
2) open a dataset that contains a history of the daily statuses: the idea is to add new information to this history (the table "history_daily_track") by cross-matching it with "daily_status_table". So first I create a dataset using a connection to the table "history_daily_track" and name it "history_daily_track_as_input"
3) Then I create a second dataset that is also connected to the table "history_daily_track", but this time I name the dataset "history_daily_track_as_output"
4) In my case, I wanted to use a Python recipe to do the cross-match. So I create the recipe, give it "daily_status_table" and "history_daily_track_as_input" as inputs, and set the already created "history_daily_track_as_output" dataset as the output.
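The body of the Python recipe in step 4 could look roughly like the sketch below. This is only an illustration: the key columns (`item_id`, `day`) and the helper `update_history` are hypothetical, and the Dataiku read/write calls are shown as comments so that the merge logic can be run on its own with pandas. Note that the history is loaded fully into memory before anything is written back to the same table.

```python
import pandas as pd


def update_history(history, daily, keys):
    """Return `history` plus the rows of `daily` whose key is not yet in it."""
    # Keys already present in the history table.
    existing_keys = history[keys].drop_duplicates()
    # The indicator column marks daily rows that have no match in the history.
    marked = daily.merge(existing_keys, on=keys, how="left", indicator=True)
    new_rows = marked.loc[marked["_merge"] == "left_only", daily.columns]
    return pd.concat([history, new_rows], ignore_index=True)


# Inside the Dataiku recipe, the I/O would look roughly like this
# (dataset names from the steps above; key columns are an assumption):
#
#   import dataiku
#   history = dataiku.Dataset("history_daily_track_as_input").get_dataframe()
#   daily = dataiku.Dataset("daily_status_table").get_dataframe()
#   updated = update_history(history, daily, keys=["item_id", "day"])
#   dataiku.Dataset("history_daily_track_as_output").write_with_schema(updated)

if __name__ == "__main__":
    history = pd.DataFrame({
        "item_id": [1, 2],
        "day": ["2019-05-01", "2019-05-01"],
        "status": ["ok", "late"],
    })
    daily = pd.DataFrame({
        "item_id": [2, 3],
        "day": ["2019-05-01", "2019-05-02"],
        "status": ["late", "ok"],
    })
    print(update_history(history, daily, keys=["item_id", "day"]))
```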
Hope this helps!
Looks like your video disappeared, @Ignacio_Toledo, but if I understand correctly, this workaround relies on an external SQL table.
Could this solution also apply to an internal dataset inside my Dataiku project, one that I want to update at the start of my flow so that it carries both old and new parameters through the flow?
I can't find a way to do it like this.
Thank you,
Hi @bob,
Sadly, I no longer have the video. But yes, you can do something similar with internal filesystem datasets, by creating a "new" dataset and then editing its "Path" (Edit Anyway) to point to the file you would like to change.
I hope this helps!