-
Is there a way to roll back the changes that we have made to a flow?
Hi all, when our flow involves a lot of transformations, there can come a point where something goes wrong and we want to revert to the point where everything was working. Is there a way I can check what changes were made to the flow, and go back to the previous version or branch out from there?
-
Perform quick SQL query on SQL dataset from UI
For my workflow, it would be very helpful to have the option to perform a quick SQL query on a (SQL) dataset in the Flow from the UI, for example by right-clicking, for things like counting the distinct values of a specific column. Right now, I go to my separate SQL client to perform these quick checks, but that requires tool…
-
Recipe execution is taking a long time due to handling a large volume of data in Dataiku
We are experiencing long execution times for a recipe in Dataiku due to handling large datasets. Although we have implemented partitioning using a filter on a specific column, it still takes 1.5-2 hours to partition 30M records. Is there a more efficient way to handle and process this data quickly and effectively because…
-
How to implement a feedback loop on a dataset?
Hello, each month I have to compute a dataset that takes the previous month's dataset (M-1) and adds some data to it. I wonder how I could do it in Dataiku, since the recipe should take the last output dataset (M-1) as its input. I don't think it is currently possible to produce a feedback loop in Dataiku: do you…
-
Standalone Recipe Configuration
I am having difficulty configuring the Standalone Evaluate Recipe. It is not able to provide performance drift however I configure it. Below is what I see. It is also not clear what a reference dataset is needed for, as it is optional and there is no mention of it here.
-
Build several partitions in one go
Hi, I want to synchronize an Oracle table of 1 billion rows to another Oracle table. The query is very long and I end up with the following Oracle error: [11:06:27] [INFO] [dku.output.sql] - appended 178620000 rows, errors=0 [11:06:27] [INFO] [dku.utils] - Closing oracle.jdbc.driver.T4CConnection@7fc1cb4f [11:06:27] [INFO]…
-
How to add data to an existing dataset with Python?
I have a dataset named weather_data, and I want to add data to it every day. How can I do this with Python?
-
Is there a simple way to reapply flow/recipes to a second dataset?
Hi, I created a flow that sequences recipes for a first dataset. The goal is to create a prediction model at the end of the flow. Next, I need to apply my model to an input dataset that has the same schema as my first dataset. I could not figure out how to apply the whole flow to the second dataset, and I had to copy the flow steps…
-
Is there a way to propagate schema changes in a whole flow?
When changes occur to the schema of a dataset early in a flow, you would like to be able to ensure that these changes are reflected in datasets further down the flow.
-
Rename Output Dataset of Recipe
Can I rename the output dataset of a recipe? Thanks, Uli