-
Secure “Run As” Permissions for Visual Apps
Dataiku currently switches “Run As” permissions to the end user when the App instance is created. To work around this, developers create app instances with their own credentials and grant access, often requiring full edit permissions for users to modify variables. While this enables access to otherwise restricted data (the…
-
Is there a way to roll back the changes that we have made to a flow?
Hi all, when a flow involves a lot of transformations, there can come a point where something goes wrong and we want to revert to the point where everything was working. Is there a way to see all the changes that were made to the flow, and either go back to a previous version or branch out from there?
-
Perform quick SQL query on SQL dataset from UI
For my workflow, it would be very helpful to be able to run a quick SQL query on a (SQL) dataset in the Flow directly from the UI, for example by right-clicking it. Things like counting the distinct values of a specific column, etc. Right now, I go to my separate SQL client to perform these quick checks, but that requires tool…
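To illustrate the kind of quick check meant here, a minimal sketch follows. It uses Python's built-in sqlite3 with an in-memory database as a stand-in for the real SQL connection; the table name `orders` and the column `country` are made up for the example.

```python
import sqlite3

# In-memory database standing in for the real SQL connection (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table and sample rows, for illustration only.
conn.execute("CREATE TABLE orders (id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "FR"), (2, "DE"), (3, "FR"), (4, None)],
)

# Quick check 1: number of distinct values in one column.
# COUNT(DISTINCT ...) does not count NULLs.
distinct_countries = conn.execute(
    "SELECT COUNT(DISTINCT country) FROM orders"
).fetchone()[0]

# Quick check 2: total row count.
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(distinct_countries, row_count)
conn.close()
```

Against a real dataset, only the connection and the table/column names would change; the two queries stay the same.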
-
Recipe execution takes a long time when handling a large volume of data in Dataiku
We are experiencing long execution times for a recipe in Dataiku due to handling large datasets. Although we have implemented partitioning using a filter on a specific column, it still takes 1.5-2 hours to partition 30M records. Is there a more efficient way to handle and process this data quickly and effectively because…
-
How to implement a feedback loop on a dataset?
Hello, Each month, I have to compute a dataset that takes the previous month's dataset (M-1) and adds some data to it. I wonder how I could do it in Dataiku, since the recipe would have to take the last output dataset (M-1) as its input. I don't think it is currently possible to produce a feedback loop in Dataiku: do you…
-
Standalone Recipe Configuration
I am having difficulty configuring the Standalone Evaluate Recipe. It does not provide performance drift however I configure it. Below is what I see. It was also not clear what a reference dataset is needed for, as it is optional and there is no mention of it here.
-
Build several partitions in one go
Hi, I want to synchronize an Oracle table of 1 billion rows to another Oracle table. The query is very long and I end up with the following Oracle error: [11:06:27] [INFO] [dku.output.sql] - appended 178620000 rows, errors=0 [11:06:27] [INFO] [dku.utils] - Closing oracle.jdbc.driver.T4CConnection@7fc1cb4f [11:06:27] [INFO]…
-
How to add data to an existing dataset with Python?
I have a dataset named weather_data, and I want to add data to it every day. How can I do this with Python?
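One possible approach, as a minimal sketch: read the current rows, append the new ones with pandas, and write the combined result back. The column names `date` and `temp_c` are made up for the example. Inside Dataiku, the existing rows would come from `dataiku.Dataset("weather_data").get_dataframe()` and the result would be saved with `write_with_schema(combined)`, assuming your setup allows reading and writing the dataset from the same place. Here plain DataFrames stand in for both, so the snippet runs anywhere.

```python
import pandas as pd


def append_rows(existing: pd.DataFrame, new_rows: pd.DataFrame) -> pd.DataFrame:
    """Return `existing` followed by `new_rows`, using the columns of `existing`."""
    # Align the new rows to the existing column order; columns missing
    # from `new_rows` become NaN, extra columns are dropped.
    aligned = new_rows.reindex(columns=existing.columns)
    return pd.concat([existing, aligned], ignore_index=True)


# Stand-in for: existing = dataiku.Dataset("weather_data").get_dataframe()
existing = pd.DataFrame({"date": ["2024-01-01"], "temp_c": [3.5]})

# Hypothetical rows collected today.
today = pd.DataFrame({"date": ["2024-01-02"], "temp_c": [4.1]})

combined = append_rows(existing, today)

# Stand-in for: dataiku.Dataset("weather_data").write_with_schema(combined)
print(combined)
```

Rewriting the whole dataset each day is simple but gets slower as the data grows. For large datasets, an append-style write or daily partitions may be a better fit.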
-
Is there a simple way to reapply flow/recipes to a second dataset?
Hi, I created a flow that chains recipes for a first dataset. The goal is to create a prediction model at the end of the flow. Next, I need to apply my model to an input dataset that has the same schema as my first dataset. I could not figure out how to apply the whole flow to the second dataset, and I had to copy the flow steps…
-
Is there a way to propagate schema changes in a whole flow?
When changes occur to the schema of a dataset early in a flow, you would like to be able to ensure that these changes are reflected in datasets further down the flow.