-
Comments in Formula
User Story: As a creator of formulas in Dataiku, I would like to be able to add comments in formulas. This would let me leave information in formulas about why they are configured the way they are, increasing trust and communication, and it would make it possible to "comment out" chunks of code while…
-
Recipe execution is taking a long time due to handling a large volume of data in Dataiku
We are experiencing long execution times for a recipe in Dataiku due to handling large datasets. Although we have implemented partitioning using a filter on a specific column, it still takes 1.5-2 hours to partition 30M records. Is there a more efficient way to handle and process this data quickly and effectively because…
-
Remove the 30-character table/column name limitation for Oracle datasets
Oracle has extended the table/column name length limit from 30 characters to 128 bytes (as of 12.2, so this applies to 19c), but DSS (version 9) still enforces the old 30-character limit for Oracle datasets. I hope this limit can be removed in future versions since everybody is on 19c or higher now.
-
How to run integration tests on flows with Python recipes
I've recently started to use the "Run integration test" scenario step for testing. It's definitely some work to create the test reference datasets, but once set up it's great to be able to run this test after later code changes to confirm the process works as expected. Our flows typically mostly use SQL script recipes.…
-
Add Venn diagram and UpSet plot to Charts
I'm encountering some use cases where I want to easily visualize the number of records belonging to one or several groups and their overlap where group membership is spread over multiple 1/0 columns. Would be super handy to have Venn diagrams in the Charts or, sometimes even better, UpSet plots.
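Until something like this exists in Charts, the numbers behind an UpSet plot can be computed by hand, for example in a Python recipe or notebook. The following is only a minimal sketch in plain Python: the sample rows and the column names `A`, `B`, `C` are hypothetical, and `overlap_counts` is an illustrative helper, not a Dataiku function.

```python
from collections import Counter

# Hypothetical sample data: each record carries 1/0 membership flags
# for three groups (the column names are assumptions for illustration).
rows = [
    {"A": 1, "B": 0, "C": 0},
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 1, "C": 0},
    {"A": 0, "B": 1, "C": 1},
    {"A": 1, "B": 1, "C": 1},
    {"A": 0, "B": 0, "C": 0},
]


def overlap_counts(rows, group_columns):
    """Count records per exact combination of groups.

    Each key is the tuple of groups a record belongs to, which
    corresponds to one bar of an UpSet plot.
    """
    counts = Counter()
    for row in rows:
        members = tuple(col for col in group_columns if row[col] == 1)
        counts[members] += 1
    return counts


counts = overlap_counts(rows, ["A", "B", "C"])

# Print the combinations from most to least frequent.
for combo, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    label = " & ".join(combo) if combo else "(no group)"
    print(f"{label}: {n}")
```

Each key in `counts` is an exclusive combination (records in A and B but not C, and so on), which is the quantity an UpSet plot draws; a Venn diagram's regions can be read off the same counts.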
-
Select Columns Outside of Join Recipe
I would like to be able to select the columns of data outside of a join recipe. A couple of examples: 1 - Usage of "unmatched rows". The column selection occurs after the join and does not apply to data that isn't joined. In this case I am using both sets of data, so I need the option to select columns from both sets. 2 - Removal…
-
Have a Dataiku templating engine based on Python Mako or Jinja
Hi, Python-based templating engines like Jinja and Mako allow users to 'print' text in various formats, using conditional logic statements like if-else and for loops. I think Dataiku should offer an off-the-shelf Python-based templating engine that would allow users to upload their template(s) and pass a `context dict` to…
-
Edit default metrics and checks as a project-wide setting
When creating a new dataset, I practically always edit the default metrics and checks to run row counts after build. Ideally, I could define this from the project settings so that every new dataset created automatically has my desired metrics and checks configured. Of course, this doesn't apply to column-specific values,…
-
Option to rearrange output columns in join recipe
I would like to have the option to rearrange output columns in the join recipe. Perhaps by making the 'hamburger' icons on the Output panel draggable.
-
Properly implement support for Building Flow Zones in Scenarios and the Dataiku API
In Dataiku v12.0.0 a new feature was added that allows users to build flow zones from the flow UI: https://knowledge.dataiku.com/latest/data-preparation/pipelines/tutorial-build-modes.html#build-a-flow-zone This works well; however, this capability was never properly added to Scenarios or to the Dataiku API. In 12.1.0…