-
Ctrl + Enter to run a recipe
It would be great to be able to use the shortcut key combination Ctrl + Enter to run a recipe while in the recipe editor screen. This keyboard shortcut would be consistent with what you can do in both Jupyter Notebooks and in SQL Notebooks. I realize that there is a current keyboard shortcut for running a recipe (@ run)…
-
Add option to support non-pandas dataframes (e.g. polars) in Python recipes
Hi, There are many pandas alternatives. One that is new and very fast is polars. Polars is built on Rust so it is memory safe and runs in parallel by design. I use polars in one of my recipes but have to convert it to pandas to write the dataset. thx
-
Make Dataiku Managed Datasets Less Opinionated (aka stop dropping my tables)
After 11.4.0 (or earlier as we upgraded from 11.0.3), Dataiku not defaults to dropping and re-creating by default when using Dataset python APIs if for some reason the dataset schema and underlying table do not match. It will do this silently and pass jobs, where later we find out that we've lost our history in the base…
-
Perform quick SQL query on SQL dataset from UI
For my workflow it would be very helpful to have the option to perform a quick SQL query on a (SQL) dataset in the Flow from the UI. For example by right clicking. Things like count distinct values of a specific column, etc. Right now, I go to my separate SQL client to perform these quick checks, but that requires tool…
-
The recipe execution is taking long time due to handling a large volume of data in dataiku
We are experiencing long execution times for a recipe in Dataiku due to handing large datasets, while we have implemented partitioning using a filter on a specific column, it still takes 1.5-2 hours to partitioning 30M records. Is there a more efficient way to handle and process this data quickly and effectively because…
-
Comments in Formula
User Story: As a creator of formulas in Dataiku, I would like to be able to add comments in formulas, this would allow me to leave information in formulas about why formulas are configured the way that they are, increasing trust and communications, and it would allow the ability to "comment out" chunks of code while…
-
Remove the 30 char table/column name limitation for Oracle datasets
Oracle 19c has extended the table/column name length limit from 30 char to about 1000, but DSS (ver 9) still honors this old 30 char length limit for Oracle datasets. Hope this limit can be removed in future versions since everybody is on 19c or higher now.
-
How to run integration tests on flows with Python recipes
I've recently started to use the "Run integration test" scenario step for testing. It's definitely some work to create the test reference datasets but it once set up it's great to be able to run this test after later code changes to confirm the process works as expected. Our flows typically mostly use SQL script recipes.…
-
Add Venn diagram and UpSet plot to Charts
I'm encountering some use cases where I want to easily visualize the number of records belonging to one or several groups and their overlap where group membership is spread over multiple 1/0 columns. Would be super handy to have Venn diagrams in the Charts or, sometimes even better, UpSet plots.
-
Select Columns Outside of Join Recipe
I would like to be able to select the columns of data outside of a join recipe. A couple of examples: 1 - Usage of "unmatched rows". The column selection occurs after the join does not apply to data that isn't joined. In this case I am using both sets of data so need the option to select columns from both sets. 2 - Removal…