-
Idea: Include Associated Objects When Duplicating Dataset / Flows
Hello, when duplicating parts of a flow in Dataiku, the associated datasets are duplicated, but the charts built on those datasets are not included. Users then have to recreate or copy these charts manually, which is time-consuming and error-prone. Benefits: Including the duplicate feature for…
-
Improve Tasks in Project Todo List
I would really like to use the todo list accessible from a Dataiku DSS project home page to communicate with my team. However, it is not always easy to use. Here are a couple of improvement suggestions: Allow tasks to be moved / reordered. The text editing area has a fixed maximum height, which makes it difficult to edit,…
-
Invalid Scenario step logic condition should cause scenario failure
I have noticed a very dangerous behavior in the latest v12 release, although I believe it has been in DSS for a long while. DSS will not fail a scenario, or even raise a warning, if a scenario step has an invalid logic condition. For instance, I created a scenario step and set two variables: {"var1": 123, "var2": 456} Then I…
-
Being able to set User Settings via Python API
There are now more user settings than ever in the user's profile page (DSS/profile/). In v12.3.2 there are now 10 different email notification settings. We would like to be able to customise these to our preferred defaults via the Dataiku Python API. Currently the Dataiku Python API does not support this. Thanks
-
Allow nested flow zones
Hi, I use flow zones a lot and appreciate the value. Why not extend the capability and allow nested flow zones, i.e. a flow zone within a flow zone? thx
-
Easier Undo Actions in Dataiku DSS
Hello, Dataiku users. In my daily use of Dataiku, I find it very convenient overall, but the lack of an "undo" feature is often inconvenient. Currently, Dataiku does not have a direct "undo" button or a "Ctrl+Z" function to immediately revert mistakenly deleted steps or recipes. If such a feature were available, I believe…
-
Add support for Pandas 2.0
Pandas 2.0 can bring great performance improvements when using the pyarrow backend: https://towardsdatascience.com/pandas-2-0-a-game-changer-for-data-scientists-3cd281fcc4b4
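As a rough illustration of what "using the pyarrow backend" means in pandas 2.0, here is a minimal sketch. It assumes pandas >= 2.0 and pyarrow are installed in the code environment; the helper name `load_with_arrow_backend` is hypothetical and not part of pandas or Dataiku.

```python
import io


def load_with_arrow_backend(source):
    """Read CSV data into a DataFrame whose columns use Arrow-backed dtypes.

    Assumes pandas >= 2.0 with pyarrow installed; `dtype_backend` was
    added to `read_csv` in pandas 2.0.
    """
    import pandas as pd  # imported lazily so the helper can be defined without pandas

    # dtype_backend="pyarrow" asks pandas for ArrowDtype columns
    # instead of the default NumPy-backed ones.
    return pd.read_csv(source, dtype_backend="pyarrow")
```

Calling `load_with_arrow_backend(io.StringIO("a,b\n1,x\n")).dtypes` should then report Arrow-backed dtypes for both columns. Whether this is faster than the NumPy backend depends on the data and the operations involved.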
-
Add option to support non-pandas dataframes (e.g. polars) in Python recipes
Hi, There are many pandas alternatives. One that is new and very fast is polars. Polars is built on Rust so it is memory safe and runs in parallel by design. I use polars in one of my recipes but have to convert it to pandas to write the dataset. thx
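The conversion step described above can be sketched as follows. This is only an illustration of the current workaround, not a Dataiku feature: `as_pandas` and `write_frame` are hypothetical helper names, and `dataset` is assumed to be a `dataiku.Dataset`, whose `write_with_schema` takes a pandas DataFrame.

```python
def as_pandas(df):
    """Return a pandas DataFrame for either pandas or polars input.

    Duck-typed: polars.DataFrame exposes to_pandas(), while a
    pandas.DataFrame does not, so pandas input passes through unchanged.
    """
    to_pandas = getattr(df, "to_pandas", None)
    if callable(to_pandas):
        return to_pandas()
    return df


def write_frame(dataset, df):
    """Write a pandas or polars frame to a dataset.

    `dataset` is assumed to be a dataiku.Dataset (an assumption of this
    sketch); the frame is converted to pandas first if needed.
    """
    dataset.write_with_schema(as_pandas(df))
```

In a recipe this would be used as `write_frame(dataiku.Dataset("my_output"), polars_df)`, where the dataset name is a placeholder. The conversion still copies the data through pandas, which is the overhead the idea asks to remove.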
-
Calculate a single metric/check via the Dataiku API
Hi, It is currently not possible to calculate a single metric or check via the Dataiku API, although this is possible via the GUI. The following APIs exist: dataset.compute_metrics() and dataset.run_checks(), but they calculate all enabled metrics/checks, which may take a lot of time. So this idea is to provide an API to allow…
-
Improve Code Env Rebuild Process for On Prem Upgrades
When upgrading Dataiku in place, it can take 10+ hours to rebuild images and all code envs. Workarounds include baking core packages into base images or caching PyPI indices, but these bring only minor gains. Upgrade time scales linearly with the number of envs on nodes, which is unfortunate. We cannot execute an actual blue/green…