-
Properly implement support for Building Flow Zones in Scenarios and the Dataiku API
In Dataiku v12.0.0, a new feature was added that allows users to build flow zones from the flow UI: https://knowledge.dataiku.com/latest/data-preparation/pipelines/tutorial-build-modes.html#build-a-flow-zone This works well; however, this capability was never properly added to Scenarios or to the Dataiku API. In 12.1.0…
-
Smart indexing: recommend index based on downstream recipes
It would be helpful if on the index selection menu for a dataset, some smart values could be displayed based on downstream recipes, and if in the recipe creation views, upstream datasets could be reindexed to optimize them as well. For example, after I've created two join recipes downstream of a dataset, on the index…
-
Project folders should be able to manage permissions for underlying projects
Hi everyone, I’d like to suggest an improvement for Dataiku's folder and project permission management. I find it strange that Dataiku doesn’t inherit folder permissions into project permissions. When project folders are set up for different teams of entities, it shouldn't just be a visual organisation on the…
-
Unique key detector tool
What's your use case? More often than not, you have to deal with a dataset without knowing what makes a row unique. This can lead to misinterpreting the data, Cartesian products at join time, and other funny stuff. What's your proposed solution? This is a feature I haven't seen in any data preparation/ETL tool. The core feature is to detect…
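To make the idea concrete, here is a minimal sketch of what such a detector could do, assuming rows are available as plain Python dictionaries. The function name `find_unique_keys` and the `max_width` parameter are hypothetical and not part of Dataiku; this is a brute-force illustration, not the feature proposed in the post.

```python
from itertools import combinations


def find_unique_keys(rows, max_width=2):
    """Return the smallest column combinations whose values are unique across rows.

    Hypothetical helper for illustration only; assumes every row has the
    same columns and that all values are hashable.
    """
    if not rows:
        return []
    columns = list(rows[0].keys())
    found = []
    for width in range(1, max_width + 1):
        for combo in combinations(columns, width):
            # Skip supersets of a key that was already found (not minimal).
            if any(set(key) <= set(combo) for key in found):
                continue
            seen = set()
            unique = True
            for row in rows:
                value = tuple(row[c] for c in combo)
                if value in seen:
                    unique = False
                    break
                seen.add(value)
            if unique:
                found.append(combo)
    return found


rows = [
    {"order_id": 1, "line": 1, "sku": "A"},
    {"order_id": 1, "line": 2, "sku": "B"},
    {"order_id": 2, "line": 1, "sku": "A"},
]
candidate_keys = find_unique_keys(rows)
print(candidate_keys)
```

On this sample, no single column is unique, so the sketch reports two-column candidates. The cost grows quickly with `max_width`, so a real tool would likely work on a sample or push the distinct counts down to the database.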
-
Data Upsert
Currently, Dataiku offers the choice to either overwrite or append data during dataset updates, yet it lacks the capability for a user to perform an upsert on their data. An upsert operation, which merges the functions of updating and inserting, enables users to harmonize their existing dataset with new or modified data.…
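As an illustration of the upsert semantics described above, here is a minimal sketch in plain Python, assuming each row is a dictionary and that one column acts as the key. The `upsert` function and the `id`/`qty` columns are hypothetical examples, not an existing Dataiku API.

```python
def upsert(existing, incoming, key):
    """Merge incoming rows into existing rows on the given key column.

    Rows whose key already exists are updated; rows with a new key are
    inserted. Hypothetical helper for illustration only.
    """
    # Index the existing rows by key; copy them so the inputs are not mutated.
    merged = {row[key]: dict(row) for row in existing}
    for row in incoming:
        # Update the matching row in place, or start a new one if the key is unseen.
        merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


existing = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]
incoming = [{"id": 2, "qty": 9}, {"id": 3, "qty": 1}]
result = upsert(existing, incoming, key="id")
print(result)
```

Here row `2` is updated and row `3` is inserted, while row `1` is left untouched, which is the behaviour that neither overwrite nor append provides on its own.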
-
Create a code environment manager module
Hi, My group has dozens of code environments and uses various Python packages. It would be great if Dataiku had a dedicated module that would allow the user to specify a Python package and get its current version in each of the code environments I have access to. Second, it would be great if I could push a…
-
Allow post-join computed columns for columns that begin with underscores
When a column starts with an underscore, it cannot be used in a post-join computed column. For example, this column is defined for every record in the table: However, the preview fails when it is used in a post-join computed column formula: Other columns beginning with underscores that I know are fully defined have the…
-
Creating a QABot about Dataiku using "Dataiku Answers"
Hello, I thought it would be very helpful for all users if we could create a QABot about Dataiku using "Dataiku Answers." However, obtaining text data related to documentation or knowledge is not easy. According to Dataiku's website terms of service, web scraping is prohibited, so I gave up on preparing data through…
-
Add/Delete rows button for VisualEdit Plugin
Hi, I am using the visual edit plugin as a solution for editable datasets. Many of our projects at Convex use editable datasets in our production environment, and we have been looking for a while for a solution that allows users to edit datasets without needing write permissions. The visual edit plugin is really…
-
git - clear notebooks before commit
Several times we've encountered projects that could not be exported due to "git saturation." From memory, I believe the export limit for a project occurs when version control exceeds 2 GB (but I think this limit has recently been raised). After investigating, we found that this issue was caused by committing notebooks,…
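For reference, clearing a notebook before a commit can be sketched as below, assuming the notebooks are stored as standard Jupyter `.ipynb` JSON (nbformat 4), where code cells carry an `outputs` list and an `execution_count`. The function name `clear_outputs` is hypothetical, and this does not describe how Dataiku handles notebooks internally.

```python
import json


def clear_outputs(notebook):
    """Strip outputs and execution counts from the code cells of a notebook dict.

    Hypothetical helper for illustration; expects the parsed JSON of an
    nbformat 4 notebook and modifies it in place.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


# A tiny in-memory notebook standing in for a file loaded with json.load().
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}],
            "source": ["print('hi')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
cleaned = clear_outputs(notebook)
print(json.dumps(cleaned["cells"][1]["outputs"]))
```

Since outputs such as embedded images and large tables are usually what inflates the repository, removing them keeps only the source cells under version control.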