-
Create a code environment manager module
Hi, my group has dozens of code environments and uses various Python packages. It would be great if Dataiku had a dedicated module that would allow the user to specify a Python package and get its current version in each code environment that I have access to. Second, it would be great if I could push a…
-
Allow post-join computed columns for columns that begin with underscores
When a column starts with an underscore, it cannot be used in a post-join computed column. For example, this column is defined for every record in the table: However, the preview fails when it is used in a post-join computed column formula: Other columns beginning with underscores that I know are fully defined have the…
-
Creating a QABot about Dataiku using "Dataiku Answers"
Hello, I thought it would be very helpful for all users if we could create a QABot about Dataiku using "Dataiku Answers." However, obtaining text data related to documentation or knowledge is not easy. According to the terms of service of Dataiku's website, web scraping is prohibited, so I gave up on preparing data through…
-
Add/Delete rows button for VisualEdit Plugin
Hi, I am using the Visual Edit plugin as a solution for editable datasets. Many of our projects at Convex use editable datasets in our production environment, and we have been looking for a while for a solution that allows users to edit datasets without needing write permissions. The Visual Edit plugin is really…
-
git - clear notebooks before commit
Several times we've encountered projects that could not be exported due to "git saturation." From memory, I believe the export limit for a project occurs when version control exceeds 2 GB (but I think this limit has recently been raised). After investigating, we found that this issue was caused by committing notebooks,…
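For illustration only, here is a minimal sketch of what "clearing" a notebook before a commit could look like. It assumes the repository bloat comes from stored cell outputs and that the notebooks use the standard Jupyter nbformat 4 JSON layout; the function name `strip_notebook_outputs` is hypothetical and is not part of any Dataiku API.

```python
import json


def strip_notebook_outputs(notebook_json: str) -> str:
    """Return the notebook JSON with code-cell outputs and execution counts removed.

    Hypothetical helper: assumes the nbformat 4 layout (a top-level "cells" list).
    """
    nb = json.loads(notebook_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results, plots, tracebacks
            cell["execution_count"] = None  # reset the In[n] counter
    return json.dumps(nb, indent=1)
```

A hook run before each commit could pass every notebook file through such a function, so that only the source cells end up in version control.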
-
Improve Git configurations in general.
Hello, I believe it would be worthwhile to improve the interface for configuring the security of Dataiku nodes, both visually and by exposing more configuration parameters. A few examples: at the interface level, being able to choose groups from the list available on the instance; being able to manage segregation methods…
-
Maintain case of SQL table name when creating SQL datasets
Currently, when a SQL dataset is created, the name of the associated SQL table is set to PROJECTKEY_tablename regardless of the case of the SQL dataset name. It would be great if either the case of the dataset name was maintained in the SQL table name (so dataset ABC would result in a SQL table name of PROJECTKEY_ABC…
-
Support image segmentation in labelling tasks and Visual ML
I have a couple of use cases where I need to train image (instance) segmentation models (as opposed to predicting bounding boxes in object detection). I'd love for the ML labelling to support image segmentation approaches. For example using SAM (Segment Anything Model) to pre-segment images which can then be annotated by…
-
Support for 2-way partial dependence plots
I'd love to see support for 2-way partial dependence plots in model summary reports to get insights into how the interaction of two features affects the model's predictions. This would give some deeper insight into feature behavior in the model at hand. See here under 4.1.1 for the sklearn implementation 4.1. Partial Dependence and…
-
Select which test queries to run in API Designer
Oftentimes when testing and adjusting test queries, I only want a single query to be executed. It would be helpful to have selection boxes next to the test queries in API Designer to choose which queries to (re-)run when using the Run test queries command.