-
CHAR(1) columns turning into lengths of 2 with spaces in Exports
We have to use data preps on many database tables across different platforms that have columns defined as CHAR(1), just to keep them at a length of 1. Otherwise the exports change them to lengths of 2 with a space appended. So an indicator column containing only "Y" or "N" becomes "Y " or "N " (trailing space added). Using a data…
-
Auto Detect column types of strings TEXT only and not other
We continuously have to change auto-detected column types of Boolean, State, Country, Email, Phone, Natural Language, and possibly others to TEXT to work with the database platforms we use (Teradata, Snowflake, Databricks, SQL Server, Oracle). Can we add an option where Dataiku recognizes strings as TEXT only, thus helping us to…
-
Ability to terminate a custom Python scenario step with a Warning
We would like a custom Python scenario step to be able to end with a warning based on custom code logic. Generating a step failure is easy, as we can just abort the step (as shown in this link) or simply raise an exception in Python code. However, there is no option to end the custom Python scenario step with a warning outcome. The…
-
Anomaly Detection
In Visual ML, a new type of problem could be added: Anomaly Detection. It could include algorithms like Isolation Forest, Robust Covariance, Local Outlier Factor, One-Class SVM, Gaussian Mixture, Kernel Density, DBSCAN, OPTICS, Elliptic Envelope, etc. I would like to find anomalies in data using multiple algorithms…
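For reference, several of the algorithms listed are already available in scikit-learn, so a Visual ML task could plausibly wrap them. A minimal sketch using Isolation Forest on synthetic data (this is an illustration, not Dataiku's actual implementation):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)
X = rng.normal(0, 1, size=(100, 2))   # inliers clustered around the origin
X = np.vstack([X, [[8.0, 8.0]]])      # one obvious injected outlier

clf = IsolationForest(contamination=0.01, random_state=42)
labels = clf.fit_predict(X)           # 1 = inlier, -1 = outlier
print(labels[-1])                     # the injected point is flagged
```

A Visual ML "Anomaly Detection" task could expose `contamination` and the choice of algorithm as design-time settings, much like existing prediction tasks expose their hyperparameters.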
-
Dark Mode
Every developer needs a dark mode. A dark theme for the flow, datasets, and recipe configs would go a long way toward making Dataiku fit into workflows that involve many other dark-mode tools. Dataiku is definitely very bright when swapping from other tools that operate in dark mode. Extensions like Dark Reader do a pretty…
-
Marimo Notebooks Integration in DSS
I'd like to propose integrating Marimo notebooks alongside the existing Jupyter notebooks in DSS. Marimo is an innovative notebook environment that addresses several limitations of traditional Jupyter notebooks while maintaining compatibility. Here are some key advantages of Marimo notebooks: Code quality: Marimo…
-
Shared Secrets
As we've been developing plugins, and for other more exotic use cases, we've seen the need for shared secrets in Dataiku. Teams share account credentials, or plugins may rely on a group-based credential (e.g. Box JWT tokens for a "team account"). We hack around this using FTP-type connections and parsing their secrets, or…
-
Invalid Scenario step logic condition should cause scenario failure
I have noted very dangerous behavior in the latest v12 release, although I believe it has been in DSS for a long while. DSS will not cause a scenario failure, or even a warning, if you have invalid scenario step logic. For instance, I created a scenario step and set two variables: {"var1": 123, "var2": 456} Then I…
-
ETL
I propose developing a modern Lakehouse solution to accelerate the implementation of Data Science projects and save time. Connecting to various data sources, cleaning them, and putting them into formats suited to analysis and Machine Learning models can often be long and…
-
Sharepoint plugin update
Add a "modified by" column, which is available in SharePoint, to the SharePoint plugin. It would help me see who actually uploaded datasets, so we can track which users are adjusting the files.
-
Cartesian product detection in join recipe
What's your use case? A cartesian product is a common issue when joining datasets with a bad key. It's not always easy to detect, and users can even forget to check for it because they think they know their data. What's your proposed solution? I suggest an option to check whether there will be a cartesian product on the…
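For comparison, pandas already exposes this kind of guard through the `validate` argument of `merge`, which fails fast when a key assumed to be unique is not — a similar pre-join check is what's being proposed for the join recipe:

```python
import pandas as pd
from pandas.errors import MergeError

orders = pd.DataFrame({"cust_id": [1, 2, 3], "amount": [10, 20, 30]})
# Duplicate key on the lookup side: joining on it would fan rows out.
customers = pd.DataFrame({"cust_id": [1, 1, 2, 3],
                          "name": ["a", "a-dup", "b", "c"]})

try:
    orders.merge(customers, on="cust_id", validate="many_to_one")
except MergeError as e:
    print("join would fan out:", e)
```

A join-recipe equivalent could similarly verify key uniqueness on the chosen side(s) before running, and warn or fail instead of silently multiplying rows.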
-
Allow the window recipe to ignore partitioning.
I frequently use the window recipe along with partitioning. The window recipe currently has no option to ignore partitioning, which forces me to unpartition, then repartition, to get the recipe to work correctly. This causes a host of other issues. Please give us the option to ignore partitioning for recipes that frequently…
-
Show webapps and lib in bundle
Hello, I am working on projects based especially on webapps and libraries. It appears that when creating a bundle, webapps and libraries are not shown in its contents. My workaround is to download the bundle and manually check what's in it. Best regards, Simon
-
An easy way to download uploaded files
Hello all, Dataiku makes it very easy to upload files and create datasets based on these files. However, there isn't a clear and easy way (like a button) to download said files, while it can be pretty useful (here, the file is on my coworker's machine, he uploaded it and I want to retrieve it)… from what I understand, I…
-
Enable/Disable Reporters in Deployer
I would really like to be able to manually enable reporters in a deployed project, even if they are disabled in the design node. I imagine this would work exactly like triggers being enabled/disabled. My use case is that I have triggers that send emails to end users when certain scenarios successfully complete their…
-
Changing Data Meaning and/or Storage Type Creates Recipe Step
Data meanings and storage types for columns are currently editable in a dataset's Explore tab OR inside a Prepare recipe. From my experience, changing them in a Prepare recipe DOES allow them to be re-applied when the recipe is re-run. However, this does not visibly appear as a processing step in the Prepare recipe. Can that be…
-
Search ability in the discussion and Product ideas
That would allow people to check whether a question has already been asked, and maybe even whether a solution/resolution already exists. It would help reduce duplicates and make the site easier to work with.
-
Data Catalog - Database Stored Procedures
Is there a way currently to have the Dataiku catalog read database stored procedures? It would be great if the procedure name and definition were searchable through the tool. Bonus points if the tool could read the procedures to determine input tables and outputs!
-
Enhance the UX for "Require authentication" attribute in the settings panel for DSS web apps
As end user I get confused when the DSS admins have set the Authentication Mode to "Require authentication for all webapps (except whitelist)" on the /admin/general/security-other/ page in DSS. This is especially confusing when I am working on a web app and I can still uncheck/check "Requires authentication," in the web…
-
Managed Folder tree-like view and preview for readable files
Hello, my team and I make massive use of managed folders for basically anything we do in code. Indeed, we have mixed data types (parquet, json, pickle, xlsx, etc.) and we find managed folders easy to use and to work with in Python. However, we find it difficult to navigate a Managed Folder (DSS 13.5.4) and we think that…
-
Add option to support non-pandas dataframes (e.g. polars) in Python recipes
Hi, There are many pandas alternatives. One that is new and very fast is polars. Polars is built on Rust so it is memory safe and runs in parallel by design. I use polars in one of my recipes but have to convert it to pandas to write the dataset. thx
-
Align and enrich the editing features of different recipe code types
Hello, currently there are differences in the editing functions offered by the different code recipes. Of course, Jupyter notebooks are better off. Our users ask that basic functions, such as "find & replace", be available regardless of the type of code recipe used (SQL, Python, Shell, Notebooks). Thank you in advance.
-
Change Auto-Typing to an off or on option with default “Off”
Would like to have Auto-Typing set up as an option that can be turned off and on, with the default being "Off". This feature is changing my unit serial numbers (230836735F) to a Float (2.30836735E8), which causes me to lose records when joining on the unit serial number field in a following step. This will cause my…
-
Paste list in interim table filter
I would like to be able to copy a list of data from excel and paste it in the interim table filter when using the "Is any of the strings" option instead of having to enter them one at a time. Helps in troubleshooting workflows when you are looking for multiple records.
-
Ability to resize section within a view
Can we get the ability to resize the sections within any of the views in DSS? I like having the info in the different sections, but a lot of the time I wish I could shrink one down a little to get a better view of another section within the same view. Attached are a couple of views, but this would be nice on all of them.
-
Make Bar Chart widths adjustable
Right now, there's no good way to adjust bar chart width. I think this should be a formatting option for charts, since it's really hard to make bar charts look nice when there's a lot of data.
-
Prepare Recipe : Format date with custom format - Multiple columns
As of DSS v13.5, the "Format date with custom format" processor in the Prepare recipe does not allow applying the same format to multiple columns at a time. It would be very useful to perform this step across multiple columns, just like in the Parse date processor. Best regards,
-
Add Charts to dataikuapi
I'm working with Agentic workflows and stuff in Dataiku 14. One of the ideas I'm toying with is using Dataiku's API module to automatically make very basic Prepare recipes, for example. The next thing I planned on doing was looking at a dataset and creating charts in a similar way, but there's no way currently (that I know…
-
Export Recipe - Filter option DSS formula - Unable to use variables syntax ${}
Hello, using the export to folder visual recipe with Dynamic recipe repeat enabled, we are unable to perform a filter using a DSS formula and a variable defined from another dataset using the syntax ${variable_name}. It seems it is treated as a column name. The syntax variables["variable_name"] works perfectly. Using…
-
Show API deployment comments on "Last updates" tab
When you deploy an API or API version from the API Designer to the Local Deployer, you can enter a comment for this version. I would love to see this comment field represented on the new "Last updates" tab on the deployer for a service in order to have a quick overview on why each update to the service was made or what…
-
Permission to edit Scenarios only
We have use cases where we do not want someone to have edit or admin permissions to a project, but we would like them to be able to enable/disable scenarios entirely or the steps within them. This is in addition to being able to run the scenario of course. We are thinking about the case where there is a failure or need to…
-
Upper case bug
Have you guys done any changes to the Community site? Every reply I hit and every piece of code I copy/paste gets upper cased.
-
Allow nested flow zones
Hi, I use flow zones a lot and appreciate the value. Why not extend the capability and allow nested flow zones, i.e. a flow zone within a flow zone? thx
-
Enhance Excel output for "Export to folder"
I would like to request an enhancement to the "Export to folder" recipe when exporting datasets to Excel format. Specifically, it would be extremely helpful if the export could support: freezing the header row, and adding auto filters to the header. These features are commonly used in Excel for better data readability and…
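As a stopgap, a Python recipe can already apply both features after the export using openpyxl. A minimal sketch (workbook contents and file name are illustrative):

```python
# Post-process an exported Excel file: freeze the header row and add
# auto filters, the two features requested above.
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["name", "amount"])        # header row
ws.append(["alpha", 10])
ws.append(["beta", 20])

ws.freeze_panes = "A2"               # freeze everything above row 2 (the header)
ws.auto_filter.ref = ws.dimensions   # auto filters covering the full data range

wb.save("export.xlsx")
```

In a real flow the same two lines would be applied to the file produced by the export recipe (e.g. read from the output managed folder) rather than to a freshly built workbook.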
-
Ctrl + Enter to run a recipe
It would be great to be able to use the shortcut key combination Ctrl + Enter to run a recipe while in the recipe editor screen. This keyboard shortcut would be consistent with what you can do in both Jupyter Notebooks and in SQL Notebooks. I realize that there is a current keyboard shortcut for running a recipe (@ run)…
-
Make Dataiku Managed Datasets Less Opinionated (aka stop dropping my tables)
After 11.4.0 (or earlier, as we upgraded from 11.0.3), Dataiku now defaults to dropping and re-creating tables when using the Dataset Python APIs if for some reason the dataset schema and underlying table do not match. It does this silently and the jobs pass, and only later do we find out that we've lost our history in the base…
-
Advanced container settings for R code environments
Just as Python code environments can have Dockerfiles defined to be applied when building the code env, we need the same for R. We find ourselves modifying the base DSS image to accommodate some features needed in an R code environment.
-
Longer Connection text box on New Snowflake dataset page as needed
Request for the Connection text box on the New Snowflake dataset page to grow to fit the full connection text when it is longer than the current text box width. Our organization has a standard prefix for connections based on division/team/project, so I have multiple connections with the same prefix…
-
Support Panel Webapps in DataIku
Panel is a powerful #python #mlapp framework built on top of Bokeh (A high-level app and dashboarding solution for Python — Panel 0.12.4 documentation, holoviz.org). Panel could be supported similarly to Bokeh. The Panel server is built on top of Bokeh, so it should be very easy to support. For example, `bokeh serve` is…
-
Higher Resolution Geometry Viewing
Dataiku condenses geometries such as lines and polygons when visualizing them in the 'charts' tab of a dataset. Polygons for example are visualized with a minimal number of their vertices. For many of our use cases, we'd like to have the option to view these with their original vertices. This could be an optional toggle…
-
Option to display short descriptions on flow
Hi All, Forgive me if this has been discussed before, or if it is a polarizing topic as far as visual design goals. In evaluating Dataiku against other products, and ultimately deciding on Dataiku due to its many strengths, one thing my team lamented was that it was not possible to display descriptions of flow elements on…
-
Support user agent string with download recipe
I am trying to download data from some US government websites, and most will let the download recipe run without hassle, but a few, like the Bureau of Labor Statistics (BLS), will sometimes return a 403 Forbidden error when trying to download data. I can easily get around this by using CURL or anything else that…
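As an interim workaround in a Python recipe, the header can be set explicitly with the standard library, which is essentially what CURL does; the endpoint below is illustrative:

```python
# Set an explicit User-Agent on the request, mimicking a browser,
# which is what servers like BLS check before answering 403.
from urllib.request import Request, urlopen

req = Request(
    "https://download.bls.gov/pub/time.series/",        # example endpoint
    headers={"User-Agent": "Mozilla/5.0 (compatible; my-dss-flow/1.0)"},
)
# data = urlopen(req).read()   # uncomment to actually fetch
print(req.get_header("User-agent"))
```

The idea here would be for the download recipe itself to expose a configurable User-Agent (or arbitrary request headers) so no code workaround is needed.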
-
Perform quick SQL query on SQL dataset from UI
For my workflow it would be very helpful to have the option to perform a quick SQL query on a (SQL) dataset in the Flow from the UI. For example by right clicking. Things like count distinct values of a specific column, etc. Right now, I go to my separate SQL client to perform these quick checks, but that requires tool…
-
Add Ability to add updated/inserted time to UPSERT recipe
The new upsert recipe is great and has a lot of potential. It would be awesome to be able to add audit columns to this recipe: updated_at if the row was updated, created_at if it's a new row. For a new row, created_at would be now() and updated_at would be left blank; for an update, updated_at would be now(),…
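The intended semantics can be sketched in pandas (a hypothetical helper for illustration, not the recipe's actual implementation):

```python
import pandas as pd

def upsert_with_audit(existing, incoming, key, now):
    """Upsert `incoming` into `existing` on `key`, stamping audit columns:
    new rows get created_at=now, updated rows get updated_at=now."""
    existing = existing.set_index(key)
    incoming = incoming.set_index(key)
    is_new = ~incoming.index.isin(existing.index)

    incoming = incoming.copy()
    incoming["created_at"] = pd.NaT
    incoming["updated_at"] = pd.NaT
    incoming.loc[is_new, "created_at"] = now        # brand-new rows
    incoming.loc[~is_new, "updated_at"] = now       # updated rows
    # rows that already existed keep their original created_at
    incoming.loc[~is_new, "created_at"] = existing.loc[
        incoming.index[~is_new], "created_at"
    ]

    untouched = existing.drop(index=incoming.index, errors="ignore")
    return pd.concat([untouched, incoming]).reset_index()

existing = pd.DataFrame({"id": [1], "v": [10],
                         "created_at": [pd.Timestamp("2024-01-01")],
                         "updated_at": [pd.NaT]})
incoming = pd.DataFrame({"id": [1, 2], "v": [11, 20]})
result = upsert_with_audit(existing, incoming, "id", pd.Timestamp("2024-06-01"))
print(result)
```

Here row 1 keeps its original created_at and gets updated_at stamped, while row 2 gets created_at stamped and a blank updated_at, matching the behavior requested above.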
-
Add description to each IP set in IP Allowlist extension to identify them
Add a description to each IP set in the IP Allowlist extension to identify them. It would let us know for which organization, and by whom, an IP or IP range was added to the allowlist.
-
Provide ability to export Insights to images in Scenario Steps and the Python API
Currently only Dashboards can be exported to images in Scenario Steps (Export Dashboard step). While there is an export option in the GUI to export Insights to images this is not possible to do via Scenario Steps nor the Python API. So please add support for this. And also extend the Python API to allow Dashboard exports…
-
List managed folders from project
Currently, the only way to view which managed folders are associated with a project is to check the flow. However, on large projects, the flow is too large to load. (On my project of just 7,000 datasets, the flow crashes the browser tab). Datasets and recipes can be listed in the datasets and recipes pages, but managed…
-
Comments in Formula
User Story: As a creator of formulas in Dataiku, I would like to be able to add comments in formulas, this would allow me to leave information in formulas about why formulas are configured the way that they are, increasing trust and communications, and it would allow the ability to "comment out" chunks of code while…
-
Allow datasets to automatically reload schema when jobs run.
Currently, if columns in a dataset source are added or removed, jobs and scenarios that read from that dataset will fail until you reload the schema from the table, even when nothing downstream depends on the column changes. We would like to see a setting to allow datasets to always reload their schema when…
-
Ability to choose input data set for copied and pasted subflows
I often have to copy a portion of a flow to use in a different section. Having the ability to define my input data would make things more efficient and eliminate some human error. In the use case I have, I want to copy the portion circled in red and paste it to where the green circle is, but I don't want it to branch off…
-
Managed-datasets Metadata Synchronization Across Multiple DSS Instances
Use Case As an organization, we utilize three distinct DSS instances to manage our data analytic and ML workflows: * Self-Service and Data Products Consumption Instance: For end-users to consume data products, and work independently by having access to curated data. * Design and Development Instance: For designing and…
-
Ability to zip files from one folder to another
A business user in my team is trying to upload daily pulls to an SFTP. These files are created by separate Snowflake queries, then merged using a Merge Folder recipe. The business user would like to be able to zip these files into a single folder before uploading to SFTP (3rd party requirement). Currently they are using a…
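Pending a native step, the zipping itself is a few lines of standard-library Python that could run in a code recipe between the merged folder and the SFTP upload (paths illustrative):

```python
# Bundle every file in a source directory into one zip archive,
# the operation requested as a visual step above.
import os
import zipfile

def zip_folder(src_dir, zip_path):
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(os.listdir(src_dir)):
            full = os.path.join(src_dir, name)
            if os.path.isfile(full):
                zf.write(full, arcname=name)  # store without directory prefix

# zip_folder("/data/daily_pulls", "/data/outbox/daily_pulls.zip")
```

In DSS this would read from and write to managed folders rather than raw paths, but the core logic is the same; a visual "zip folder contents" recipe would remove the need for a business user to maintain this code.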
-
Multiple Tabs Within a Project
Hi, my name is Yusuf Afolabi, I work for Caterpillar as a data scientist. I use Flow Zones a lot and they have been very helpful. Recently, I have been seeing situations where navigating to a specific Flow Zone becomes problematic. Think of having 10 different Flow Zones in a project: you would have to scroll back and…
-
Feature Upgrade Request for Dataiku - related Vector DB, PII detection
Currently, we are proposing Dataiku as a Generative AI platform to one of our key clients. During the solution evaluation process, the client identified two key functionalities that are not yet supported by Dataiku. If these features are added, I am confident that they will significantly contribute to securing a new logo.…
-
Remove the 30 char table/column name limitation for Oracle datasets
Oracle has extended the table/column name length limit from 30 characters to 128 (since 12.2), but DSS (v9) still honors the old 30-character limit for Oracle datasets. Hope this limit can be removed in future versions, since everybody is on 19c or higher now.
-
Add PEP8 validation for Python code
It would be very helpful to have PEP8 formatting validation of Python code integrated into the UI in places where Python code is used. As this is a standard that our code, and I guess many peoples', needs to abide by. Most useful examples would be Python code recipes (maybe as an extra validate option) and Project…
-
Update Developer Guide - DSS Messaging Channel Send()
Specifically looking at this page: https://developer.dataiku.com/latest/api-reference/python/messaging-channels.html#dataikuapi.dss.messaging_channel.DSSMailMessagingChannel.send The “append” method for a list only allows for one argument, whereas the script above is trying to provide 3. My team and I recommend adding in a…
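The underlying Python point, for reference: `list.append` accepts exactly one argument, so a snippet passing three items needs `extend` (or repeated `append` calls). The file names below are invented for the illustration:

```python
files = []
# files.append("a.txt", "b.txt", "c.txt")  # TypeError: append() takes exactly one argument
files.extend(["a.txt", "b.txt", "c.txt"])  # correct: add several items at once
print(files)                               # ['a.txt', 'b.txt', 'c.txt']
```

The documentation fix proposed above would amount to replacing the multi-argument `append` call in the example with an `extend` of a list (or a single `append` per item).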
-
Support Plotly version 6 (and future ones) in the Jupyter Notebooks.
Plotly's latest stable version is now 6.0.0. However, the release notes announce: "Drop support for Jupyter Notebook version 6 and earlier [#4822]. The minimum supported version is now 7.0.0." Meanwhile, the current latest version of Dataiku uses a Jupyter notebook at version 6.4.9-dku-13.4-0. This idea might already be…
-
Shift + Scroll for horizontal scrolling
Please add the ability to press Shift + Scrollwheel for horizontal scrolling on the data exploration screen. When exploring larger datasets, it is so much easier to use this method of scrolling through the columns compared to moving your cursor to the vertical scrollbar and moving it manually. Also, most other…
-
Allow Scenario Trigger on dataset change for Google Sheets
The idea of a "Trigger on dataset change" is excellent, but it doesn't support all dataset types. It would help us a lot if it could trigger on dataset changes in Google Sheets.
-
Allow user email configuration in profile for on-demand email alerts when recipes complete
Hi, I know Dataiku has scenarios, and I use them daily. But when I am debugging flow recipes, it would be great if I could configure in my profile how Dataiku should email me. Then in a recipe I could just switch a radio button 'EMAIL ON COMPLETION' to ON and walk away. When I am developing in Dataiku, many of my flows…
-
Add Venn diagram and UpSet plot to Charts
I'm encountering some use cases where I want to easily visualize the number of records belonging to one or several groups and their overlap where group membership is spread over multiple 1/0 columns. Would be super handy to have Venn diagrams in the Charts or, sometimes even better, UpSet plots.
-
I want to adapt OpenVPN's functionality to the API
This is a request to extend OpenVPN functionality to APIs, rather than just DBs and storage with connectors. We would like to use OpenVPN to connect via API from an application operating in a closed network environment, but the current functionality does not allow us to connect. If you have the same request, we would be…
-
Scenario steps documented in Project Documentation
I see that scenarios are not documented in the auto-created project documentation. This feature would greatly help document how our automations are orchestrated.
-
Dashboard Improvements on Reference Lines
Looking for 2 Dashboard enhancements to Reference Lines tuning: 1) Reference Line value: currently, if "Constant" is chosen as the source, a manual value must be entered; it would be beneficial to allow the use of a global variable in the value field. 2) Ability to add an aggregation of a different dataset column: have a dynamic…
-
Select Columns Outside of Join Recipe
I would like to be able to select the columns of data outside of a join recipe. A couple of examples: 1 - Usage of "unmatched rows". The column selection occurs after the join and does not apply to data that isn't joined. In this case I am using both sets of data, so I need the option to select columns from both sets. 2 - Removal…
-
Ability to package environment/local variables with an API service
It would be very helpful if Dataiku allowed for packaging variables (either environment or local variables) with the capability to remap local variables as part of the deployment. Ideally there would also be an option to encrypt a variable. We have several API services that connect to other systems and require environment…
-
Have a dataiku templating engine based on Python mako or jinja
Hi, Python based templating engines like jinja and mako allow users to 'print' text in various formats, using conditional logic statements like if-else and for loops. I think dataiku should offer an off the shelf Python based templating engine that would allow users to upload their template(s) and pass a `context dict` to…
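For illustration, this is roughly what such a feature might wrap: rendering a jinja2 template against a context dict (the template and variable names are invented for the example):

```python
# Render a text template with conditionals and a loop from a context dict,
# the core capability the proposed feature would expose off the shelf.
from jinja2 import Template

template = Template(
    "Report for {{ project }}\n"
    "{% for ds in datasets %}- {{ ds }}\n{% endfor %}"
    "{% if failed %}WARNING: some builds failed{% endif %}"
)
context = {"project": "SALES", "datasets": ["orders", "customers"], "failed": False}
print(template.render(context))
```

In the proposed feature, the template would be user-uploaded and the context dict could be assembled from project/dataset variables at render time.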
-
Container configuration mapping in bundle deployments
This request is to add mapping options for container configuration in bundle deployments. This would allow for repointing in the event that the container configurations are not named the same in the design node vs automation nodes.
-
Per-user credentials in LLM connections
This request is to add support for per-user credentials in LLM connections. We use OpenAI and set up API keys per project so that we can track spend and budgets at the project level. Currently we have to set up a separate OpenAI connection for each project but ideally we would be able to pass the API key in either through…
-
[Samsung Fire & Marine] Action is needed to prevent logins from sessions logged in from other IPs
If a session logged in from IP address A is tampered with by a user logged in from IP address B through the developer tools in IE/Edge, the user information can be changed. This needs to be improved, as it risks allowing regular users to escalate their privileges to administrator status and manipulate…
-
[Samsung Fire & Marine] Need to improve the performance of Join and Group recipes
Samsung Fire & Marine Insurance has been using the statistical analysis tool SAS for the past several years. We are now trying to replace SAS with Dataiku, but there is a major obstacle: the performance of the Join and Group recipes. In our tests, when performing a join on about 10 million…
-
Extend the "Rebuild Code Studio templates" option to non-admins when updating a code environment
I was pleasantly surprised to discover the "Rebuild Code Studio templates" option in the "Containerized Execution" settings of a code environment. This feature enables the rebuilding of Code Studio templates that rely on a given code environment, effectively killing two birds with one stone. However, after investigating…
-
Improve scenario history : Make them usefull to compare step change
As we know, DSS has project version control as built-in Git-based version control, and a lite version of that for any recipe and editable object, known as "History", inside which we can check each commit and compare them easily. This works well for any kind of recipe with the various…
-
Configurable Timezone Display for Date Columns (Beyond UTC-only)
Current situation: Dataiku DSS has specific behaviors when handling time columns. When it recognizes time-related columns (e.g., date, timestamp_tz, or timestamp_ntz), it displays them as Date columns, rendering them in timestamp format (with both date and time components). A significant limitation is that Date columns…
-
[Samsung Fire & Marine][Security] Protect reusing login session with dss-access-token in browser
We have a security issue where it is possible to log in using the access-token value of a session already logged in from another IP/browser environment. We need this security issue solved, e.g. DSS could verify the client IP when checking the access token in the browser cookie.
-
Make "Table of Contents" visible in Wiki Edit mode
When editing wiki pages, the Table of Contents tree is not visible, making navigation of the wiki exceedingly difficult. I'm often toggling between View and Edit modes in order to get around. Visibility of the Table of Contents tree / section headers while in edit mode would make editing wiki pages much easier.
-
Remote kernel for notebook/recipe
Similar to Jupyter Notebook's remote kernel capability. The reason is that some libraries/capabilities are already available on a remote server/terminal; the code would execute on the remote kernel and only the results would be retrieved back from it.
-
Document conversion for RAG sources
Support libraries like docling (https://ds4sd.github.io/docling/) and markitdown (https://github.com/microsoft/markitdown).
-
Add Auto Syncing Mode for Code Studios
As an end user of DSS, I want the ability to auto-sync my changes in a Code Studio back to DSS so that I don't lose my work if the Code Studio crashes or automatically shuts down. Auto-syncing would ensure I don't lose any work if the Code Studio gets turned off or I forget to sync my changes back…
-
Enhance Dataiku - Snowflake interoperability
I have encountered several challenges involving column name handling and data type management while integrating with Snowflake. I pointed out a few things during a mission to integrate the platform on Snowflake; it was no mean feat, especially when it came to managing schemas and types. I noticed that the problem is becoming…
-
Enhance Code Studios Templates APIs to support automated administration
Hi, The current Code Studios Templates APIs (see links below) don't support certain capabilities that we need. We would like to have Python APIs to: Obtain the full list of build IDs related to a Code Studios Template as shown in the Code Studios Template ⇒ Build History ⇒ Show Build drop down. This is needed to be able to…
-
Dataiku Input recipe using Encoding
Please add an option for character encoding when specifying input files. Even if I want to specify UTF-8, I can't do so in Dataiku, and I have to use another tool to convert the character encoding before the file can be imported into a Dataiku flow.
-
End to End Possibility for Dev-Ops Implementation with Best Practices
Right now Dataiku possesses a lot of unique abilities for developing scalable ML / Deep Learning / GenAI algorithms. In addition, Dataiku facilitates collaborative development using flow zones, etc. There are also a lot of data quality checks and metrics to support operational efficiency and drift detection. But all…
-
Move objects and Zones on the Flow
I think it would be very useful to be able to move objects and flow zones around in the flow display. It appears Dataiku determines where each recipe, dataset, etc go in the flow and I cannot edit that. I have used Alteryx in the past and it had that ability, which I liked. It allows me to organize the flow however I see…
-
Vertical Scrolling for Datasets
It would boost my productivity significantly if I could use "Shift" + "Scrollwheel" to vertically scroll. Instead of finding the small scrollbar in the bottom of the dataset each time.
-
Automated alerts from Unified Monitoring on bundle or API endpoint failure
We find the Unified Monitoring (UM) feature extremely useful, as it allows us to see the health of our bundle and real-time prediction APIs. However, there is no way to be alerted if a deployment fails or an API endpoint is down. We currently have some Python scripts that scrape the data from UM and then identify any…
-
googlesheets plugin feature: Ignore top n rows on import
Reading a google sheet with the plugin currently requires that header columns are in row 1. In the wild, a lot of users don't build sheets like that and the data begins some rows down the sheet. I suggest to add a feature of ignoring a number of top rows to correctly set the header row and table data.
-
Show all data points in Charts even when "Automatic" date range and Zoom are enabled
As per the title, the Chart seems to truncate at the 3rd or 2nd last data point when the Automatic date range and Zoom are enabled (see first screenshot). If I instead select the actual granularity of my data (e.g. Day from the X axis Date Range drop-down menu) then the last data points appear on the chart, BUT I lose the…
-
automatically remove obsolete versions of code envs on Automation Nodes
This product idea addresses the issue discussed here: Remove old versioned environments and kernels after importing a bundle. Recently, we faced an issue with one of our automation nodes. New deployments were failing because there was no space left on disk. Upon investigation, we discovered that a code environment was…
-
Allow configure CORS on API Deployed Settings
When an API is deployed with Dataiku, the CORS security layer does not allow consuming the service: since it is hosted on a different server, web browsers throw a CORS error. Please add an entry in the Deployment Settings to let us configure (or disable) CORS according to our needs. Settings references for Apiman:…
-
Calculate a single metric/check via the Dataiku API
Hi, it is currently not possible to calculate a single metric or check via the Dataiku API, while this is possible via the GUI. The following APIs exist: dataset.compute_metrics() and dataset.run_checks(), but they calculate all enabled metrics/checks, which may take a lot of time. So this idea is to provide an API that allows…
-
Edit default metrics and checks as a project-wide setting
When creating a new dataset, I practically always edit the default metrics and checks to run row counts after build. Ideally, I could define this from the project settings so that every new dataset created automatically has my desired metrics and checks configured. Of course, this doesn't apply to column-specific values,…
-
Add Seldon to deployment options
One of the deployment options in our company is Seldon (Seldon, MLOps for the Enterprise.). It would be great if Dataiku had the option to deploy directly to Seldon, the way deployment to K8, AWS, Databricks or Azure is now possible. Seldon in general deploys MLflow artefacts.
-
Add support for Snowflake key-pair authentication
Currently Dataiku only supports "User/Password" and "OAuth" for the Snowflake connection. Snowflake has key-pair authentication (https://docs.snowflake.com/en/user-guide/key-pair-auth), which I would like to use for my service accounts.
-
Field grouping
Hello, a lot of the time, when you have a dataset, you want to know whether there is a group of fields that work together. That can help to normalize (de-join) your data model for dataviz, for performance, or to simplify your analysis. Example:
order_id | item_id | label | model_id | length | color | amount
1 | 1 | A | 10 | 15…
-
Add/Delete rows button for VisualEdit Plugin
The Visual Edit plugin is really good and does what we need; however, I am having to code a Dash webapp that uses the Visual Edit class and add my own functions for adding and deleting rows in the editable dataset. It would be great if the plugin could be configured to do this instead of requiring custom code.
-
Present Connection Credentials in sorted order
Connection Credentials are currently presented to the user in what seems to be a random order. That combined with the fact that we have about 70 connections defined makes it pretty difficult to find a particular connection. It would be great if the connections were sorted as they are in most other areas of DSS. For…