-
CHAR(1) columns turning into lengths of 2 with spaces in Exports
For many database tables on different platforms that have columns defined as CHAR(1), we have to use data preps to keep those columns at a length of 1. Otherwise the exports change them to a length of 2 by appending a space. So an indicator column containing only "Y" or "N" becomes "Y " or "N " (with an added space). Using a data…
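The padding described above can be illustrated, and worked around, with a minimal plain-Python sketch. It is not a Dataiku API: the function name `strip_char_padding`, the column name `active_flag`, and the sample rows are all hypothetical, and it assumes the exported rows are available as a list of dictionaries.

```python
def strip_char_padding(rows, char_columns):
    """Return copies of the rows with trailing spaces removed from the given columns.

    `rows` is assumed to be a list of dicts (one per exported record) and
    `char_columns` the names of the columns defined as CHAR(1) in the source.
    """
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for col in char_columns:
            value = new_row.get(col)
            if isinstance(value, str):
                # Remove only trailing spaces, i.e. the padding added on export
                new_row[col] = value.rstrip(" ")
        cleaned.append(new_row)
    return cleaned


# Hypothetical exported rows where a CHAR(1) flag came back padded to length 2
rows = [
    {"id": 1, "active_flag": "Y "},
    {"id": 2, "active_flag": "N "},
]
cleaned = strip_char_padding(rows, ["active_flag"])
```

Only the listed columns are touched, so trailing spaces in free-text columns are preserved.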
-
Auto-detect string column types as TEXT only
We continually have to change auto-detected column types (Boolean, State, Country, Email, Phone, Natural Language, and possibly others) to TEXT to work with the database platforms we use (Teradata, Snowflake, Databricks, SQL Server, Oracle). Can we add an option where Dataiku recognizes strings as TEXT only, thus helping us to…
-
Ability to terminate a custom Python scenario step with a Warning
We would like a custom Python scenario step to be able to generate a warning based on custom code logic. Generating a step failure is easy: we can just abort the step (as shown in this link) or raise an exception in the Python code. However, there is no option to end the custom Python scenario step with a warning outcome. The…
-
Anomaly Detection
In Visual ML, a new type of problem could be added: Anomaly Detection. It could include algorithms like Isolation Forest, Robust Covariance, Local Outlier Factor, One-Class SVM, Gaussian Mixture, Kernel Density, DBSCAN, OPTICS, Elliptic Envelope, etc. I would like to find anomalies in data using multiple algorithms…
-
Dark Mode
Every developer needs a dark mode. A dark theme for the flow, datasets, and recipe configs would go a long way toward making Dataiku fit into workflows that involve many other dark mode tools. Dataiku is definitely very bright when switching from other tools that operate in dark mode. Extensions like Dark Reader do a pretty…
-
Marimo Notebooks Integration in DSS
I'd like to propose the integration of Marimo notebooks alongside the existing Jupyter notebooks in DSS. Marimo is an innovative notebook environment that addresses several limitations of traditional Jupyter notebooks while maintaining compatibility. Here are some key advantages of Marimo notebooks: Code quality: Marimo…
-
Shared Secrets
While developing plugins, and for other more exotic use cases, we've seen the need for shared secrets in Dataiku. Teams share account credentials, or plugins may rely on some group-based credential (e.g. Box JWT tokens for a "team account"). We hack around this using FTP-type connections and parsing their secrets or…
-
Invalid Scenario step logic condition should cause scenario failure
I have noticed a very dangerous behavior in the latest v12 release, although I believe it has been in DSS for a long while. DSS will not cause a scenario failure, or even a warning, if you have an invalid scenario step logic condition. For instance, I created a scenario step and set two variables: {"var1": 123, "var2": 456} Then I…
-
ETL
I propose developing a modern Lakehouse solution to speed up the delivery of Data Science projects and save time. Connecting to the various data sources, cleaning them, and putting them into formats suited to analysis and Machine Learning models can often be long and…
-
Sharepoint plugin update
Add a "modified by" column, which is available in SharePoint, to the SharePoint plugin. It would help me see who actually uploaded datasets, so we can track which users are modifying the files.