-
Censored Regression
Modelers often encounter censored data, that is, data where a value is only known to fall above or below some threshold (>x or <y). There are some typical approaches to building a regression model in these cases, but they are currently not available in Visual ML. As such, interval-censored regression, Tobit regression, or censored…
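To illustrate what a Tobit-style approach involves, here is a minimal sketch of the log-likelihood for left-censored data under a normal error model. It is not a Visual ML feature or a Dataiku API; the function names (`normal_pdf`, `normal_cdf`, `tobit_log_likelihood`) and the single shared `lower` censoring bound are assumptions made for the example.

```python
import math


def normal_pdf(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    # Standard normal cumulative distribution, via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_log_likelihood(y, mu, sigma, lower):
    """Log-likelihood of observations left-censored at `lower` (hypothetical helper).

    y     : observed values (censored ones are recorded as `lower`)
    mu    : model predictions for the latent, uncensored outcome
    sigma : standard deviation of the normal error term
    """
    total = 0.0
    for y_i, mu_i in zip(y, mu):
        if y_i <= lower:
            # Censored: we only know the latent value is at or below `lower`.
            total += math.log(normal_cdf((lower - mu_i) / sigma))
        else:
            # Uncensored: ordinary normal log-density.
            total += math.log(normal_pdf((y_i - mu_i) / sigma) / sigma)
    return total
```

A fitting routine would maximize this quantity over the regression coefficients behind `mu` and over `sigma`; that optimization step is left out here.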
-
Charts: ability to have better sorting for Sankey
Hello, Here is my current Sankey: What I would like is better sorting for the 4th and 5th levels, so that the green part is on top, followed by alphabetical order. As of today, here are the choices I have: So what I suggest: - multi-level sorting - manual and alphabetical ordering. Best regards, Simon
-
Swagger/OpenAPI for DSS public REST API
Hello, We have this documentation about the public REST API: https://doc.dataiku.com/dss/api/14/rest/ However, a standard format would be better so that we can use it in our tools. OpenAPI (formerly Swagger) seems a good candidate for that. Best regards, Simon
-
CHAR(1) columns turning into lengths of 2 with spaces in Exports
We have to use data prep steps on many database tables across different platforms whose columns are defined as CHAR(1), in order to keep them at a length of 1. Otherwise, the exports change them to a length of 2 with a space appended. So an indicator column containing only "Y" or "N" becomes "Y " or "N " (with an added space). Using a data…
-
Auto-detect string columns as TEXT only and not other types
We continually have to change auto-detected column types (Boolean, State, Country, Email, Phone, Natural Language, and maybe others) to TEXT to work with the database platforms we use (Teradata, Snowflake, Databricks, SQL Server, Oracle). Can we add an option where Dataiku recognizes strings as TEXT only, thus helping us to…
-
Ability to terminate a custom Python scenario step with a Warning
We would like a custom Python scenario step to be able to generate a warning based on custom code logic. Generating a step failure is easy, as we can just abort the step (as shown in this link) or raise an exception in Python code. However, there is no option to end the custom Python scenario step with a warning outcome. The…
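For reference, the failure path mentioned above can be sketched in plain Python as follows. The function `check_threshold` and its parameters are hypothetical, and no DSS API is used; the sketch only relies on the post's statement that an uncaught exception fails the step.

```python
def check_threshold(value, limit):
    """Hypothetical custom check for a Python scenario step."""
    if value > limit:
        # An uncaught exception ends the step as a failure;
        # there is no equivalent way to signal "warning" here.
        raise ValueError(f"value {value} exceeds limit {limit}")
    return value
```

The request is for a comparable way to signal a warning outcome instead of a failure.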
-
Anomaly Detection
In Visual ML, a new type of problem could be added: Anomaly Detection. It could include algorithms like Isolation Forest, Robust Covariance, Local Outlier Factor, One-Class SVM, Gaussian Mixture, Kernel Density, DBSCAN, OPTICS, Elliptic Envelope, etc. I would like to find anomalies in data using multiple algorithms…
-
Invalid Scenario step logic condition should cause scenario failure
I have noticed a very dangerous behavior in the latest v12 release, although I believe it has been in DSS for a long while. DSS will not cause a scenario failure, or even a warning, if a scenario step has an invalid logic condition. For instance, I created a scenario step and set two variables: {"var1": 123, "var2": 456} Then I…
-
ETL
I propose developing a modern Lakehouse solution to speed up the delivery of Data Science projects and save time. Connecting to the various data sources, cleaning them, and putting them into formats suited to analysis and Machine Learning models can often be long and…
-
SharePoint plugin update
Add the "Modified By" column, which is available in SharePoint, to the SharePoint plugin. It would help me see who actually uploaded datasets, so we can track which users are modifying the files.