-
Add Seldon to deployment options
One of the deployment options in our company is Seldon (Seldon, MLOps for the Enterprise). It would be great if Dataiku had the option to deploy directly to Seldon, the way deployment to Kubernetes, AWS, Databricks or Azure is now possible. Seldon generally deploys MLflow artefacts.
-
Put stuff in the API logging without sending in the response
We often run into situations where we'd like to log information from the internal workings of our API, such as intermediate results for checking, without having to send it out in the response. It would be wonderful to have an option to send things to the API log without them having to be part of either the request or the response.
-
Configurable Timezone Display for Date Columns (Beyond UTC-only)
Current Situation: Dataiku DSS has specific behaviors when handling time columns. When it recognizes time-related columns (e.g., date, timestamp_tz, or timestamp_ntz), it displays them as Date columns, rendering them in timestamp format (with both date and time components). A significant limitation is that Date columns…
-
Data Quality Check: Valid Time Series
When working with time series, it would be nice to have a quality check that ensures that time steps meet a minimum definition (e.g. weekly on Monday), have no duplicates, and have no missing steps.
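To make the request concrete, here is a minimal sketch of what such a check could compute, in plain Python. The function name `check_time_series` and its `step` and `weekday` parameters are made up for illustration; they are not existing Dataiku settings.

```python
from collections import Counter
from datetime import date, timedelta


def check_time_series(dates, step=timedelta(days=7), weekday=0):
    """Report duplicates, off-schedule dates and missing steps.

    `step` and `weekday` (0 = Monday) are illustrative parameters,
    not Dataiku options. Pass weekday=None to skip the weekday check.
    """
    counts = Counter(dates)
    # Dates that appear more than once.
    duplicates = sorted(d for d, n in counts.items() if n > 1)
    # Dates that do not fall on the expected day of the week.
    off_weekday = sorted(
        d for d in counts if weekday is not None and d.weekday() != weekday
    )
    # Walk consecutive distinct dates and collect the steps in between.
    unique = sorted(counts)
    missing = []
    for prev, nxt in zip(unique, unique[1:]):
        expected = prev + step
        while expected < nxt:
            missing.append(expected)
            expected += step
    return {"duplicates": duplicates, "off_weekday": off_weekday, "missing": missing}


# Weekly series on Mondays with one duplicate and one gap.
sample = [date(2024, 1, 1), date(2024, 1, 8), date(2024, 1, 8), date(2024, 1, 22)]
report = check_time_series(sample)
```

A check like this would pass only when all three lists in the report are empty.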
-
Add Granger Causality tests to the stats worksheet
I'd really like to be able to test Granger causality between two or more time series. Would it be possible to add it to the stats page, so that I can pick 2 or more input columns and have the GC calculated between each pairing and each ordering, over a specified range of lags?
-
"Fold" processors in visual recipe - Implement In-Database engine
Today, fold processors require the DSS engine because they are not supported for in-database processing, which forces Dataiku designers to implement SQL recipes to perform fold operations. Most modern databases support "unpivot" syntax, which enables fold processors to be converted to SQL.…
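As an illustration of how a fold could be expressed in SQL, here is a small sketch that generates the query in Python and runs it on SQLite. It uses a portable `UNION ALL` formulation rather than native "unpivot" syntax, because SQLite has none. The helper `fold_sql` and the output column names `folded_name` and `folded_value` are hypothetical, not how DSS generates SQL.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def fold_sql(table, key_columns, fold_columns,
             name_col="folded_name", value_col="folded_value"):
    """Build a UNION ALL query that folds `fold_columns` into two columns.

    Hypothetical helper; expects at least one key column.
    """
    keys = ", ".join(quote_ident(c) for c in key_columns)
    selects = []
    for col in fold_columns:
        label = col.replace("'", "''")  # escape for use as a string literal
        selects.append(
            f"SELECT {keys}, '{label}' AS {quote_ident(name_col)}, "
            f"{quote_ident(col)} AS {quote_ident(value_col)} "
            f"FROM {quote_ident(table)}"
        )
    return "\nUNION ALL\n".join(selects)


# Small in-memory demo table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, q1 REAL, q2 REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [(1, 10.0, 20.0), (2, 30.0, 40.0)])

query = fold_sql("sales", ["id"], ["q1", "q2"]) + "\nORDER BY 1, 2"
rows = conn.execute(query).fetchall()
```

On a database with native unpivot support, the same two output columns could come from a single statement instead of one `SELECT` per folded column.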
-
API: load a request from a Postman/Bruno collection
Hello all, Configuring a REST API by hand can be painful. On the other hand, the API world makes heavy use of tools such as Postman or Bruno (an open-source clone), which allow easy testing, debugging... I use one every time I have to work on a REST API, and then I try to translate it to the final tool. Both tools offer "collection", a set…
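For illustration, here is a rough sketch of reading requests out of a Postman collection export, assuming the v2.1 JSON layout (nested `item` lists, with each `request` holding a `method`, a `url` and a `header` list). The function `iter_requests` and the sample collection are made up for this example, and Bruno's own file format is not covered.

```python
import json


def iter_requests(items, folder=""):
    """Yield (path, method, url, headers) for each request in a collection.

    Assumes the Postman v2.1 export layout; hypothetical helper.
    """
    for item in items:
        name = item.get("name", "")
        path = f"{folder}/{name}" if folder else name
        if "item" in item:
            # Folders hold their own list of items: recurse into them.
            yield from iter_requests(item["item"], path)
        elif "request" in item:
            req = item["request"]
            if isinstance(req, str):
                # A request may be given as a bare URL string.
                req = {"url": req}
            url = req.get("url", "")
            if isinstance(url, dict):
                url = url.get("raw", "")
            headers = {h["key"]: h["value"] for h in req.get("header", [])}
            yield path, req.get("method", "GET"), url, headers


# Made-up collection with one folder and two requests.
collection_json = """
{
  "info": {"name": "Demo"},
  "item": [
    {"name": "Users", "item": [
      {"name": "List", "request": {
        "method": "GET",
        "url": {"raw": "https://example.com/users"},
        "header": [{"key": "Accept", "value": "application/json"}]}}
    ]},
    {"name": "Ping", "request": "https://example.com/ping"}
  ]
}
"""
collection = json.loads(collection_json)
requests_found = list(iter_requests(collection["item"]))
```

Each tuple carries enough information (method, URL, headers) to pre-fill a REST configuration form instead of typing it by hand.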
-
ADBC connectivity: faster columnar storage queries
Hello all, ADBC is a database connection standard (like ODBC or JDBC) but specifically designed for columnar storage (so databases like DuckDB, ClickHouse, MonetDB, Vertica...). This is typically the kind of thing that can make Dataiku way faster. Here is a benchmark made by the team at DuckDB: 38x improvement…
-
Properly implement support for Building Flow Zones in Scenarios and the Dataiku API
In Dataiku v12.0.0, a new feature was added that allows users to build flow zones from the flow UI: https://knowledge.dataiku.com/latest/data-preparation/pipelines/tutorial-build-modes.html#build-a-flow-zone This works well; however, this capability was never properly added to Scenarios or to the Dataiku API. In 12.1.0…
-
Smart indexing: recommend index based on downstream recipes
It would be helpful if on the index selection menu for a dataset, some smart values could be displayed based on downstream recipes, and if in the recipe creation views, upstream datasets could be reindexed to optimize them as well. For example, after I've created two join recipes downstream of a dataset, on the index…