New Year, New Features! Dataiku 11.3 Is Hot Off the Presses

ChristinaH
Dataiker

Happy New Year! I hope you all enjoyed the holidays and are refreshed and excited for 2023. Last month, we delivered a bundle of holiday treats with Dataiku 11.2, but ’tis the season that keeps on giving. A fresh batch of goodies is already available for you to explore in Dataiku 11.3.

Along with plenty of improvements to existing capabilities, this product update delivers brand-new features for data analysts, data scientists, and ML engineers and operators. Read on to discover our latest additions and, as always, check out all the details in the release notes.

For Data Analysts

The “Anti-Join” Dataset

Sometimes, you want to inspect the records that are unmatched in a join operation. In addition to producing the output dataset containing the records that meet the join conditions, the join recipe can now optionally output a dataset that returns the unmatched rows for further analysis.

image9.gif
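
To make the idea concrete, here is a minimal sketch in plain Python of a join that also returns the rows it could not match. It is an illustration only, not how the join recipe is implemented: the function name `join_with_unmatched`, the sample `orders` and `customers` rows, and the choice to keep unmatched rows from the left input are all assumptions made for this example.

```python
def join_with_unmatched(left_rows, right_rows, key):
    """Inner-join two lists of dicts on `key` and also return unmatched left rows."""
    # Index the right side by key value so each lookup is a dictionary access.
    right_index = {}
    for row in right_rows:
        right_index.setdefault(row[key], []).append(row)

    matched, unmatched = [], []
    for row in left_rows:
        partners = right_index.get(row[key])
        if partners:
            # One output row per matching pair, as in a regular inner join.
            for partner in partners:
                matched.append({**row, **partner})
        else:
            # No partner on the right side: this row goes to the "anti-join" output.
            unmatched.append(row)
    return matched, unmatched


# Hypothetical sample data: one order references a customer that does not exist.
orders = [
    {"customer_id": 1, "amount": 40},
    {"customer_id": 2, "amount": 15},
    {"customer_id": 4, "amount": 99},
]
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]

matched, unmatched = join_with_unmatched(orders, customers, "customer_id")
print(unmatched)
```

In this toy case, the order for customer 4 is the only row that lands in the unmatched output, which is exactly the kind of record you would want to investigate further.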

Share and Export Filtered Views

When interactively analyzing data by applying filters to a dashboard, it can be useful to export a subset of data or share a specific view with the broader team. A new “Copy to URL” button on dashboards preserves your selected filter parameters in a URL that you can easily send to others. When they click the link, all the filters are applied so they can review the dashboard insights in exactly the same state. You can also export the filtered dashboard as a PDF.

image6.png

The same is true for filtered datasets in the explore view: simply head to the actions menu to download the subset of data or export the view as another dataset with the filters preserved.

image10.png

Shortcut to Visual Previews for Images and Geospatial Data

When working with non-tabular data types, it’s useful to visually inspect records in a more interpretable, multimedia format. After all, file paths and geo-coordinates may be practical for storage, but a picture is worth a thousand words.

In addition to the full-table image view released with Dataiku 11.2, you can now use the convenient “preview” action to inspect individual images or geolocations. Simply press Shift+V on a highlighted cell, or use the right-click menu and choose “Preview”, to see a pop-up of the selected image or a map plotting the geopoint or geometry. Pro tip: this shortcut is also useful with tabular data for viewing or copying the full value of a long string or array.

image1.gif

For Data Scientists

Deep Neural Network as a Native Algorithm

The Deep Neural Network is a new algorithm available in Dataiku AutoML for both regression and classification tasks. Based on the multilayer perceptron (MLP) architecture, this Deep Neural Network leverages state-of-the-art libraries for a robust, efficient, and scalable model.

image5.png

Example of a Multilayer Perceptron architecture, with 3 hidden layers of 4 neurons each
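
For readers who like to see the mechanics, the sketch below runs a forward pass through a multilayer perceptron with 3 hidden layers of 4 neurons each, using only the Python standard library. It is a toy illustration, not Dataiku's implementation: the 5 input features, the single sigmoid output, the ReLU activations, and the random untrained weights are all assumptions chosen for this example.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create random weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """Apply one fully connected layer: activation(W . x + b) per neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def build_mlp(n_features, hidden_sizes, n_outputs, seed=0):
    """Chain layers so each layer's input size matches the previous output size."""
    rng = random.Random(seed)
    sizes = [n_features, *hidden_sizes, n_outputs]
    return [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]


def forward(layers, features):
    """ReLU on the hidden layers, sigmoid on the output layer."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = dense(activations, weights, biases, relu)
    weights, biases = layers[-1]
    return dense(activations, weights, biases, sigmoid)


def count_parameters(layers):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


# Assumed shape: 5 input features, hidden layers of 4, 4, 4 neurons, 1 output.
mlp = build_mlp(n_features=5, hidden_sizes=[4, 4, 4], n_outputs=1)
probability = forward(mlp, [0.2, -1.0, 0.5, 3.0, 0.0])[0]
print(count_parameters(mlp), probability)
```

With these assumed sizes the network has 24 + 20 + 20 + 5 = 69 trainable parameters. Training adjusts those values, and the searchable architecture mentioned below is about exploring choices such as the number and width of the hidden layers.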

Dataiku’s implementation offers:

  • A searchable architecture and a customizable learning process.
  • Regularization techniques to avoid overfitting.
  • GPU support for training acceleration.

Once trained, a Deep Neural Network can be deployed, evaluated, and scored just like any other algorithm. 

image4.png

Feature-level View and Search in the Feature Store

A new feature-level view in Dataiku’s Feature Store makes it easier to search for and explore specific features you can reuse in your own projects and models. This view gives additional information and context about both the feature itself and the feature group it belongs to.

image8.gif

Visual Time Series Forecasting: Evaluate Beyond Forecast Horizon

When performing time series forecasting, the forecast horizon is the length of time each forecast covers, which usually also determines how frequently new forecasts need to be generated. The evaluation period usually corresponds to how frequently the model is expected to be retrained.

If you don’t plan to retrain the model as frequently as the forecast horizon, you can now specify an evaluation period longer than the horizon.

For example, let's say you develop a seven-day sales forecast that generates the predicted sales for the upcoming week, but you know that you won't assess and retrain the model more frequently than once a month. In this case, you could specify an evaluation period that is four horizons long in order to capture and evaluate model performance metrics for the entire month across multiple forecast cycles.

image3.png
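
The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration of evaluating across several forecast cycles, not a description of how Dataiku computes its metrics: the helper names, the synthetic `sales` series, and the naive "repeat last week" forecast that stands in for a trained model are all assumptions.

```python
def evaluation_windows(n_points, horizon, n_horizons):
    """Return (start, end) index pairs, one per forecast cycle, covering the
    last horizon * n_horizons points of the series."""
    eval_length = horizon * n_horizons
    if eval_length > n_points:
        raise ValueError("evaluation period is longer than the series")
    first = n_points - eval_length
    return [
        (first + i * horizon, first + (i + 1) * horizon)
        for i in range(n_horizons)
    ]


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic daily sales for 8 weeks: a weekly pattern plus a slow upward trend.
sales = [100 + day + (day % 7) * 10 for day in range(56)]

horizon = 7  # seven-day forecast
windows = evaluation_windows(len(sales), horizon, n_horizons=4)  # 28-day evaluation

maes = []
for start, end in windows:
    # Stand-in for a real model: forecast each week by repeating the previous week.
    forecast = sales[start - horizon:start]
    maes.append(mean_absolute_error(sales[start:end], forecast))

print(windows)
print(maes)
```

Evaluating over four consecutive seven-day windows yields one error value per forecast cycle, so you can see whether performance holds up across the whole month rather than for a single week.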

For ML Engineers & Operators

Evaluation-Ready Event Logs

The Dataiku event server is a built-in way for teams to capture the prediction logs from all models across Dataiku API nodes and store them in a single location. Prediction logs are primarily used to monitor model performance with evaluate recipes and model evaluation stores.

With the Dataiku 11.3 update, the evaluate recipe is now able to automatically process those prediction logs. This removes the need for a preceding prepare recipe and simplifies the setup of production model monitoring.

image11.png

Prediction Drift Detection Without Ground Truth

When a model is deployed into production, it’s prudent to begin monitoring immediately to detect meaningful shifts in its behavior. However, for many use cases, the ground truth is either unavailable or not available fast enough to provide timely feedback and the model performance metrics that would alert operators to a degrading model.

Even in the absence of ground truth labels, you can still take advantage of prediction drift analysis in Dataiku’s model evaluation store with just the prediction logs. The fugacity table and density chart show the differences between the current prediction distribution and the reference distribution from when the model was trained.

image2.png

Together with input data drift analysis, which also does not require ground truth, these two tools provide critical information and early warning for ML operators so they can proactively keep live models healthy.
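
As a rough illustration of why no ground truth is needed, the sketch below compares the share of each predicted class in a reference set of predictions with the share in recent predictions. It is a simplified, fugacity-style comparison written for this post and not the exact computation used by the model evaluation store; the function names and the `churn`/`stay` sample predictions are assumptions.

```python
from collections import Counter


def class_proportions(predictions, classes):
    """Share of rows predicted as each class."""
    counts = Counter(predictions)
    total = len(predictions)
    return {c: counts[c] / total for c in classes}


def prediction_drift_table(reference_preds, current_preds):
    """Compare predicted-class proportions between two sets of predictions.
    Only model outputs are used, so no ground truth labels are required."""
    classes = sorted(set(reference_preds) | set(current_preds))
    ref = class_proportions(reference_preds, classes)
    cur = class_proportions(current_preds, classes)
    return [
        {
            "class": c,
            "reference": ref[c],
            "current": cur[c],
            "difference": cur[c] - ref[c],
        }
        for c in classes
    ]


# Hypothetical predictions: at training time vs. from recent prediction logs.
reference = ["churn"] * 20 + ["stay"] * 80
current = ["churn"] * 35 + ["stay"] * 65

table = prediction_drift_table(reference, current)
for row in table:
    print(row)
```

Here the model predicts "churn" for 35% of recent rows versus 20% in the reference data. A shift of that size does not prove the model is wrong, but it is the kind of early signal worth investigating before ground truth arrives.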

Try it Out for Yourself!

Dataiku 11.3 is available for download or upgrade today, including all of these features and improvements that were developed with users like you in mind. Stay tuned for future product updates as we progress toward Dataiku 12, our next major version!

To get the full details about Dataiku 11.3, check out the release notes.
