Dataiku 12.2 Summer Special: A New Wave of Product Features and Enhancements
What’s New with Machine Learning in 12.2?
In July’s release of Dataiku 12.1, we unveiled a series of snackable product updates. This month, we bring you Dataiku 12.2! From powerful cloud model integrations to multi-treatments for causal ML to user-defined aggregations and copy/paste for charts, there’s something to please everyone. Read on to learn about the new features and enhancements in this minor update that are sure to brighten your summer days.
An All-Inclusive Package With External Cloud Models in Dataiku
Dataiku centralizes monitoring and governance across the MLOps lifecycle for natively developed models and projects. However, we recognize that not every model is developed and deployed within Dataiku.
While the platform can already integrate with MLflow models, Dataiku can now create proxies for external models already deployed with Amazon SageMaker, Azure ML, or Google Vertex AI. Users of all technical skill levels can then see, explain, evaluate, and score new data with these cloud models using Dataiku’s visual interface.
Multiple Treatments for Causal ML
Dataiku 12 brought us Causal ML, shifting the focus from purely predicting outcomes to optimizing them given a particular treatment. With Dataiku 12.2, you can now run this analysis for multiple treatments, enabling marketing teams to assess the impact of various campaigns or pharmaceutical clinicians to assess the likely effects of various drug and dosage combinations.
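To make the idea of comparing several treatments concrete, here is a minimal Python sketch. It is not Dataiku’s implementation: it simply estimates, for each treatment, the difference in mean outcome versus a control group, which is only a reasonable effect estimate when treatments were assigned at random. Causal ML models typically go further and estimate effects per individual. The campaign names, the records, and the `average_effects` helper are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_effects(records, control="control"):
    """Return {treatment: mean outcome - control mean outcome}.

    Hypothetical helper, not a Dataiku API. Assumes random assignment.
    """
    outcomes = defaultdict(list)
    for treatment, outcome in records:
        outcomes[treatment].append(outcome)
    baseline = mean(outcomes[control])
    return {
        treatment: mean(values) - baseline
        for treatment, values in outcomes.items()
        if treatment != control
    }


# Hypothetical data: (campaign shown, 1 if the customer converted else 0)
records = [
    ("control", 0), ("control", 0), ("control", 1), ("control", 0),
    ("email", 1), ("email", 0), ("email", 1), ("email", 0),
    ("sms", 1), ("sms", 1), ("sms", 1), ("sms", 0),
]

effects = average_effects(records)
# Pick the treatment with the largest estimated lift over control
best = max(effects, key=effects.get)
print(effects, best)
```

With more than one treatment, the question changes from "does the campaign work?" to "which campaign works best, and by how much?", which is what the final `max` call expresses.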
Visualization Enhancements to Pack in Your Beach Bag This Summer!
Dataiku is always working to simplify the process that designers go through to share data products with their audiences and business stakeholders, and 12.2 is no exception.
Visualization: Radar Charts and Increased Flexibility with Custom Aggregations
To provide more flexibility when developing visualizations, both data builders and explorers can now use user-defined aggregation functions within the Charts tab. The benefit is that these calculations no longer have to be defined at the dataset level or added via a separate recipe in the Flow; custom aggregations can be edited and updated without changing the underlying data!
The enhancements do not stop there, with a new visualization offering now available: radar charts. With this chart type, designers or explorers can compare multiple quantitative variables using the familiar drag-and-drop visual interface.
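As a sketch of why aggregation-level calculations matter, the Python below contrasts a ratio computed per row and then averaged (what a precomputed column would give) with a ratio of aggregated sums (the kind of calculation a chart-level custom aggregation expresses). The column names and numbers are hypothetical, and this is plain Python rather than Dataiku’s formula syntax.

```python
from collections import defaultdict

# Hypothetical order rows: (region, revenue, cost)
rows = [
    ("north", 100.0, 50.0),
    ("north", 1000.0, 900.0),
    ("south", 200.0, 100.0),
]


def margin_by_region(rows):
    """Aggregate first, then divide: sum(revenue - cost) / sum(revenue)."""
    revenue = defaultdict(float)
    profit = defaultdict(float)
    for region, rev, cost in rows:
        revenue[region] += rev
        profit[region] += rev - cost
    return {region: profit[region] / revenue[region] for region in revenue}


def mean_row_margin_by_region(rows):
    """Divide per row, then average the per-row margins."""
    margins = defaultdict(list)
    for region, rev, cost in rows:
        margins[region].append((rev - cost) / rev)
    return {region: sum(m) / len(m) for region, m in margins.items()}


aggregated = margin_by_region(rows)
row_level = mean_row_margin_by_region(rows)
print(aggregated["north"], row_level["north"])
```

The two results differ for the "north" region because the large, low-margin order dominates the aggregated ratio but counts as just one of two rows in the per-row average.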
Replicate Charts Across Datasets
Have you ever needed to create the same visualization on several datasets? This new feature lets users copy charts to the clipboard and paste them into other datasets, so a chart can be replicated without any manual rework.
Build Principal Component Analysis (PCA) Outputs Into the Flow
Data teams working with wide datasets want to be able to reduce the dimensionality to better visualize and understand the data. As you may know, PCA cards are available in the interactive statistics worksheets, but what if you wanted to leverage these results directly from the Flow for further analysis? With Dataiku 12.2, you can now export the PCA statistics card as a visual PCA recipe with options for projections, eigenvalue, and eigenvector output datasets.
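For readers who want to see what those three outputs contain, here is a minimal NumPy sketch of PCA via an eigendecomposition of the covariance matrix. It illustrates the general technique only, not the internals of the Dataiku recipe; the `pca` function and the sample data are hypothetical.

```python
import numpy as np


def pca(data, n_components):
    """Return (projections, eigenvalues, eigenvectors) for the top components."""
    centered = data - data.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]  # one component per column
    projections = centered @ eigenvectors
    return projections, eigenvalues, eigenvectors


# Hypothetical dataset: 6 rows, 3 columns, the first two strongly correlated
data = np.array([
    [2.0, 4.1, 1.0],
    [3.0, 6.0, 0.9],
    [4.0, 7.9, 1.1],
    [5.0, 10.2, 1.0],
    [6.0, 11.8, 0.8],
    [7.0, 14.1, 1.2],
])

projections, eigenvalues, eigenvectors = pca(data, n_components=2)
# Share of total variance captured by each retained component
explained = eigenvalues / np.cov(data, rowvar=False).trace()
print(projections.shape, explained)
```

The projections are the rows re-expressed in the reduced space, the eigenvalues give the variance carried by each component, and the eigenvectors describe how each component combines the original columns.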
Get Rid of Those Post-Holiday Blues With These Governance and Visibility Enhancements!
These three additions are designed to deliver better visibility into datasets and processes, both within a single project and across the entire AI portfolio, and are useful whether you are a data team member, governance manager, or data engineer.
Popular Dataset Recommendations & Bulk-Adding Datasets to Collections
The revamped data catalog with data collections simplifies the process of finding the right data for analytics and prediction projects. In order to promote discoverability, Dataiku automatically displays the most used and popular datasets so users can easily find and add them to their own projects and data collections.
AI Governance Enhancements from Kanban to Sign-Off
Governance is key to scaling AI across a business. With Dataiku 12.2, we introduce many enhancements to Dataiku Govern that streamline its monitoring and audit capabilities.
Reviewers are provided with additional flexibility within the approval process to edit and refine comments after they have been added, add multiple feedback items per review, and have multiple reviewers for a given feedback group. In addition, reviewers can be notified by email when there is a change to the final approval status to ensure they maintain visibility into a project’s progress.
With Dataiku Advanced, there are additional process-based enhancements:
- A validation step can be used to determine whether a sign-off is mandatory before the workflow can continue.
- Sign-offs can be scheduled as part of regulatory requirements such as the EU AI Act. This way, you can continue to ensure that your models are running as expected in production.
- The Kanban view supports both standard and custom workflows so product owners can get a dashboard view of all initiatives and their statuses in a single page.
Build All Record Counts in the Flow
Flow views in Dataiku provide a holistic overview of the project from start to finish through different lenses. You can inspect and filter a project Flow by tags, connections, recipe engines, last modified date, and many more views. Now, an additional view lets you visualize the Flow by the record count of each dataset. Within this view, you can compute the metrics for all datasets at once to quickly see how data volumes change throughout your project.
Want to learn more about Dataiku 12.2?
Visit the official release notes for more details and reference documentation on these product enhancements.