Dataiku 10.0.4 Update: Hundreds of New Features, Enhancements & Improvements
In November of 2021, Dataiku 10.0 delivered exciting new capabilities that accelerate time to value and empower people across diverse functions to engage in data projects and responsibly deliver and manage AI applications.
This week, Dataiku delivered a jam-packed update that includes hundreds of new features, performance enhancements, and improvements across all product capability categories. Continue reading to find out more!
Key Highlights of the Dataiku 10.0.4 Update
Key highlights of Dataiku 10.0.4 include enhancements for data exploration and preparation tasks, new options for coders, and more features and guardrails for visual machine learning.
Data Exploration and Preparation Enhancements
Users of all types will benefit from improvements to the data ingestion, exploration, and navigation experiences. For example, to upload flat files to a project, analysts can simply drag and drop them directly onto the Flow as an alternative to navigating through the data menu. When manipulating and exploring data, it’s common to want to know how many records are in the whole dataset; this value is now prominently displayed, along with a visual indicator of whether you’re seeing the complete data or only a sample in visualizations.
For faster data preparation, additional formula functions streamline common tasks like creating unique IDs and replacing category values with custom labels. Teams managing production pipelines will also appreciate the new preparation processor that enriches a dataset with timestamp information about when it was last updated.
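To make these preparation tasks concrete, here is a minimal sketch in plain Python of what they amount to: assigning a unique ID to each record, swapping category codes for custom labels, and stamping each record with a timestamp. It is not Dataiku's formula language or processor API. The `prepare` function, the `LABELS` mapping, and the `size` column are hypothetical names chosen for illustration, and the timestamp here is simply the time of the run, which is one possible reading of "last updated."

```python
import uuid
from datetime import datetime, timezone

# Hypothetical mapping from raw category codes to custom display labels.
LABELS = {"S": "Small", "M": "Medium", "L": "Large"}


def prepare(rows, labels=LABELS, now=None):
    """Return copies of rows with a unique ID, a relabeled category, and a timestamp."""
    # Assumption: "last updated" is taken to be the time of this preparation run.
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    prepared = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row["id"] = str(uuid.uuid4())  # random unique identifier
        # Codes missing from the mapping pass through unchanged.
        new_row["size"] = labels.get(row["size"], row["size"])
        new_row["last_updated"] = stamp
        prepared.append(new_row)
    return prepared


rows = [{"size": "S"}, {"size": "L"}, {"size": "XL"}]
result = prepare(rows)
```

In this sketch, unknown codes such as `XL` are kept as-is rather than dropped, so an incomplete label mapping does not lose data.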
New Options for Coders
Coders enjoy the flexibility to use their preferred version of Python in code environments, with official support for Python 3.8, 3.9, and 3.10 as well as pandas 1.1, 1.2, and 1.3. Additional integrations with MLflow further support data scientists who develop models elsewhere and package them with MLflow, so they can still benefit from the deployment and model management capabilities of the Dataiku platform.
For coders who wish to run complex transformation queries within Snowflake but prefer writing Python over SQL, native integration with Snowpark for Python enables them to easily access this Snowflake feature from within a code recipe or Jupyter notebook in Dataiku. Moreover, for those choosing to interact with projects programmatically, several new functions and methods in Dataiku’s APIs facilitate uploading folders, adding multiple items to a Flow zone, or working with Dataiku remotely from an external IDE.
AutoML Additions
Finally, this latest version delivers new AutoML capabilities for novice and advanced practitioners alike. Additional guardrails and warnings alert modelers to potentially unwanted or unexpected outcomes, such as extremely imbalanced predictions. Sentence embeddings, a powerful and modern way to transform raw text fields into semantically meaningful inputs suitable for machine learning, are now a native feature handling option. This means that analysts or data scientists who want to incorporate text features in their models can easily leverage the most cutting-edge language and domain-specific pre-trained models from Hugging Face, without installing custom packages or coding the transformation by hand.
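As a rough illustration of the kind of guardrail described above, the following sketch flags a set of predictions in which a single class dominates. It is not Dataiku's actual check. The `imbalance_warning` function and the 95% threshold are assumptions made for the example.

```python
from collections import Counter


def imbalance_warning(predictions, max_share=0.95):
    """Return a warning string if one class exceeds max_share of predictions, else None."""
    if not predictions:
        return None
    # Find the most frequent predicted class and its share of all predictions.
    label, count = Counter(predictions).most_common(1)[0]
    share = count / len(predictions)
    if share > max_share:
        return f"Class {label!r} accounts for {share:.0%} of predictions"
    return None


# Hypothetical predictions from a binary classifier.
preds = ["no"] * 98 + ["yes"] * 2
warning = imbalance_warning(preds)
```

A warning like this does not mean the model is wrong, since the underlying data may genuinely be skewed, but it prompts the modeler to check before deploying.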
Go Further on Dataiku 10
With Dataiku 10, organizations can involve users across a diverse set of roles to create, deliver, and manage more high-value projects at scale. Ready to dive into the details of what’s new in Dataiku 10.0.4? Check out the full list of resources in the release notes.
If you’d like to go even further on Dataiku 10, click the button below to see key features in action such as the model evaluation store, model comparisons, workspaces for business users, Govern, geospatial analytics, and more!
What’s the most useful feature to you in this product update? Let us know in the comments, and feel free to suggest new ones in the Product Ideas board!