Originally posted by Dataiku on May 11, 2021
Automated machine learning (AutoML) shows great promise in providing more efficient, explainable, and reproducible AI solutions. Organizations might wonder, however: are all AutoML tools created equal? In other words, do the AutoML capabilities offered in AI platforms and other technologies all do the same thing? The short answer is no, not all AutoML is created equal.
AutoML automates the process of applying machine learning (ML), including the time-consuming, iterative tasks of ML model development. This automation makes it possible to build and analyze more ML models more quickly and efficiently. Among other things, AutoML broadens access to AI and ML and helps the business get reliable results faster.
Different Degrees of Automation
It’s important to note that automation can be present in varying degrees. The vision for the future of AutoML is one of complete (or nearly complete) automation, but this is still just a vision. The reality of most AutoML tools and systems today is that they are not completely automatic — yet.
Organizations can evaluate systems that offer AutoML capabilities based on how their expectations and requirements match the level of automation present in the tool. These levels can be separated as follows:
Automation of the Data Science Pipeline
The development of AutoML has spurred the application of automation to the whole data-to-insights pipeline: from cleaning the data, through feature selection and feature creation, to tuning algorithms and even operationalization. Some of the steps of the data science pipeline that can be automated through AutoML to increase the speed of the process include:
How It Works in Dataiku
Dataiku, the world’s leading AI and machine learning platform, supports agility in organizations’ data efforts via collaborative, elastic, and responsible AI, all at enterprise scale. It contains a powerful AutoML engine that allows you to get highly optimized models with minimal intervention.
In Dataiku, you can select between:
Dataiku's AutoML engine will analyze your dataset and, depending on your preferences, select the best feature handling, algorithms, and hyperparameters. Note, however, that in AutoML mode you can still define the types of algorithms Dataiku will train. This lets you choose between fast prototypes, interpretable models, or high-performing models with less interpretability.
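To make the idea of an engine that picks algorithms and hyperparameters more concrete, here is a minimal sketch in plain Python. It is not Dataiku's engine or API: the toy dataset, the two candidate algorithms, the `allowed` parameter, and every function name are assumptions made for illustration. It simply tries each permitted algorithm and hyperparameter combination and keeps the one that scores best on a holdout split.

```python
import random
from collections import Counter

def make_toy_data(n=200, seed=0):
    """Two noisy 2-D clusters labelled 0 and 1 (synthetic, illustration only)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 2.0
        rows.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return rows

def majority_model(train):
    """Baseline candidate: always predict the most common training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: most_common

def knn_model(train, k):
    """k-nearest-neighbours candidate using squared Euclidean distance."""
    def predict(x):
        nearest = sorted(
            train,
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
        )[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict

def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def auto_search(rows, allowed=("baseline", "knn"), ks=(1, 5, 15)):
    """Score every allowed algorithm/hyperparameter pair on a holdout split.

    `allowed` mimics the user restricting which algorithm types get trained.
    Returns the best (score, name, params) tuple and the full leaderboard.
    """
    split = int(0.7 * len(rows))
    train, valid = rows[:split], rows[split:]
    candidates = []
    if "baseline" in allowed:
        candidates.append(("baseline", {}, majority_model(train)))
    if "knn" in allowed:
        for k in ks:
            candidates.append(("knn", {"k": k}, knn_model(train, k)))
    if not candidates:
        raise ValueError("no algorithms allowed")
    scored = [(accuracy(model, valid), name, params)
              for name, params, model in candidates]
    return max(scored, key=lambda item: item[0]), scored

best, leaderboard = auto_search(make_toy_data())
print("best candidate:", best)
```

Passing `allowed=("baseline",)` restricts the search to one algorithm family, which loosely mirrors choosing which types of models get trained. A real engine would also search over feature handling and use more robust validation than a single holdout split.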
Dataiku also offers features that go beyond AutoML and toward the automation of the entire data-to-insights pipeline. You can automate actions and workflows in Dataiku and take advantage of its scheduling capabilities. Preparing data, for example, involves repetitive tasks such as flagging invalid rows, parsing dates to standard formats, and converting currencies. With Dataiku, scenarios automate such repetitive processes by running them on a periodic schedule or when triggers fire based on conditions.
A scenario has two required components: triggers, which determine when the scenario runs, and steps, which are the actions it performs.
There are many predefined triggers and steps, making the process of automating Flow updates flexible and easy to do. For greater customization, you can create your own Python triggers and steps.
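The trigger-and-step structure can be sketched in a few lines of plain Python. This is an illustration only and does not use Dataiku's scenario API: the `Scenario` class, the `has_new_rows` trigger, the two step functions, the `date` and `amount` column names, and the accepted date formats are all hypothetical. The steps reuse the data preparation tasks mentioned earlier, parsing dates to a standard format and flagging invalid rows.

```python
from datetime import datetime

# Assumed input formats for this illustration.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")

def parse_date(text):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

def standardize_dates(rows):
    """Step: add a 'date_iso' column holding the parsed date (or None)."""
    for row in rows:
        row["date_iso"] = parse_date(row["date"])
    return rows

def flag_invalid(rows):
    """Step: flag rows with an unparseable date or a negative amount."""
    for row in rows:
        row["invalid"] = row["date_iso"] is None or row["amount"] < 0
    return rows

def has_new_rows(rows):
    """Trigger: fire only when there is something to process."""
    return len(rows) > 0

class Scenario:
    """Hypothetical scenario: runs its steps in order when any trigger fires."""

    def __init__(self, triggers, steps):
        self.triggers = triggers
        self.steps = steps

    def run_if_triggered(self, rows):
        if not any(trigger(rows) for trigger in self.triggers):
            return None  # no trigger fired, so nothing runs
        for step in self.steps:
            rows = step(rows)
        return rows

rows = [
    {"date": "2021-05-11", "amount": 10.0},
    {"date": "11/05/2021", "amount": -3.0},
    {"date": "not a date", "amount": 5.0},
]
scenario = Scenario(triggers=[has_new_rows],
                    steps=[standardize_dates, flag_invalid])
result = scenario.run_if_triggered(rows)
for row in result:
    print(row)
```

In this sketch the trigger is checked on demand. A scheduler that evaluates triggers periodically would call `run_if_triggered` on a timer instead.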
With automation in place and a strong team of data scientists and citizen data scientists, organizations can manage more projects and scale AI across the enterprise.
Experience an End-to-End Dataiku AutoML Demo
See how you can go from raw data to machine learning models in production using Dataiku's visual AutoML features.