ZS is a management consulting and technology firm focused on transforming global healthcare and beyond. We leverage our leading-edge analytics, plus the power of data, science, and products, to help our clients make more intelligent decisions, deliver innovative solutions and improve outcomes for all. Founded in 1983, ZS has more than 12,000 employees in 35 offices worldwide. To learn more, visit https://www.zs.com/.
ZS partnered with a life sciences and pharmaceutical organization to help orchestrate 100+ data science models projected to bring in annual incremental growth of USD$200M+. They had a clear roadmap, the correct set of people, and the right technology in place to develop state-of-art data science models but lacked the expertise and thought leadership to effectively operationalize and maintain these models in production.
There were instances where data scientists reported challenges with the computation power of the existing platform. Loading python libraries took two to three weeks, making it extremely difficult for the Data Scientists to scale the POC ML (Machine Learning) models. Typically, it took over three months to successfully deploy the model.
This sparked discussions on how to make things better across teams by bringing in greater efficiency and industry best practices. A parallel capability was needed to maximize the benefits of investments in AI.
MLOps (Scaled AI) came into the picture when Dataiku was chosen as the central platform for all model deployments. ZS Scaled AI, ESCoE, and ADS teams worked in conjecture with the Dataiku team (including field engineers) to verify infrastructure architecture and understand the capability roadmap of the platform.
Today, over 10 models have been deployed on the Dataiku platform with added MLOps utilities like model health monitoring, data drift analysis, and many more.
ZS helped the client compare multiple platforms before Dataiku was chosen to be the platform for MLOps. The decision was taken based on the following points:
The team later added a couple of additional steps to make Dataiku more powerful for cases such as:
More than 50 people across multiple teams and various domains collaborated to build a platform estimated to be used by over 1,000 people across the organization once a 100% solution is deployed. Below is a detailed infrastructure diagram to show how the team leveraged Dataiku to its fullest and how it is core to MLOps' capability.
Business Area: Analytics
Use Case Stage: In Production
Direct financial impact:
The lean structure of the platform makes it easy to manage post 100% deployment. By 2023 we estimate a lean team of 3FTEs managing over 100 models, which today is managed by a team of 25+FTEs across the organization. The estimated efficiency gain of ~90% (Indirectly a significant impact – given we might end up replacing the current admin work).
Below are a few details on how we achieved such process efficiency:
1. Platform Automation: Earlier, the model was scored manually by running individual script(s) monotonous and repetitively. ZS saw this as an opportunity and stitched the entire Dataiku recipe flow into a one-click-based execution; it orchestrates the entire process at the click of a single button. We even introduced automated schedulers that trigger the model training using the latest data, validate model accuracy, execute the inferencing, and provide the notification to the stakeholders. Thus, providing the team with end-to-end visibility into the model lifecycle.
2. Model Runtime improvement: Earlier manual execution models involved multiple handoffs and inefficiencies because they were shared as ZIP files for deployment. By enabling One Click Execution of model prediction and One Click Deployment leveraging Azure DevOps, our team reduced model runtime from 48 hours to as little as six hours, thereby drastically reducing the model go-to-market time from six months to six weeks.
Before Dataiku was chosen as the platform for development, data scientists used multiple tools and technology stacks for ad hoc development, including Jupyter Notebook, RStudio, LiveRamp, and more. This distributed technology stack limited the scope of collaboration and model reusability across the teams.
Our team brought in Dataiku as a centralized platform for development and was able to portray its value from the very beginning. With the ease of development using a graphical UI-based implementation, data scientists could easily adapt and migrate to the platform. As a result of this, the utilization/productivity of the data science team went up by almost 40%.
Dataiku also provides some out-of-the-box capabilities which added value at various stages:
Model and Data monitoring dashboards:
Value Range: Dozens of millions of $