Cox Automotive - Fast and Scalable Modeling to Better Understand Customer Behavior

Name: Bill Sung

Title: Senior Data Science Manager

Country: United States

Organization: Cox Automotive

Cox Automotive makes buying, selling, owning and using cars easier for everyone. With our technology, market intelligence, and products and services, Cox Automotive simplifies the trusted exchange and mobility of vehicles and maximizes value for dealers, manufacturers and car shoppers. The global company’s team members and family of brands, including Autotrader®, Clutch Technologies®, Dealertrack®, Kelley Blue Book®, Manheim®, NextGear Capital®, VinSolutions®, vAuto® and Xtime®, are passionate about helping millions of car shoppers, 45,000 auto dealer clients across five continents and many others throughout the automotive industry thrive for generations to come.


Awards Categories:

  • Best Acceleration Use Case
  • Best Moonshot Use Case
  • Best MLOps Use Case
  • Best Approach for Building Trust in AI


Business Challenge:

Cox Automotive is the leading company in digital retail platforms for buying and selling cars. We operate Kelley Blue Book and many dealer websites under the Cox Automotive domain.

Because many subsidiaries and partners use our platform, we have gathered many different types of data, ranging from web search criteria to demographic information on our consumers.

Wide data coverage helps data science understand potential purchasing behavior, but it also caused issues in the following areas:

  • Some data suffered from sparsity
  • Manipulating data from different sources (Netezza, Snowflake, MSSQL, PostgreSQL, AWS S3, etc.) was not easy
  • Even once the data had been curated and prepared, we were still left with machine learning model selection

Few platforms offer a flexible way to compare different models across versions. Those that do provide such a framework often struggle with custom models that a team has built using third-party packages.

Finally, automated model training and evaluation were key to success. We have observed that consumer behavior has shifted rapidly in response to the pandemic, political events, and environmental disasters. We therefore need to continuously monitor model performance for any drift in the market.


Business Solution:

The challenges we encountered in building models to predict consumer behavior can be summarized as follows:

  • Data from multiple sources requires one standard, scalable way to connect them
  • A framework that accepts custom models for model performance comparison
  • Automated training and monitoring for any drift in the market

Here is how Dataiku helped us with the challenges listed above:

1. Data challenges

Dataiku offers a one-stop shop for creating datasets from different data sources, ranging from cloud storage to various SQL platforms. It also offers push-down execution, which runs data manipulation inside the database itself while exposing it through much simpler visual recipes. The snapshot below shows an example of manipulating multiple tables from Snowflake.
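To make the idea of push-down execution concrete, here is a minimal sketch in plain Python. It does not use Dataiku's API; an in-memory SQLite database stands in for Snowflake, and the table and column names (`demographics`, `web_searches`, `consumer_id`, and so on) are hypothetical. The point is only that the join and aggregation run inside the database, so the client receives the already-summarized rows.

```python
import sqlite3


def make_demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical source tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE demographics (consumer_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE web_searches (
            search_id INTEGER PRIMARY KEY,
            consumer_id INTEGER,
            body_style TEXT
        );
        INSERT INTO demographics VALUES (1, 'South'), (2, 'West'), (3, 'Midwest');
        INSERT INTO web_searches VALUES
            (10, 1, 'SUV'), (11, 1, 'Truck'), (12, 2, 'Sedan');
        """
    )
    return conn


def build_consumer_view(conn: sqlite3.Connection) -> list:
    """Join and aggregate in the database; only summarized rows come back."""
    query = """
        SELECT d.consumer_id, d.region, COUNT(s.search_id) AS n_searches
        FROM demographics AS d
        LEFT JOIN web_searches AS s ON s.consumer_id = d.consumer_id
        GROUP BY d.consumer_id, d.region
        ORDER BY d.consumer_id
    """
    return conn.execute(query).fetchall()


consumer_view = build_consumer_view(make_demo_connection())
for consumer_id, region, n_searches in consumer_view:
    print(consumer_id, region, n_searches)
```

The `LEFT JOIN` keeps consumers who have no recorded searches, which matters when some sources are sparse.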

2. Modeling challenges

Dataiku offers a model comparison tool to assess model performance very early in the development phase. Running the model comparison automatically in the Flow allowed us to check whether new input data added any insight to our predictions. The snapshot below shows a rapid model comparison during the data exploration phase, used to see whether the data has enough predictive power to justify further iterations.
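The kind of comparison described here can be sketched in a few lines of plain Python. This is not Dataiku's model comparison tool; it is an illustrative harness that scores any custom model, passed in as a callable, on the same holdout set. The feature names (`site_visits`, `saved_vehicles`), the toy rules, and the data are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, float]
Model = Callable[[Row], int]  # any custom model wrapped as row -> 0/1 prediction


def accuracy(model: Model, rows: Sequence[Row], labels: Sequence[int]) -> float:
    """Share of holdout rows the model classifies correctly."""
    if not labels or len(rows) != len(labels):
        raise ValueError("rows and labels must be non-empty and the same length")
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(labels)


def compare_models(
    models: Dict[str, Model], rows: Sequence[Row], labels: Sequence[int]
) -> List[Tuple[str, float]]:
    """Score every candidate on the same holdout and rank best first."""
    scores = [(name, accuracy(model, rows, labels)) for name, model in models.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical holdout set: 1 means the consumer went on to purchase.
holdout_rows = [
    {"site_visits": 1, "saved_vehicles": 0},
    {"site_visits": 7, "saved_vehicles": 2},
    {"site_visits": 3, "saved_vehicles": 0},
    {"site_visits": 9, "saved_vehicles": 1},
    {"site_visits": 2, "saved_vehicles": 1},
    {"site_visits": 6, "saved_vehicles": 0},
]
holdout_labels = [0, 1, 0, 1, 1, 0]

candidates: Dict[str, Model] = {
    "baseline_always_no": lambda row: 0,
    "visits_only": lambda row: int(row["site_visits"] >= 5),
    # Same rule plus one new input, to see whether it adds any insight.
    "visits_and_saved": lambda row: int(
        row["site_visits"] >= 5 or row["saved_vehicles"] >= 1
    ),
}

ranking = compare_models(candidates, holdout_rows, holdout_labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because each candidate is just a callable, a model built with a third-party package can be compared alongside the others by wrapping its predict step in a function.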

3. Automated scheduler

Once we finalized the model, we needed good monitoring and reporting tools. Dataiku offers Scenarios to automate the Recipes in the Flow and Reporters to monitor the output. The snapshot below is a screenshot of a Slack reporter from a Scenario that reports any drift in the variables.
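As an illustration of what a drift check on a single variable might compute, the sketch below uses the Population Stability Index (PSI), a common drift metric. This is an assumption for illustration only: the source does not say which metric was used, and the code does not call Dataiku or Slack. The bin edges, the 0.2 alert threshold (a widely used rule of thumb), the feature name, and the data are hypothetical.

```python
import math
from typing import List, Sequence


def bin_shares(
    values: Sequence[float], edges: Sequence[float], eps: float = 1e-6
) -> List[float]:
    """Share of values in each bin defined by sorted edges, floored at eps."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(edges) + 1)
    for value in values:
        # Bin index = number of edges the value has reached or passed.
        counts[sum(1 for edge in edges if value >= edge)] += 1
    return [max(count / len(values), eps) for count in counts]


def psi(
    reference: Sequence[float], current: Sequence[float], edges: Sequence[float]
) -> float:
    """Population Stability Index between a reference and a current sample."""
    ref_shares = bin_shares(reference, edges)
    cur_shares = bin_shares(current, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


def drift_message(feature: str, score: float, threshold: float = 0.2) -> str:
    """Format a short status line that a reporter could post to a channel."""
    status = "DRIFT" if score >= threshold else "stable"
    return f"[{status}] {feature}: PSI={score:.3f} (threshold {threshold})"


# Hypothetical vehicle prices (in thousands) at training time vs. today.
reference_prices = [18, 22, 25, 27, 31, 35, 38, 42]
current_prices = [30, 34, 37, 41, 44, 48, 52, 55]
price_edges = [25, 35]

score = psi(reference_prices, current_prices, price_edges)
print(drift_message("searched_price", score))
```

Running a check like this on a schedule, once per monitored variable, yields the kind of per-variable drift report that a scheduled job can forward to a chat channel.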


Day-to-day Change:

Fast, scalable modeling is key to success in the ever-changing digital retail space. Dataiku provides a streamlined path from the brainstorming phase to the production phase. It has helped us initiate more projects with new innovations.

Business Area Enhanced: Analytics

Use Case Stage: In Production


Value Generated:

As mentioned before, understanding consumer behavior is key to success in digital retail, even in the automotive industry. Our predictive and prescriptive models, built and deployed with Dataiku, help internal businesses in the following ways:

  • Marketing team can personalize the advertisement based on our insights,
  • Customer Relationship team can provide better communication to consumers throughout their vehicle purchasing journey,
  • Websites (including dealer websites) can provide recommendations to improve the purchasing experience.


Value Brought by Dataiku:

Because we have gathered data on consumers in many different ways, Dataiku offers a scalable way to combine that data into a holistic view of each consumer. In addition, faster iteration with the Flow helps our data science team explore different algorithms and predictive features to better understand consumer behavior.

Value Type:

  • Improve customer/employee satisfaction
  • Increase revenue
  • Reduce cost
  • Increase trust

Value Range: Hundreds of thousands of $

Version history
Publication date:
07-07-2025 11:59 AM
Last update:
07-31-2023 01:59 PM