Aviva – Empowering Decision Making with ML: Sentiment Analysis and Topic Categorization for Customer

chandom Registered, Frontrunner 2022 Winner, Frontrunner 2022 Participant Posts: 2 ✭✭✭✭✭

Team members:

Mitesh Chandorkar, Data Science Architect, with:
Aviva Team - Simon Sinfield
Wipro Team - Richardson Jebasundar, Narinder Saini, Nitesh Panda, Sohel Kadir

Country: United Kingdom

Organization: Aviva

Aviva is a British multinational insurance company headquartered in London, England. It has customers across its core markets of the United Kingdom, Ireland, and Canada. In the United Kingdom, Aviva is the largest general insurer and a leading life and pensions provider. Aviva is also the second largest general insurer in Canada.

Awards Categories:

  • Best Partner Acceleration Use Case
  • Best Acceleration Use Case
  • Best Moonshot Use Case
  • Best MLOps Use Case

Business Challenge:

Customer reviews and feedback are a significant part of customers' online journeys in large organizations with digital capabilities. With growing competition and rising customer expectations, businesses face a major challenge in effectively harnessing and analyzing the vast amounts of feedback data generated across various touchpoints.

As a leading insurance provider in the UK with millions of customers, Aviva always strives for a "Customer First" approach. It was imperative that Aviva understand customer pain points to make informed decisions and enhance the online customer experience. Traditional methods of manual analysis are usually time-consuming, error-prone, and hard to scale. This can lead to missed opportunities to understand feedback themes, identify emerging trends, and address the pain points that affect customer satisfaction.

The key challenges with manual processing were:

  • Unstructured data: One of the primary challenges for the business was the unstructured nature of customer feedback data. Feedback was received from various sources, and extracting meaningful information manually was difficult. As a result, valuable insights were lost.
  • Human errors and time consumption: Manual analysis of customer feedback data was prone to human errors and inconsistencies. The same customer verbatims were interpreted differently by different analysts, which affected the reliability of the analysis. Moreover, analyzing large volumes of data manually was a time-consuming process.
  • Lack of focus area: The sheer volume of customer feedback data made it hard to identify and prioritize key focus areas. The business struggled to allocate resources effectively, as it lacked a clear understanding of the most critical themes in customer feedback.
  • Missing categories: While analysts tried to define high-level categories based on their prior experience, drilling down to granular themes was challenging. Labeling each instance manually would have taken considerable time and effort.

Wipro's “AI & Automation” leverages its industry and business-focused solutions to solve the above problems using Machine Learning. The ML solution was built to:

  • Perform sentiment analysis and topic categorization without the need for extensive manual annotation.
  • Automate the analysis of unstructured data.
  • Extract valuable insights and uncover hidden patterns from customer feedback data by leveraging unsupervised learning techniques.
  • Enable significant reduction in human errors, ensure consistency in analysis, and free up valuable time for strategic decision-making.
  • Categorize and highlight the most relevant topics facilitating a focused approach to address customer concerns.

Business Solution:

The Machine Learning solution was built using the Dataiku platform. Aviva has been using the Dataiku platform for many years to build a variety of use cases in the data and ML area. The key features of this solution include:

  • Seamless ingestion of survey data from an Amazon S3 bucket.
  • Preprocessing of feedback data to consolidate multiple survey sources into a unified and coherent view.
  • Advanced Natural Language Processing capabilities to extract meaningful insights from feedback data.
  • Development and implementation of a sentiment analysis model.
  • Creation of a themes categorization model that categorizes themes or topics.
  • Integration of the output from the above models, resulting in powerful analytics and a well-rounded view of customer feedback.
  • Intuitive and visually appealing dashboard to provide actionable insights to stakeholders.

The journey to build the solution was challenging due to:

  • Absence of labelled data: The customer verbatims needed annotations. A traditional NLP approach, TF-IDF followed by K-Means clustering, gave an initial understanding of the data. Dataiku made this quicker, allowing more time for the refinement process. Several different models were built using traditional and advanced techniques such as word2vec, topic modelling, and BERT embeddings. The output of these models was then clustered further, analyzed, and used for annotation. The annotated data was used to build a model with BERT.
  • Misinterpreted ratings: Focusing on key fields such as customer feedback and score, we identified the need to address the misinterpreted ratings provided by a few customers. To tackle this problem, a sentiment model was developed.
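
The first step described above (TF-IDF followed by K-Means to get an initial grouping of unlabelled verbatims) can be sketched as follows. This is only an illustration using the Python standard library, not the team's Dataiku flow: the feedback sentences are made up, the helper names (tfidf_vectors, kmeans, top_terms) are hypothetical, and the deterministic farthest-point seeding is a simplification chosen so the toy example is reproducible.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for whitespace-tokenised docs."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenised for tok in doc})
    n_docs = len(tokenised)
    doc_freq = Counter(tok for doc in tokenised for tok in set(doc))
    # Smoothed inverse document frequency, so no term gets a zero weight.
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in tokenised:
        term_freq = Counter(doc)
        vec = [term_freq[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(w * w for w in vec)) or 1.0
        vectors.append([w / norm for w in vec])
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=10):
    """Lloyd's algorithm with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def top_terms(vocab, vectors, labels, cluster, n=2):
    """Terms with the highest summed TF-IDF weight inside one cluster."""
    totals = Counter()
    for vec, lab in zip(vectors, labels):
        if lab == cluster:
            for tok, weight in zip(vocab, vec):
                totals[tok] += weight
    return [tok for tok, _ in totals.most_common(n)]

# Hypothetical verbatims, for illustration only.
feedback = [
    "claim payment slow",
    "website login failing",
    "slow claim payment delayed",
    "login page failing website",
    "claim payment delayed again",
    "website login error page",
]
vocab, vectors = tfidf_vectors(feedback)
labels = kmeans(vectors, k=2)
for cluster in sorted(set(labels)):
    print(cluster, top_terms(vocab, vectors, labels, cluster))
```

The top terms per cluster are the kind of hint an analyst could use to name a candidate theme before annotating examples. On real data, a library implementation with proper tokenisation and stop-word handling would be the more sensible choice.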

The Dataiku platform played a crucial role in the NLP preparation, model training experiments, and deployment processes, using built-in recipes and custom Python code. Dataiku enabled us to perform:

  • Integration with various sources: Feedback data is ingested from Amazon S3 and model output data is stored in RDS.
  • Low code / No code data processing: Dataiku's visual recipes helped in building pipelines for data transformation quickly.
  • MLOps capability for experimentation: Multiple models were trained with minimal configuration. Seamless integration with custom models enabled us to embed state-of-the-art models as part of the solution.
  • Data exploration and visualization: Charts, dashboards, and statistics assisted in data exploration and in generating insightful trends from the output data.
  • Orchestration and post-production monitoring: Dataiku simplifies orchestration using job scheduling, scenarios, and monitoring and alerting of jobs in the production environment.

Below is a high-level process flow diagram illustrating how the solution works on the Dataiku platform.
Figure 1: High-level process flow diagram of the solution

Day-to-day Change:

The Machine Learning solution built using Dataiku had a significant impact on day-to-day business processes. It transformed the way customer feedback data was analyzed and utilized. Below are the key ways in which the solution impacted the business:

  • Streamlined analysis: The solution reduced the need for manual review and interpretation, saving time and resources.
  • Data-driven decision making: The business got deeper insights into the customer sentiments, concerns, and preferences. These insights enabled informed decision-making at various levels.
  • Efficient resource allocation: Depending on the trends, the business could identify key areas of focus to address the critical topics first. This improved resource allocation and productivity due to efficiency in the process.
  • Single version of truth: The solution provided a single source of truth, bringing different teams together and replacing siloed analyses and differing interpretations of the data by individual teams.
  • Improved customer satisfaction: By quickly identifying and resolving negative experiences, the business could enhance the overall customer experience and address key issues. Although the solution does not fix the issues itself, it helps identify them, thus indirectly contributing to improved customer satisfaction.

Below are a few examples of valuable insights and dashboards. Sensitive information is masked for privacy and security purposes.

The following dashboard presents a weekly overview of the top ten contributors to positive and negative themes.

Figure 2: Weekly overview of top contributors
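
As a rough illustration of how such a ranking could be derived from the combined model output, the sketch below counts themes per week and sentiment and keeps the largest contributors. The record layout (week, theme, sentiment), the sample rows, and the function name top_contributors are assumptions made for this example, not the actual dashboard logic.

```python
from collections import Counter, defaultdict

# Hypothetical model output: one (ISO week, theme, sentiment) row per feedback item.
records = [
    ("2023-W01", "claims", "negative"),
    ("2023-W01", "claims", "negative"),
    ("2023-W01", "login", "negative"),
    ("2023-W01", "pricing", "positive"),
    ("2023-W01", "support", "positive"),
    ("2023-W01", "support", "positive"),
    ("2023-W02", "login", "negative"),
]

def top_contributors(records, top_n=10):
    """Return {week: {sentiment: [(theme, share_percent), ...]}}, largest share first."""
    counts = defaultdict(Counter)  # (week, sentiment) -> theme counts
    for week, theme, sentiment in records:
        counts[(week, sentiment)][theme] += 1
    result = defaultdict(dict)
    for (week, sentiment), themes in counts.items():
        total = sum(themes.values())
        # Sort by descending count, then alphabetically so ties are stable.
        ranked = sorted(themes.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        result[week][sentiment] = [
            (theme, round(100 * count / total, 1)) for theme, count in ranked
        ]
    return {week: by_sentiment for week, by_sentiment in result.items()}

overview = top_contributors(records)
print(overview["2023-W01"]["negative"])
```

Each share is expressed relative to all feedback with the same sentiment in that week, which is one plausible reading of "contribution"; a different denominator (for example, all feedback in the week) would be an equally simple change.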

The trend below shows the weekly comparison of the positive contribution percentage for the years 2022 and 2023.

Figure 3: Weekly comparison of online experience scores across years

The heatmap below offers valuable insights into the weekly percentage change for each theme, enabling stakeholders to focus on the most significant fluctuations across different weeks.

Figure 4: Heatmap showing weekly changes in themes contribution
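
The values behind a heatmap like this can be sketched as a week-over-week difference in each theme's share of feedback. The example below is a minimal standard-library version under stated assumptions: the change is measured in percentage points of share (the post does not say whether the dashboard uses points or relative change), the rows are invented, and weekly_share_change is a hypothetical helper name.

```python
from collections import Counter, defaultdict

def weekly_share_change(records):
    """records: iterable of (week, theme) pairs.

    Returns {theme: {week: change in share versus the previous week}},
    in percentage points. Week labels must sort chronologically,
    e.g. zero-padded ISO weeks such as "2023-W02".
    """
    counts = defaultdict(Counter)
    for week, theme in records:
        counts[week][theme] += 1
    weeks = sorted(counts)
    themes = sorted({theme for c in counts.values() for theme in c})
    share = {
        week: {t: 100 * counts[week][t] / sum(counts[week].values()) for t in themes}
        for week in weeks
    }
    return {
        t: {
            cur: round(share[cur][t] - share[prev][t], 1)
            for prev, cur in zip(weeks, weeks[1:])
        }
        for t in themes
    }

# Hypothetical rows, for illustration only.
rows = [
    ("2023-W01", "claims"), ("2023-W01", "claims"),
    ("2023-W01", "login"), ("2023-W01", "login"),
    ("2023-W02", "claims"), ("2023-W02", "claims"),
    ("2023-W02", "claims"), ("2023-W02", "login"),
]
changes = weekly_share_change(rows)
print(changes)
```

Using percentage points avoids division by zero when a theme first appears in a week; the first week has no entry because there is no earlier week to compare against.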

Business Area Enhanced: Marketing/Sales/Customer Relationship Management

Use Case Stage: In Production

Value Generated:

The ML solution built using Dataiku generated both tangible and intangible value for the business.

Tangible benefits:

  • Cost savings: Through automation, the solution reduced the manual effort required to analyze customer feedback data. This eliminated the need for full-time employees dedicated to manually reviewing the feedback, resulting in cost savings of approximately £10k per month.
  • Time savings: Manual analysis of large volumes of data would typically take weeks or even months, depending on the size and complexity of the data. With the ML solution, this process was streamlined and accelerated, allowing for faster insights and decision-making. The time to generate weekly reports was reduced by at least 50%.

Intangible benefits:

  • Enhanced decision-making: The ML solution enabled more informed decision-making across various aspects of the business, including product functionalities, marketing strategies, impacts from external factors, and customer service improvements. The ability to make data-driven decisions resulted in improved outcomes and customer satisfaction.
  • Improved customer satisfaction: The business was able to proactively address customer concerns and pain points, which led to enhanced customer satisfaction. Satisfied customers are more likely to become repeat customers and advocates for the brand, driving long-term business growth.
  • Scalability and adaptability: The ML solution built using Dataiku allowed the business to scale its customer feedback analysis easily. As the volume of feedback data increased, the ML solution could manage the growth without significant additional resources. The initial implementation covered two surveys and was later extended seamlessly to twenty-seven surveys in a short time.

Value Brought by Dataiku:

Dataiku added value to this Machine Learning solution in several ways. Some of the key aspects are listed below:

  • Speed and agility: The key component of this solution was a method to annotate the data, which required multiple models to be trained. Dataiku was extremely useful in building these models in record time. The data preprocessing pipelines and the trend analytics pipeline were also built quickly, reducing the build time for the solution.
  • Enhanced tech stack efficiency: The built-in recipes, visual interfaces, statistical analyses, scenarios, and job orchestration offerings from Dataiku reduced the need for manual coding. This increased team efficiency and accelerated the development of the solution.
  • Flexibility: While the built-in models were a great starting point, Dataiku also makes it possible to build custom models and use transfer learning. This helped bring diverse types of models into the mix and compare the outcomes of different models at scale.
  • Seamless scalability: The Dataiku platform enabled seamless scalability, allowing the solution to manage an increased volume of data. It helped to enhance the solution, e.g., to include data from twenty-five more surveys in just a few weeks.
  • Analytics capability: The solution uses Dataiku’s Window recipe to build various trends over time for the identified themes. It enables us to quickly identify patterns, seasonality, and time-related insights in customer satisfaction data.
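
To give a feel for the kind of windowed trend mentioned in the last point, here is a plain-Python analogue of a trailing window aggregate. It is not Dataiku's Window recipe or its API; the three-week frame, the sample values, and the function name trailing_average are assumptions made for illustration.

```python
def trailing_average(values, window=3):
    """Mean of each value and up to window - 1 preceding values (a trailing frame)."""
    smoothed = []
    for i in range(len(values)):
        frame = values[max(0, i - window + 1): i + 1]
        smoothed.append(round(sum(frame) / len(frame), 2))
    return smoothed

# Hypothetical weekly share (%) of negative feedback for one theme.
weekly_negative_share = [20.0, 26.0, 23.0, 35.0, 32.0]
smoothed = trailing_average(weekly_negative_share)
print(smoothed)
```

Smoothing over a short trailing frame makes a sustained shift easier to separate from a single noisy week; the right frame length depends on survey volume.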

Overall, Dataiku brought value to the solution through greater team efficiency, flexibility, no-code and low-code development with built-in recipes and models, quick scaling, and insightful analytics capabilities.

We are excited to leverage Dataiku’s enterprise-grade development tool for Generative AI to further enhance this solution.

Value Type:

  • Improve customer/employee satisfaction
  • Reduce cost
  • Save time
  • Increase trust

Value Range:

Hundreds of thousands of $

