
Schlumberger - Using Dataiku to Democratize AI Within the Organization

Team members:
Valerian Guillot (Nerve Center Data Science Architect), with:
Sampath Reddy
Jean-Marc Pietrzyk
Jimmy Klinger
Eimund Liland

United Kingdom


Schlumberger is a technology company that partners with customers to access energy. Our people, representing over 160 nationalities, are providing leading digital solutions and deploying innovative technologies to enable performance and sustainability for the global energy industry.

Awards Categories:

  • Organizational Transformation
  • AI Democratization & Inclusivity


Democratizing AI within Schlumberger

Schlumberger is investing significantly in research and development to improve our products and services for customers, and has been embarking on its own digital transformation while supporting our customers through theirs.

The main challenges Schlumberger has been facing are:

  • Re-skilling cohorts of petrotechnical experts in digital skills
  • Ensuring prior work on data-driven topics is discoverable and reproducible
  • Ensuring that access to data is democratized, with a focus on:
    • Data discoverability
    • Data ease of consumption
    • Rights of use controls
  • Ensuring that solutions designed & prototyped have a clear delivery path to yield business impact


Schlumberger needed a single data science platform to access Schlumberger domain data through no-code & low-code interfaces, where prior work is easily discoverable, and whose technology is close to the systems where the insights & models will be deployed.

Leveraging Dataiku, we have put in place a mechanism where Schlumberger data scientists and technical experts can:

  1. Leverage the Dataiku Data Catalogue to access curated domain views of Schlumberger business systems data
  2. Leverage Dataiku’s code samples & custom plugins capabilities to access the high-frequency historical environmental exposure of Schlumberger drilling equipment
  3. Leverage Dataiku Visual & Code recipes to build insights & models to improve well construction performance and reliability
  4. Leverage Dataiku automation and API node capabilities, and its close integration with BI solutions, to easily make models and insights available to wider populations in the field.

To support the internal adoption of Dataiku within Schlumberger, we’ve developed and delivered a number of custom data science classes, focusing on use cases relevant to Schlumberger’s population of technical experts. As usage has scaled out, we’ve leveraged Microsoft’s Yammer to build a technical community whose members help each other within Schlumberger.



Dataiku, and its close integration with the DELFI E&P cognitive environment, has been a key driver in democratizing the use of data science within Schlumberger.

We are measuring the effectiveness of democratization through:

  • The number of active users, where active users are users making technical contributions (e.g. code change, flow change…)
  • The job code of the active users
  • The usage of the data access helpers
  • The number of projects going into production


The graph above shows the number of users per week making contributions to data science projects, which has grown 8-fold since early 2019.
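As a rough illustration of how a weekly active-user count of this kind could be derived, the sketch below counts distinct contributing users per ISO week. The contribution log, its layout, and the function name are assumptions made for the example; they do not describe how Schlumberger actually collects this metric.

```python
from collections import defaultdict
from datetime import date

# Hypothetical contribution log: (user, date of a code or flow change).
contributions = [
    ("alice", date(2021, 1, 4)),
    ("bob", date(2021, 1, 5)),
    ("alice", date(2021, 1, 6)),
    ("carol", date(2021, 1, 12)),
]


def weekly_active_users(events):
    """Count distinct contributing users per ISO (year, week)."""
    users_by_week = defaultdict(set)
    for user, day in events:
        iso = day.isocalendar()
        users_by_week[(iso[0], iso[1])].add(user)
    return {week: len(users) for week, users in sorted(users_by_week.items())}


weekly_counts = weekly_active_users(contributions)
```

Counting distinct users rather than raw events keeps a single prolific contributor from inflating the weekly figure.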

The growth has been worldwide:


The data democratization has been successful in onboarding our existing population of data scientists as well as technical experts, including maintenance technicians, service quality engineers, well engineers, and other profiles, who are now able to speak a common language and make data-driven decisions such as:

  1. Choosing the types of batteries to include in a downhole tool, by looking at the historical environmental exposure of the tool
  2. Choosing the drilling bottom hole assembly to maximize operational reliability using BI solutions resulting from data flows in Dataiku
  3. Choosing when to replace equipment to reduce the risk of downhole failure using PHM models trained in Dataiku
  4. Optimizing the choice of drilling parameters to maximize performance and minimize energy consumption.

The distribution of contributions to data science projects in 2020 shows that ~40% were made by users who are not data scientists. Early 2021 data shows further growth in non-data science contributions:


Dataiku has also enabled collaborative work between data scientists and domain experts: 35% of the data science projects in Dataiku are collaborative projects (defined as projects where the commits are spread amongst multiple users):
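One possible way to turn that definition into a number is sketched below. The thresholds (`min_share`, `min_users`), the function names, and the sample commit logs are all illustrative assumptions; the exact rule used to classify a project as collaborative is not stated here.

```python
from collections import Counter


def is_collaborative(commit_authors, min_share=0.1, min_users=2):
    """True when at least `min_users` authors each hold `min_share` of the commits.

    Both thresholds are illustrative assumptions, not a documented definition.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return False
    significant = [user for user, n in counts.items() if n / total >= min_share]
    return len(significant) >= min_users


def collaborative_fraction(projects):
    """Fraction of projects flagged as collaborative."""
    if not projects:
        return 0.0
    return sum(is_collaborative(a) for a in projects.values()) / len(projects)


# Hypothetical commit logs: project name -> author of each commit.
projects = {
    "battery_selection": ["ana", "ana", "ben", "ben", "ana"],
    "bha_reliability": ["cho"] * 6,
    "phm_model": ["dee"] * 19 + ["eli"],
    "drilling_params": ["fay", "gus", "fay", "gus"],
}
fraction = collaborative_fraction(projects)
```

Requiring a minimum share per author means a project with one dominant contributor and a single stray commit from someone else is not counted as collaborative.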


The growth in usage, and the diversity in the job codes of users, has proven the transformative value of Dataiku as a collaborative data science platform for Schlumberger. This growth has been supported along three axes:

  1. Domain views and helpers to access data
  2. Custom training (instructor-led, virtual, and self training)
  3. Community-based technical support


Community engagement:

As the user base grew, we put in place a bulletin board where Dataiku practitioners can ask any technical question on Dataiku or data science, in order to collectively learn from each other:


Snapshot of the Dataiku bulletin board on Yammer

Community engagement, measured here as the number of messages read on a technical Yammer chat over the last 365 days, has tripled over the last year:



Easing data access

Accessing time series data from Schlumberger’s operations had historically been a challenge, especially for predictive maintenance purposes, which required being able to trace back the entire history of each piece of equipment.

Leveraging Dataiku plugins and global shared code, domain-specific helpers have been implemented to retrieve the entire historical exposure of each piece of downhole drilling equipment, using only a single line of code.
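To give a feel for what such a helper offers its users, here is a minimal, self-contained sketch. Everything in it is hypothetical: the function name `get_equipment_history`, the `Sample` fields, and the in-memory `_RUNS` store are stand-ins chosen for the example and do not reflect the actual Schlumberger helpers or data sources.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Sample:
    timestamp: datetime
    temperature_c: float  # assumed exposure channel, for illustration only
    shock_g: float        # assumed exposure channel, for illustration only


# Stand-in for the real time series stores, keyed by (serial number, run id).
_RUNS = {
    ("TOOL-001", "run-b"): [Sample(datetime(2020, 6, 2, 8), 131.0, 12.5)],
    ("TOOL-001", "run-a"): [
        Sample(datetime(2020, 3, 1, 10), 118.0, 4.0),
        Sample(datetime(2020, 3, 1, 11), 124.5, 9.0),
    ],
    ("TOOL-002", "run-a"): [Sample(datetime(2020, 3, 1, 10), 95.0, 2.0)],
}


def get_equipment_history(serial_number):
    """Return every sample recorded for one tool, across all runs, oldest first."""
    samples = [
        sample
        for (serial, _run_id), run in _RUNS.items()
        if serial == serial_number
        for sample in run
    ]
    return sorted(samples, key=lambda s: s.timestamp)


# The single line an end user would write:
history = get_equipment_history("TOOL-001")
max_temperature = max(s.temperature_c for s in history)
```

The point of the design is that run boundaries and storage details stay hidden inside the helper, so a domain expert only needs the equipment identifier.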


The helpers are used up to 12,000 times per day, analyzing approximately 6 TB of data per day.



Dataiku is simple enough that a user can get started on their own. The complexity lies in learning how to leverage Dataiku to access Schlumberger data, and time series data.

In order to effectively train our users, the focus had to be on the ways to leverage Dataiku to access Schlumberger data, based on use cases relevant to the technical experts.

To that end, custom training manuals were developed, covering:

  1. Predicting the chances of success of a drilling run, and identifying which controllable parameters would improve the chances of success
  2. Accessing the historical time series to identify the operating environments of the equipment
  3. Accessing drilling time series data to identify similarities and differences between drilling operations
  4. Accessing historical time series & tool failure information to build failure predictive models.



Cumulative views of the manuals exceed 5,000. The yield of instructor-led & virtual classes is on average 42% (the fraction of onboarded users still using Dataiku 6 months after the training), where virtual classes had a 50% yield, and instructor-led classes 30%.
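If the 42% figure is a headcount-weighted average of the two class types (an assumption; the weighting is not stated), the three yields together imply the split of trainees between formats. The short calculation below solves for that share; the function name is illustrative.

```python
def implied_virtual_share(overall, virtual, instructor_led):
    """Solve overall = w * virtual + (1 - w) * instructor_led for w."""
    if virtual == instructor_led:
        raise ValueError("yields are equal; the share cannot be determined")
    return (overall - instructor_led) / (virtual - instructor_led)


share = implied_virtual_share(overall=0.42, virtual=0.50, instructor_led=0.30)
print(f"Implied share of trainees from virtual classes: {share:.0%}")
```

Under that assumption, roughly 60% of onboarded users would have come through virtual classes and 40% through instructor-led ones.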

Last update: 06-11-2021 08:34 PM