SLB - Using Dataiku to Democratize AI Within the Organization
Team members:
Valerian Guillot (Nerve Center Data Science Architect), with:
Sampath Reddy
Jean-Marc Pietrzyk
Jimmy Klinger
Eimund Liland
Country:
United Kingdom
Organization:
SLB
Description:
SLB is a technology company that partners with customers to access energy. Our people, representing over 160 nationalities, are providing leading digital solutions and deploying innovative technologies to enable performance and sustainability for the global energy industry.
Awards Categories:
- Organizational Transformation
- AI Democratization & Inclusivity
Challenge:
Democratizing AI within SLB
SLB is investing significantly in research and development to improve our products and services for customers, and has been embarking on a digital transformation internally while supporting our customers through their own transformations.
The main challenges SLB has been facing are:
- Re-skilling cohorts of petrotechnical experts in digital skills
- Ensuring prior work on data-driven topics is discoverable and reproducible
- Ensuring that access to data is democratized, with a focus on:
  - Data discoverability
  - Data ease of consumption
  - Rights of use controls
- Ensuring that solutions designed & prototyped have a clear delivery path to yield business impact
Solution:
SLB needed a single data science platform that provides access to SLB domain data through no-code and low-code interfaces, makes prior work easily discoverable, and sits close to the systems where insights and models will be deployed.
Leveraging Dataiku, we have put in place a mechanism where SLB data scientists and technical experts can:
- Leverage the Dataiku Data Catalogue to access curated domain views of SLB business systems data
- Leverage Dataiku’s code samples & custom plugin capabilities to access the high-frequency historical environmental exposure of SLB drilling equipment
- Leverage Dataiku Visual & Code recipes to build insights & models that improve well construction performance and reliability
- Leverage Dataiku automation and API node capabilities, and its close integration with BI solutions, to easily make models and insights available to wider populations in the field.
To support the internal adoption of Dataiku within SLB, we’ve developed and delivered a number of custom data science classes, focusing on use cases relevant to SLB’s population of technical experts. As usage has scaled out, we’ve leveraged Microsoft’s Yammer to build a technical community whose members help each other within SLB.
Impact:
- 8x increase in Dataiku usage in the last 18 months
- 6TB of data analyzed per day
- 40% of contributions by non data scientists
- 42% yield on training classes
- 4x increase in community help in 12 months
- 35% of data science projects are collaborative
- 720 days since the last day without data science commits
- Models & insights used in 70 countries
Dataiku, together with its close integration with the DELFI E&P cognitive environment, has been a key driver in democratizing the use of data science within SLB.
We are measuring the effectiveness of democratization through:
- The number of active users, where active users are users making technical contributions (e.g. code change, flow change…)
- The job code of the active users
- The usage of the data access helpers
- The number of projects going into production
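As an illustration of the first metric, the number of active users per week could be derived from a log of technical contributions along the following lines. This is a minimal sketch, not SLB's actual implementation: the event format, the function name `weekly_active_users`, and the sample data are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def weekly_active_users(events):
    """Count distinct contributing users per ISO week.

    `events` is an iterable of (user_id, day) pairs, one pair per technical
    contribution (for example a code change or a flow change).
    Returns a dict mapping (iso_year, iso_week) to the number of distinct users.
    """
    users_by_week = defaultdict(set)
    for user_id, day in events:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        users_by_week[(iso[0], iso[1])].add(user_id)
    return {week: len(users) for week, users in sorted(users_by_week.items())}


# Hypothetical contribution log, for illustration only.
events = [
    ("alice", date(2021, 3, 1)),
    ("alice", date(2021, 3, 2)),  # same user, same week: counted once
    ("bob", date(2021, 3, 4)),
    ("alice", date(2021, 3, 9)),  # following week
]

active = weekly_active_users(events)
print(active)
```

Counting distinct users per week, rather than raw contributions, keeps a single very active user from inflating the adoption figure.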
The graph above shows the growth in the number of users per week making contributions to data science projects, growing 8-fold since early 2019.
The growth has been worldwide:
The data democratization has been successful in onboarding our existing population of data scientists as well as technical experts, including maintenance technicians, service quality engineers, and well engineers, who are now able to speak a common language and make data-driven decisions such as:
- Choosing the types of batteries to include in a downhole tool, by looking at the historical environmental exposure of the tool
- Choosing the drilling bottom hole assembly to maximize operational reliability, using BI solutions fed by data flows in Dataiku
- Choosing when to replace equipment to reduce the risk of downhole failure, using PHM (prognostics and health management) models trained in Dataiku
- Optimizing the choice of drilling parameters to maximize performance and minimize energy consumption.
The distribution of contributions to data science projects (chart below, in 2020-2021) shows that ~70% were made by users who are not data scientists. Early 2021 data shows further growth in contributions from non-data scientists.
Dataiku has also enabled collaborative work between data scientists and domain experts: 35% of the data science projects in Dataiku are collaborative, meaning that their commits are spread amongst multiple users.
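The collaborative-project fraction can be sketched as follows. The source does not specify how "spread amongst multiple users" is thresholded, so this example assumes a project counts as collaborative when at least two users each account for a minimum share of its commits. The `min_share` value of 10%, the function names, and the sample projects are hypothetical.

```python
from collections import Counter


def is_collaborative(commit_authors, min_share=0.1):
    """Return True if at least two users each hold `min_share` of the commits.

    `commit_authors` is a list with one author name per commit.
    The threshold is an assumption made for this sketch.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return False
    significant = [user for user, n in counts.items() if n / total >= min_share]
    return len(significant) >= 2


def collaborative_fraction(projects, min_share=0.1):
    """Fraction of projects flagged as collaborative.

    `projects` maps a project name to its list of commit authors.
    """
    if not projects:
        return 0.0
    flagged = sum(is_collaborative(authors, min_share) for authors in projects.values())
    return flagged / len(projects)


# Hypothetical commit logs, for illustration only.
projects = {
    "battery_selection": ["ana"] * 6 + ["raj"] * 4,  # two significant authors
    "bha_reliability": ["ana"] * 20,                 # single author
    "phm_model": ["li"] * 19 + ["raj"] * 1,          # second author below 10%
    "drilling_params": ["li"] * 5 + ["ana"] * 5,     # two significant authors
}

fraction = collaborative_fraction(projects)
print(f"{fraction:.0%} of projects are collaborative")
```

Requiring a minimum share for the second contributor avoids counting a project as collaborative because of a single stray commit.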
The growth in usage, and the diversity in the job codes of users, has proven the transformative value of Dataiku as a collaborative data science platform for SLB.
Supporting the growth has been done on three axes:
- Domain views and helpers to access data
- Custom training (instructor-led, virtual, and self training)
- Community based technical support
Community engagement:
As the user base grew, we put in place a Bulletin Board where Dataiku practitioners can ask any technical question on Dataiku or data science, in order to learn collectively from each other:
Snapshot of the Dataiku bulletin board on Yammer.
Community engagement, measured here as the number of messages read on a technical Yammer chat over the last 365 days, has tripled over the last year:
Easing data access
Accessing time series data from SLB’s operations had historically been a challenge, especially for predictive maintenance purposes, which required being able to trace back the entire history of each piece of equipment.
Using Dataiku plugins and global shared code, we implemented domain-specific helpers that retrieve the entire historical exposure of each piece of downhole drilling equipment in a single line of code.
The helpers are used up to 12,000 times per day, with approximately 6TB per day of data being analysed.
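To give a sense of what a single-line helper call might look like from the user's side, here is a minimal, self-contained sketch. It is not the actual SLB helper: the function name `get_exposure_history`, the `ExposureSample` fields, and the in-memory sample store are hypothetical stand-ins for code that would, in practice, be published through Dataiku plugins or shared project libraries and query the real time series systems.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ExposureSample:
    """One environmental reading for a piece of equipment (hypothetical schema)."""
    equipment_id: str
    timestamp: datetime
    temperature_c: float
    shock_g: float


# Stand-in for the underlying time series store, for illustration only.
_SAMPLES = [
    ExposureSample("TOOL-001", datetime(2020, 5, 2, 8, 0), 142.0, 35.0),
    ExposureSample("TOOL-002", datetime(2020, 5, 2, 8, 0), 98.0, 12.0),
    ExposureSample("TOOL-001", datetime(2019, 11, 17, 14, 30), 120.5, 20.0),
]


def get_exposure_history(equipment_id, samples=_SAMPLES):
    """Return every exposure sample for one piece of equipment, oldest first."""
    matching = (s for s in samples if s.equipment_id == equipment_id)
    return sorted(matching, key=lambda s: s.timestamp)


# The user-facing call is a single line:
history = get_exposure_history("TOOL-001")

for sample in history:
    print(sample.timestamp.isoformat(), sample.temperature_c, sample.shock_g)
```

Hiding storage details behind one function is what lets domain experts start from the equipment identifier they already know, rather than from the layout of the underlying data systems.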
Training
Dataiku is simple enough that users can get started on their own. The complexity lies in learning how to leverage Dataiku to access SLB data and time series data.
To train our users effectively, the focus had to be on the ways to leverage Dataiku to access SLB data, based on use cases relevant to the technical experts.
To that end, custom training manuals were developed, covering the following use cases:
- Predicting the chances of success of a drilling run, and identifying which controllable parameters would improve the chances of success
- Accessing the historical time series to identify the operating environments of the equipment
- Accessing drilling time series data to identify similarities and differences between drilling operations
- Accessing historical time series & tool failure information to build failure predictive models.
Cumulative views of the manuals exceed 5,000. The yield of instructor-led and virtual classes, defined as the fraction of onboarded users still using Dataiku 6 months after the training, averages 42%: virtual classes had a 50% yield and instructor-led classes 30%.
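The yield definition, and the relationship between the three percentages, can be worked through in a few lines. This sketch assumes that the 42% average is weighted by the number of onboarded users in each class type, which the source does not state. Under that assumption, the figures imply that roughly 60% of trained users came through virtual classes. The function names and the head counts in the usage example are hypothetical.

```python
def training_yield(onboarded, still_active):
    """Fraction of onboarded users still using the platform 6 months after training."""
    if onboarded <= 0:
        raise ValueError("onboarded must be positive")
    return still_active / onboarded


def implied_virtual_share(overall, virtual, instructor_led):
    """Share of trainees from virtual classes implied by a user-weighted average.

    Solves overall = w * virtual + (1 - w) * instructor_led for w.
    """
    if virtual == instructor_led:
        raise ValueError("yields must differ to infer the mix")
    return (overall - instructor_led) / (virtual - instructor_led)


# Hypothetical head counts, chosen only to illustrate the definition.
example_yield = training_yield(onboarded=50, still_active=21)

# Figures quoted in the text: 42% overall, 50% virtual, 30% instructor-led.
share = implied_virtual_share(overall=0.42, virtual=0.50, instructor_led=0.30)

print(f"example yield: {example_yield:.0%}")
print(f"implied share of trainees from virtual classes: {share:.0%}")
```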