Valerian Guillot (Nerve Center Data Science Architect), with:
Schlumberger is a technology company that partners with customers to access energy. Our people, representing over 160 nationalities, are providing leading digital solutions and deploying innovative technologies to enable performance and sustainability for the global energy industry.
Democratizing AI within Schlumberger
Schlumberger is investing significantly in research and development to improve our products and services for customers, and has been embarking on a digital transformation internally while supporting our customers through their own transformations.
The main challenges Schlumberger has been facing are:
Schlumberger needed a single data science platform that provides access to Schlumberger domain data through no-code and low-code interfaces, makes prior work easily discoverable, and sits close to the systems where the insights and models will be deployed.
Leveraging Dataiku, we have put in place a mechanism where Schlumberger data scientists and technical experts can:
To support the internal adoption of Dataiku within Schlumberger, we’ve developed and delivered a number of custom data science classes, focusing on use cases relevant to Schlumberger’s population of technical experts. As usage has scaled out, we’ve leveraged Microsoft’s Yammer to build a technical community whose members help each other within Schlumberger.
Dataiku, together with its close integration with the DELFI E&P cognitive environment, has been a key driver in democratizing the use of data science within Schlumberger.
We are measuring the effectiveness of democratization through:
The graph above shows the growth in the number of users per week making contributions to data science projects, growing 8-fold since early 2019.
The growth has been worldwide:
Data democratization has successfully onboarded our existing population of data scientists as well as technical experts, including maintenance technicians, service quality engineers, well engineers, and other profiles, who are now able to speak a common language and make data-driven decisions such as:
The distribution of contributions to data science projects in 2020 shows that ~40% were made by users who are not data scientists. Early 2021 data shows further growth in contributions from non-data scientists:
Dataiku has also enabled collaborative work between data scientists and domain experts: 35% of the data science projects in Dataiku are collaborative projects (defined as projects whose commits are spread amongst multiple users):
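One way such a collaboration metric could be computed from a commit log is sketched below. This is an illustration only: the input format (one `(project_id, user_id)` pair per commit), the function name, and the `max_share` cutoff are assumptions, since the exact rule used for "spread amongst multiple users" is not stated here.

```python
from collections import Counter, defaultdict


def collaborative_fraction(commits, max_share=0.9):
    """Return the fraction of projects whose commits are spread across users.

    commits: iterable of (project_id, user_id) pairs, one per commit.
    max_share: hypothetical cutoff; a project counts as collaborative when
    it has more than one committer and no single user authored more than
    this share of its commits.
    """
    per_project = defaultdict(Counter)
    for project_id, user_id in commits:
        per_project[project_id][user_id] += 1

    if not per_project:
        return 0.0

    collaborative = 0
    for counts in per_project.values():
        total = sum(counts.values())
        top_share = max(counts.values()) / total
        if len(counts) > 1 and top_share <= max_share:
            collaborative += 1
    return collaborative / len(per_project)
```

With a stricter or looser `max_share`, the same log yields a different percentage, so the cutoff should be reported alongside the metric.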
The growth in usage and the diversity of users' job codes have proven the transformative value of Dataiku as a collaborative data science platform for Schlumberger. The growth has been supported along three axes:
As the user base grew, we put in place a bulletin board where Dataiku practitioners can ask technical questions on Dataiku or data science, in order to learn collectively from each other:
Snapshot of the Dataiku bulletin board on Yammer
Community engagement, measured here as the number of messages read on a technical Yammer chat over the last 365 days, has tripled over the last year:
Accessing time series data from Schlumberger’s operations had historically been a challenge, especially for predictive maintenance purposes, which require tracing back the entire history of each piece of equipment.
Leveraging Dataiku plugins and global shared code, we implemented domain-specific helpers that retrieve the entire historical exposure of each piece of downhole drilling equipment with a single line of code.
The helpers are used up to 12,000 times per day, with approximately 6 TB of data being analyzed per day.
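To give a sense of what a "single line of code" helper can look like from the user's side, here is a minimal sketch. Everything in it is hypothetical: the `Run` record, its fields, the `equipment_history` name, and the in-memory `_RUNS` list stand in for the real helpers and data sources, which are not described in this article.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Run:
    """Hypothetical record of one downhole run for a piece of equipment."""
    equipment_id: str
    start: datetime
    end: datetime
    max_temperature_c: float


# In-memory stand-in for the operational data store (illustrative values only).
_RUNS = [
    Run("TOOL-001", datetime(2020, 3, 1), datetime(2020, 3, 4), 135.0),
    Run("TOOL-002", datetime(2020, 1, 10), datetime(2020, 1, 12), 110.0),
    Run("TOOL-001", datetime(2019, 11, 5), datetime(2019, 11, 9), 150.0),
]


def equipment_history(equipment_id, runs=_RUNS):
    """Return every recorded run for one piece of equipment, oldest first."""
    matching = (r for r in runs if r.equipment_id == equipment_id)
    return sorted(matching, key=lambda r: r.start)


# The single line a domain expert would write:
history = equipment_history("TOOL-001")
```

The design idea is that the helper hides where and how the data is stored, so a domain expert only needs to know the equipment identifier.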
Dataiku is simple enough that a user can get started on their own. The complexity lies in learning how to leverage Dataiku to access Schlumberger data, and time series data in particular.
To train our users effectively, the focus had to be on how to leverage Dataiku to access Schlumberger data, based on use cases relevant to the technical experts.
To that effect, custom training manuals were developed:
Cumulative views of the manuals exceed 5,000. The yield of instructor-led and virtual classes (the fraction of onboarded users still using Dataiku 6 months after the training) averages 42%: virtual classes had a 50% yield and instructor-led classes 30%.
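The yield figures can be related with a little arithmetic. A simple mean of 50% and 30% would be 40%, so the 42% average is presumably weighted by the number of trainees in each format. The sketch below works under that assumption, which is not confirmed by the figures above; the function names are illustrative.

```python
def retention_yield(onboarded, still_active):
    """Fraction of onboarded users still active after the follow-up period."""
    if onboarded <= 0:
        raise ValueError("onboarded must be positive")
    return still_active / onboarded


def virtual_share(overall, virtual, instructor):
    """Share of trainees in virtual classes implied by the three yields.

    Assumes overall = w * virtual + (1 - w) * instructor, i.e. the overall
    yield is an attendee-weighted average, and solves for w.
    """
    if virtual == instructor:
        raise ValueError("yields must differ to solve for the share")
    return (overall - instructor) / (virtual - instructor)


share = virtual_share(0.42, 0.50, 0.30)
```

Under the weighted-average assumption, the reported numbers would imply that about 60% of trainees attended virtual classes.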