FPT Software - End-to-End Data Science for Answering Client Requests and Democratizing Insights
Do Van Nhan, Data Scientist, with:
Nguyen Khanh Bao
Organization: FPT Software
A subsidiary of FPT Corporation, the leading ICT Group in Asia, FPT Software is a global technology and IT services provider headquartered in Vietnam. With decades of experience in the global market, FPT Software has empowered digital transformation for businesses worldwide across many industries: Healthcare, BFSI, Manufacturing and Automotive, Communications, Media and Services, Aerospace and Aviation, Logistics and Transportation, Utilities and Energy, Consumer Packaged Goods, and Public Sector.
Best Partner Acceleration Use Case
Best Acceleration Use Case
Best MLOps Use Case
Last year, 2023, was a turbulent year, marked by events such as the war in Ukraine, the energy crisis, and food-security concerns. This period held back economic development worldwide and affected practically all enterprises. FPT Software was no exception.
Luckily, we signed contracts with several significant organizations during this challenging period. Many of them are airlines, not only in Europe but also in Japan and China. The problems they want solved are becoming more diverse, ranging from forecasting the number of passengers passing through security gates to assessing the distribution of passenger groups based on behavior, optimizing fuel consumption, scheduling staff, and so on.
Our clients want to know how the predictive models function, how the algorithms work, how to evaluate them, and how to present a dashboard and send alerts to an external chat channel. That is why we selected Dataiku as the solution to these issues. "Dataiku is not a platform, it is a solution."
As mentioned above, we faced a challenge with several parts:
1. Maintaining databases on both Redshift and Snowflake
Dataiku supports the Redshift functionality we needed, including reading and writing datasets, executing SQL recipes, and so forth.
Dataiku likewise supports the Snowflake functionality we needed, including reading and writing datasets, executing SQL recipes, performing visual recipes in-database, and using the live engine for visualizations. So, what could possibly be better?
2. Solving problems with different algorithms (MLOps)
The problems spanned time-series forecasting (predicting the number of passengers passing through security gates every 15 minutes), customer segmentation (grouping customers by behavior), and optimization (minimizing fuel consumption and optimizing working time without sacrificing flight quality or customer experience).
Fortunately, Dataiku's MLOps features were a perfect fit here, alongside the visual recipes and plugins.
For these optimization problems, our team and end users gained in-depth experience with Dataiku, from code recipes (some clients required us to deliver deterministic source scripts) to using MLOps to manage multiple models.
We also packaged our source code as custom plugins, and clients are excited about the platform's ability to handle anything from simple to sophisticated. The MLOps features made deploying, monitoring, and managing ML models and projects in production easier than ever before.
It also provides a common grammar for everyone from advanced data specialists like data scientists to low- or even no-code domain experts, bringing together technical and business teams to cooperate on projects.
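To make the 15-minute gate forecast mentioned above more concrete, here is a minimal sketch in plain Python. It is not our production model: the scan timestamps are made up, the function names are hypothetical, and the seasonal-naive rule (reuse the count from the same 15-minute slot one week earlier) is only a simple baseline that a trained model would be compared against.

```python
from collections import Counter
from datetime import datetime, timedelta


def floor_to_bucket(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute bucket."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def count_per_bucket(scan_times):
    """Count gate scans per 15-minute bucket."""
    return Counter(floor_to_bucket(ts) for ts in scan_times)


def seasonal_naive_forecast(counts, target: datetime, season=timedelta(days=7)) -> int:
    """Forecast a bucket as the count observed in the same bucket one season earlier.

    Returns 0 when no history exists for that bucket.
    """
    return counts.get(floor_to_bucket(target) - season, 0)


# Hypothetical scan timestamps for one Monday morning (illustrative data only).
scans = [
    datetime(2023, 5, 1, 8, 2),
    datetime(2023, 5, 1, 8, 14),
    datetime(2023, 5, 1, 8, 16),
]
counts = count_per_bucket(scans)

# Forecast the 08:00-08:15 bucket of the following Monday.
forecast = seasonal_naive_forecast(counts, datetime(2023, 5, 8, 8, 5))
```

A baseline like this is useful mainly as a reference point: if a trained forecasting model cannot beat it on held-out weeks, the model is not yet adding value.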
3. Creating dashboards to analyze predicted passenger counts at the gates, fuel consumption, and the distribution of customer groups
Dataiku's dashboards are well suited to this: we can flexibly combine Jupyter notebooks, scenarios, and webapps in a single dashboard to visualize the entire workflow for each individual problem.
4. Sending alert messages when the prediction model deviates significantly
We used webhooks and the material from "Automating the Model Lifecycle" to implement this. All of these Dataiku features were really helpful to us in solving these problems.
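The alerting idea can be sketched outside the platform as follows. This is an illustration under stated assumptions, not our production scenario: MAPE is only one possible deviation metric, the 10% threshold and the model name are hypothetical, and the `{"text": ...}` payload shape is an assumption that happens to match many chat incoming-webhook formats. The `send_alert` function is defined but not called, since the webhook URL would be specific to each client.

```python
import json
import urllib.request


def mape(actual, predicted):
    """Mean absolute percentage error, skipping points where the actual value is zero."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to compare against")
    return sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def build_alert(model_name, error, threshold):
    """Return a JSON-serializable alert payload, or None if the error is within tolerance."""
    if error <= threshold:
        return None
    return {
        "text": f"Model '{model_name}' deviated: MAPE {error:.1%} exceeds {threshold:.1%}"
    }


def send_alert(webhook_url, payload):
    """POST the payload as JSON to a chat webhook (the URL is supplied by the caller)."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Illustrative numbers only: observed vs. predicted passenger counts.
actual = [100, 120, 80]
predicted = [90, 150, 80]
error = mape(actual, predicted)
alert = build_alert("gate_passenger_forecast", error, threshold=0.10)
```

Separating the decision (`build_alert`) from the delivery (`send_alert`) keeps the threshold logic testable without network access.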
Business Area Enhanced: Accounting/Finance
Use Case Stage: In Production
Communicating and connecting with people, especially in marketing, is all about understanding their needs, behaviors, and expectations. We were asked to answer questions such as:
"How to optimize the fuel consumption for each aircraft?"
"How to predict the number of passengers for each gate per day, and for every 15 minutes?"
"How and why customers buy?"
"Is current airline network planning modeling supported by sufficient and achievable customer segmentation considerations?"
"How to increase and simplify segmentation's positive impact on airline network planning?"
"How to know, and evaluate, whether the model has failed?"
To answer these, companies should segment their users by shared characteristics in order to establish, nurture, and maintain strong relationships. We use Dataiku to analyze these questions and find the best solution.
Value Brought by Dataiku:
Solving these problems has helped our aviation partners optimize the number of employees and working hours at security gates, flight diagrams, and fuel consumption without affecting service quality or customer experience. A user without much data science experience can still use Dataiku to experiment with models without worrying about gaps in their knowledge.
Dataiku's ML Diagnostics are designed to identify and help troubleshoot potential problems and to suggest possible improvements at different stages of training and building machine learning models. As a result, checks such as dataset sanity checks, leakage detection, overfitting detection, and training reproducibility are no longer difficult for a low-code or even no-code user. With these features, users receive prompts from the platform and can then review the model (in consultation with other users) before issuing the final reports.
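For readers unfamiliar with what such diagnostics look for, the sketch below shows the general idea behind two of them. It is emphatically not Dataiku's implementation: the function name, the 0.10 train/test gap, and the 0.999 "too good to be true" score are all hypothetical values chosen for illustration.

```python
def diagnose(train_score, test_score, max_gap=0.10, leak_score=0.999):
    """Return human-readable warnings based on train and test scores.

    Scores are assumed to be "higher is better" metrics in [0, 1], such as
    accuracy or AUC. Both thresholds are illustrative assumptions.
    """
    warnings = []
    gap = train_score - test_score
    if gap > max_gap:
        # A model that does much better on data it has seen than on held-out
        # data has probably memorized the training set.
        warnings.append(f"possible overfitting: train/test gap is {gap:.2f}")
    if test_score >= leak_score:
        # Near-perfect held-out performance often means a feature encodes the target.
        warnings.append("suspiciously high test score: check for target leakage")
    return warnings


overfit_report = diagnose(train_score=0.98, test_score=0.80)
healthy_report = diagnose(train_score=0.85, test_score=0.83)
```

The value of having such checks built into the platform is that a reviewer gets the prompt automatically and does not need to know in advance which comparison to run.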
After we fully handed the process over, the number of staff our customers needed to run it was greatly reduced compared with before they adopted Dataiku. The workflow for computing these models runs automatically, which reduces inadvertent human calculation errors over the long term. The accuracy of the prediction models improved greatly without introducing problems such as overfitting or data leakage, while also reducing the number of operators and optimizing time and customer experience. That's what we had been hoping for.