MEWA - Forecasting Animal Diseases to Inform Prevention and Public Action
Amr Mansour, Lead AI Consultant
Mohammad Osman, Data Scientist
Mohammad Abusaad, Data Scientist
Country: Saudi Arabia
The Ministry of Environment, Water and Agriculture (MEWA) is a ministry in Saudi Arabia responsible for achieving sustainability of the environment and natural resources in the Kingdom. As part of its efforts to become a proactive organisation that anticipates threats of animal disease epidemics, a predictive use case was built to forecast animal diseases and support preparedness for emergencies.
Best Partner Acceleration Use Case
Best Moonshot Use Case
Inside the borders of the Kingdom of Saudi Arabia lie millions of livestock that the ministry keeps a close eye on, as they provide a valuable food resource for millions of the country’s residents. The well-being of such an important resource cannot be left to chance, as animal disease outbreaks affect the economy, food security and public health. Additionally, some animal diseases can be transmitted to humans, which can have severe consequences for public health if not contained.
Forecasting animal diseases can help prevent or mitigate the impact of the diseases. By predicting the likelihood of disease outbreaks, authorities can take preventive measures such as vaccination programs, biosecurity measures, and movement restrictions, which can help reduce the spread of diseases among animals.
MEWA teams attempted to leverage GIS (Geographic Information System) technologies with the aim of forecasting disease outbreaks. However, their efforts were hindered by the absence of a systematic method for comprehending the trending patterns and seasonality of each livestock disease.
Specifically, the teams faced significant challenges when attempting to conduct an individual analysis for each disease, broken down by region and type of livestock. The number of combinations made this task difficult to carry out manually.
To tackle these challenges, the use case was built on the Dataiku platform.
Our use case centres around conducting a time-series analysis of animal diseases, factoring in various dimensions such as region, animal type, and specific diseases, along with their various combinations. Furthermore, it incorporates a dynamic selection process, enabling the user to choose the dimensions on which the predictions will be based.
The flow for the use case is divided into 5 parallel sub-flows that differ in the dimensions included (Animals, Regions, Diseases, Offices), giving business users more versatile prediction outputs:
Flow 1: Disease forecast and analysis per Region per Animal
Flow 2: Disease forecast and analysis per Region
Flow 3: Disease forecast and analysis per Animal
Flow 4: Disease forecast and analysis on the entire Kingdom
Flow 5: Dynamic forecast in which the user selects the animal, the city, and one or more diseases
Data flows in Dataiku
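The scheduled flows differ mainly in which dimensions are used to aggregate case records into time series, and the dynamic flow adds a user-driven filter. The following is a minimal sketch of that idea in plain Python, not the project's actual Dataiku recipes. The record layout, the sample values, the `build_series` helper and the choice to always keep the disease as a grouping key are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical case records: (month, region, animal, disease, cases).
RECORDS = [
    ("2023-01", "Riyadh", "Sheep", "FMD", 12),
    ("2023-01", "Riyadh", "Camel", "FMD", 3),
    ("2023-01", "Makkah", "Sheep", "PPR", 5),
    ("2023-02", "Riyadh", "Sheep", "FMD", 9),
    ("2023-02", "Makkah", "Sheep", "FMD", 4),
]
FIELDS = ("month", "region", "animal", "disease", "cases")

# Assumed grouping dimensions for the four scheduled flows.
FLOW_DIMENSIONS = {
    1: ("region", "animal", "disease"),
    2: ("region", "disease"),
    3: ("animal", "disease"),
    4: ("disease",),  # entire Kingdom
}


def build_series(records, dimensions, filters=None):
    """Sum monthly cases for each combination of the chosen dimensions."""
    filters = filters or {}
    series = defaultdict(lambda: defaultdict(int))
    for record in records:
        row = dict(zip(FIELDS, record))
        # Dynamic selection: skip rows outside the user's choices.
        if any(row[name] not in allowed for name, allowed in filters.items()):
            continue
        key = tuple(row[name] for name in dimensions)
        series[key][row["month"]] += row["cases"]
    return {key: dict(sorted(months.items())) for key, months in series.items()}


per_region = build_series(RECORDS, FLOW_DIMENSIONS[2])
kingdom = build_series(RECORDS, FLOW_DIMENSIONS[4])
# Flow 5 style request: one animal, one location, one disease.
dynamic = build_series(
    RECORDS,
    ("disease",),
    filters={"animal": {"Sheep"}, "region": {"Riyadh"}, "disease": {"FMD"}},
)
```

Each key in the returned dictionary corresponds to one time series to be forecast, so coarser dimension sets yield fewer, more aggregated series.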
The model undergoes a weekly training regimen to effectively capture new patterns and trends, recalibrate predictions, and identify anomalous behaviour as well as significant shifts in trends. This regular update allows for the most current data to be incorporated, ensuring the precision and relevance of the generated forecasts.
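The write-up does not say how anomalous behaviour is identified. One common approach, sketched here purely as an assumption, is to compare newly observed counts with the uncertainty interval produced by the previous week's model and flag any period that falls outside it. The `flag_anomalies` helper and the sample numbers are hypothetical.

```python
def flag_anomalies(observed, forecast):
    """Return periods whose observed count lies outside the forecast interval.

    observed: {period: count}
    forecast: {period: (lower, upper)} from the previously trained model
    """
    flagged = []
    for period, count in sorted(observed.items()):
        if period not in forecast:
            continue  # no earlier forecast to compare against
        lower, upper = forecast[period]
        if count < lower or count > upper:
            flagged.append(period)
    return flagged


# Hypothetical interval forecasts from last week's model and new observations.
last_week_forecast = {"2023-03": (5, 20), "2023-04": (4, 18)}
new_observations = {"2023-03": 14, "2023-04": 31}
anomalies = flag_anomalies(new_observations, last_week_forecast)
```

Flagged periods can then be reviewed by disease experts before the retrained model absorbs them as part of the new trend.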
Predictions are visualised in Dataiku using the static insights capability, allowing disease experts to navigate through the different diseases and quickly review the trend analysis, pattern changes, and forecast number of affected animals for the coming 6 months.
Flows 1-4 run simultaneously, started by a time-based trigger at the beginning of every week. Each flow then outputs its results to a dashboard that supports data-driven decision making, while Flow 5 is run on demand through a visual application.
The use case is live and runs on the automation node, which draws data from the ministry systems and writes the results back to the same database. A visual application was also created to enable self-service analytics.
Business Area Enhanced: Other - Epidemiology
Use Case Stage: Built & Functional
The main objective of this use case is to enable decision makers inside the Ministry of Environment, Water and Agriculture to make data-driven decisions going forward and to become proactive in combating the spread of animal diseases in the Kingdom. This objective aligns with the Saudi Vision 2030 initiative “Animal Disease Investigation and Control Program” under strategic objective 5.4.1 “Ensure Development and Food Security”.
Because the ministry keeps track of tens of millions of animals, even a small early indication of when and where a disease will spread can have a large, hard-to-quantify impact on both public health and the economy. The use case predicts and displays the number of animals affected by a given disease in a given region of the Kingdom up to 6 months in advance, which allows users to track trends and identify changes in rate as well as outbreaks. This in turn helps the ministry track the effectiveness of its disease prevention and health awareness campaigns.
In addition, the visual application offers a more tailored experience: the business user can forecast the spread of a certain disease for a specific MEWA office instead of the whole region, making it possible to pinpoint the disease’s hotspot.
Disease hotspots mapping
Value Brought by Dataiku:
With the help of Dataiku, our team was able to organise a workflow that enhanced collaboration between team members and enabled us to work on different parts of the project in parallel.
During the project phases, Dataiku proved to be invaluable in every step along the way.
Data processing - Easy-to-use visual recipes, such as the prepare recipe, considerably cut down the time needed for data preparation, which let us focus more of our efforts on the other phases.
Flexibility - Dataiku’s support for custom Python scripts gave us the chance to use Facebook’s Prophet library for generating time-series models.
Partitioning - The ability to easily partition the large dataset (around 700k observations and 300 partitions) helped us build multiple time series, one per partition, that were run in parallel.
Automation - A scenario was created to run the flows on a weekly basis; this was later automated through the automation node.
Self-service analytics - We utilised the visual application capabilities of Dataiku to create a self-service analytics app tailored to our use case, which would otherwise have required strong front-end and back-end programming skills.
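The partitioning idea, one independent model per partition with all partitions run in parallel, can be sketched outside Dataiku as follows. This is not the project's code: a seasonal-naive forecaster stands in for Prophet so that the example needs only the standard library, and the partition keys, the 12-month season and the `forecast_partitions` helper are assumptions. Only the 6-month horizon comes from the use case.

```python
from concurrent.futures import ThreadPoolExecutor

HORIZON = 6   # months ahead, as in the use case
SEASON = 12   # assumed yearly seasonality on monthly data


def seasonal_naive(history, horizon=HORIZON, season=SEASON):
    """Stand-in model: repeat the value observed one season earlier."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    values = list(history)
    for _ in range(horizon):
        values.append(values[-season])
    return values[len(history):]


def forecast_partitions(partitions, max_workers=4):
    """Fit and forecast every partition independently, in parallel."""
    keys = list(partitions)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(seasonal_naive, (partitions[k] for k in keys))
        return dict(zip(keys, results))


# Hypothetical partitions keyed by (region, animal, disease).
partitions = {
    ("Riyadh", "Sheep", "FMD"): list(range(1, 13)),
    ("Makkah", "Camel", "PPR"): [10] * 12,
}
forecasts = forecast_partitions(partitions)
```

Threads keep the sketch simple. With a heavier model such as Prophet, the fitting step is CPU-bound, so separate processes or the platform's own partitioned execution would be the more likely choice.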
Overall, Dataiku played its role perfectly by greasing the wheels and always keeping everything needed available and ready.