Name: Emiel Veersma, Data Scientist
Organization: One Acre Fund
As an NGO committed to making a lasting impact, One Acre Fund empowers smallholder farmers across East Africa through asset-based financing and comprehensive agriculture training services, effectively combatting hunger and poverty. With its headquarters situated in Kigali, Rwanda, the organization collaborates closely with farmers residing in rural villages spanning Kenya, Rwanda, Burundi, Tanzania, Uganda, Malawi, Nigeria, Zambia, and Ethiopia. The year 2021 saw One Acre Fund extend its reach to 3.2 million farmers, both directly through a suite of farm services and indirectly by enhancing accessibility to these services across entire regions, often facilitated through partner organizations. The results of this collaboration yield great returns, with One Acre Fund farmers collectively generating hundreds of millions of dollars in new farm profits annually.
One Acre Fund offers smallholder farmers an asset-based loan encompassing a range of essential components:
Each bundled service package holds an approximate value of US$80, inclusive of crop insurance measures that mitigate the risks posed by drought and disease. To access the benefits of the One Acre Fund loan and training, farmers are required to join a village group, facilitated by a dedicated local One Acre Fund field officer. These field officers consistently engage with farmer groups, orchestrating the seamless delivery of farm inputs, conducting training sessions, and overseeing repayment processes. The farmers who directly benefit from these services experience a year-on-year surge of over 40% in farm income, all harnessed from the same plot of land, often leading to the cultivation of surplus produce.
In its commitment to flexibility, One Acre Fund provides farmers with a versatile repayment system. This system empowers farmers to repay their loans in any increment at their convenience during the growing season. Beyond its core program model, One Acre Fund extends opportunities to smallholder farmers to access additional products and services through credit arrangements. These offerings encompass diverse essentials, including solar lights and reusable sanitary pads. Moreover, One Acre Fund offers a comprehensive tree package, bolstering seasonal harvests and imparting stability to farmers while empowering them to act as custodians of the environment. In 2022 alone, the program led to the planting of over 60 million trees, with each dollar invested by donors yielding $12 of asset value for participating farm families.
One Acre Fund's mission spans across diverse countries, each with its distinct program structures. While smaller nations collaborate primarily through partnerships and tree programs, others follow the traditional model. In Kenya and Rwanda, farmers enjoy the convenience of requesting inputs via mobile phones or through visits to One Acre Fund stores. To fuel continuous innovation, country teams frequently develop and trial new programs, resulting in a decentralized approach to data management within One Acre Fund. However, this decentralization poses challenges, particularly for smaller countries striving to uphold stringent data management standards.
As data volumes grew, most teams still relied on manual processing via Google Sheets. For instance, the Ethiopia team registered data for 230,000 farmers and 20 million seedlings of different species. This data had to be manually compared against various records using Google Sheets. Such a process was not only time-consuming but also prone to human errors, making it challenging to gain meaningful insights from the heap of data. Sharing data between teams was also a challenge, preventing valuable information from aiding essential research on smallholder farmers.
In 2022, One Acre Fund’s global data team focused on enhancing the data management infrastructure by establishing a Data Warehouse in Snowflake and integrating it with Dataiku. Throughout the current year, teams across the organisation were trained to use Dataiku for managing country-specific data. Following the training, both the global tech team and the Ikigai team supported the country teams as they devised solutions tailored to their specific requirements.
Kenya is the most innovative country programme, which also shows in the way the team uses Dataiku. Noteworthy among its Dataiku applications is a real-time credit scoring system. The credit scoring flow utilises OCR technology, image processing algorithms, and machine learning models to extract relevant data from scanned or captured images of ID documents. This data is verified against a national database by calling its API endpoint. Finally, additional farmer and purchase information is supplied, and the Credit Scoring Machine Learning model calculates a credit score. Based on the resulting credit score, an array of options is presented to the farmer. The credit score is continuously monitored in Dataiku for data drift, performance, and potential biases, to make sure it best serves the organization’s and, more importantly, the farmer's needs.
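The drift monitoring mentioned above can be illustrated with a small sketch. The source does not say which drift metric is used; the population stability index (PSI) below is an assumption chosen because it is a common choice for credit scores, and the function name and parameters are hypothetical.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample and a current sample of one numeric
    feature (or of the credit score itself). Hypothetical helper, not the
    metric Dataiku or One Acre Fund is confirmed to use."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / bins

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

A frequently quoted rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a shift worth investigating, though suitable thresholds depend on the model and the data.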
Ethiopia, Zambia, and Malawi have transitioned from manual Google Sheets processes to a more sophisticated workflow utilizing KoboToolbox, CommCare, Snowflake, and Dataiku. Under this workflow, a Field Officer gathers comprehensive farmer information by completing one or more forms in either KoboToolbox or CommCare, depending on the country's preference. This dataset encompasses GPS coordinates, field boundary polygons, farmer demographics, purchase details, and potentially other pertinent data.
This information is integrated into Dataiku through the API plugin, where it is then enriched with payment data from an MSSQL database through join recipes. Once this central dataset is established, a series of validations are undertaken, including but not limited to:
The culminating datasets are then made available within Snowflake and are frequently exported to Google Sheets using sync recipes. These workflow schedules are tailored by the country teams according to their preferences, and updates on these processes are communicated through Slack or email channels. As a result, data refreshing occurs multiple times throughout the day to ensure accuracy and relevancy.
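The enrichment and validation steps described above can be sketched in plain Python. This is only an illustration under stated assumptions: in Dataiku these steps are built as join recipes and data checks, the specific validations are not listed in the source, and every field name and check below is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class FarmerRecord:
    # Hypothetical subset of the fields collected through the forms.
    farmer_id: str
    latitude: Optional[float]
    longitude: Optional[float]
    seedlings_ordered: int
    amount_paid: float = 0.0


def enrich_with_payments(
    records: List[FarmerRecord],
    payments: Iterable[Tuple[str, float]],
) -> List[FarmerRecord]:
    """Left-join form records to payment totals keyed by farmer_id."""
    totals: Dict[str, float] = {}
    for farmer_id, amount in payments:
        totals[farmer_id] = totals.get(farmer_id, 0.0) + amount
    for record in records:
        # Farmers without any payment keep a total of 0.0.
        record.amount_paid = totals.get(record.farmer_id, 0.0)
    return records


def validate(records: List[FarmerRecord]) -> List[Tuple[str, str]]:
    """Return (farmer_id, problem) pairs for records that fail a check."""
    problems: List[Tuple[str, str]] = []
    seen = set()
    for r in records:
        if r.farmer_id in seen:
            problems.append((r.farmer_id, "duplicate farmer_id"))
        seen.add(r.farmer_id)
        if r.latitude is None or r.longitude is None:
            problems.append((r.farmer_id, "missing GPS coordinates"))
        elif not (-90 <= r.latitude <= 90 and -180 <= r.longitude <= 180):
            problems.append((r.farmer_id, "GPS coordinates out of range"))
        if r.seedlings_ordered < 0:
            problems.append((r.farmer_id, "negative seedling count"))
    return problems
```

Returning the problems as a list, rather than raising on the first failure, mirrors the idea of sending field staff a complete set of issues to correct after each refresh.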
The Agriculture Research Team (ART) supports the organisation with insights and recommendations related to agriculture. These recommendations are based on both data collected in the field and publicly available geospatial information. This data is then analyzed in the lab. Optimized recommendations generated by Machine Learning models are then exposed as API endpoints. The country teams can then integrate these API endpoints in a way that suits them best. This approach makes ART agile and allows it to create generic recommendations for the different country teams.
Moreover, geospatial insights are made accessible to all Dataiku users through a tailor-made Streamlit application. This dynamic application draws upon internal data sourced from the Data Warehouse, which is then augmented with freely available satellite imagery retrieved from Google Earth Engine. Within this framework, country teams gain the capability to access a dashboard-driven viewer, leveraging its capabilities to craft maps that align precisely with their specific needs. Since the country teams are using Dataiku, their data is already stored in the Data Warehouse. This inherently facilitates the incorporation of localized geospatial information, streamlining the process and enhancing the utility of the insights generated.
In addition to the aforementioned teams, several other groups within the organization also leverage Dataiku's capabilities. For instance, the Rwanda team employs Dataiku to provide crucial support for their recently established stores. Meanwhile, the Supply Chain Management team harnesses Dataiku's power to make accurate forecasts of fertilizer prices, utilizing Time Series models to aid in this process.
In total, 22 teams with over 100 active users work with Dataiku at One Acre Fund. These teams can independently design, develop, and manage intricate operations. This autonomy allows them to operate with agility and efficiency while upholding stringent standards of data quality, privacy, and performance.
Dataiku has brought changes to data-related processes, encompassing data collection, data processing, and data analysis. Across all these domains, teams have witnessed a reduction in the time spent on repetitive tasks, enabling them to handle larger datasets and undertake more complex operations than ever before.
Previously reliant on laborious methods like Google Sheets or manual handwriting, the data collection has transitioned to a streamlined form-based approach that integrates seamlessly with Dataiku. This transition has yielded multiple benefits: data integrity has been secured through proper backups, users receive automated feedback via data checks, and processing speeds have improved significantly, particularly for large datasets that were previously cumbersome for Google Sheets to handle.
Most users who process data at One Acre Fund don’t have a background in data, so they naturally struggle with the vast amounts of data. The global tech team offers these users a half-day training in Dataiku. After this training, the team not only knows how to manage its data in Dataiku, but already has its data stored in the Data Warehouse (Snowflake) and automatically processed in Dataiku. As a result, this information becomes readily accessible organization-wide with appropriate user permissions in place.
Leveraging the analytical, charting, web application, and machine learning functionalities within Dataiku, teams are now better equipped to analyze and monitor their data. Unlike in the past, when data analysis often occurred at the end of a season upon finalizing reports, Dataiku enables real-time identification of erroneous data entries, so teams can address such issues during the season. Additionally, the platform facilitates the visualization of geospatial aspects of the data, giving team members a better understanding of the challenges faced by farmers.
Business Area Enhanced: Analytics
Use Case Stage: Built & Functional
Across all the teams where Dataiku has been adopted, a consistent pattern emerges: enhanced data insights, more efficient data processing, and increased data quality. This has increased customer satisfaction and reduced the staffing levels required for various processes.
To illustrate, let's consider the case of Ethiopia. Prior to undergoing Dataiku training, their team of seven members dedicated several hours per day, per staff member, to manually manage data quality. This labor-intensive effort was supplemented by the involvement of 65 temporary volunteers. With the integration of Dataiku, this process changed completely: it is now entirely automated and overseen by a single individual. Key Performance Indicators (KPIs) are now updated in real time, and data stands ready for analysis. The changes led to more timely payments to Tree Nursery Operators (TNOs), made possible by higher data quality and lower resource requirements.
As a direct result, the Ethiopia team anticipates a surge in TNO retention for the upcoming year, driven by enhanced satisfaction stemming from punctual payments. This shift in approach is also expected to yield more revenue, attributable to improved TNO retention and subsequent satisfaction.
Similarly, in Zambia, the transformation brought about by Dataiku has been substantial. The workforce required to process data has been reduced by 90%. At the same time, the Zambia team can now give farmers precise planting recommendations, has automated satellite checks in place, and can detect instances of fraud by identifying field overlaps.
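Detecting overlapping fields can be sketched as a polygon intersection test. The source does not describe how the Zambia team implements this check, so the following is a minimal illustration that assumes each field boundary is a convex polygon in a planar (projected) coordinate system; the function names and the 10% threshold are hypothetical. It uses Sutherland-Hodgman clipping and the shoelace formula.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def polygon_area(poly: List[Point]) -> float:
    """Unsigned polygon area via the shoelace formula."""
    if len(poly) < 3:
        return 0.0
    twice = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]))
    return abs(twice) / 2.0


def _counter_clockwise(poly: List[Point]) -> List[Point]:
    twice = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]))
    return poly if twice >= 0 else poly[::-1]


def clip_convex(subject: List[Point], clip: List[Point]) -> List[Point]:
    """Intersection of two convex polygons (Sutherland-Hodgman clipping)."""
    output = list(subject)
    clip = _counter_clockwise(clip)
    for i in range(len(clip)):
        if not output:
            break
        a, b = clip[i], clip[(i + 1) % len(clip)]
        points, output = output, []

        def inside(p: Point) -> bool:
            # True when p lies left of, or on, the directed edge a -> b.
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

        def crossing(p: Point, q: Point) -> Point:
            # Point where segment p-q meets the infinite line through a and b.
            dx1, dy1 = q[0] - p[0], q[1] - p[1]
            dx2, dy2 = b[0] - a[0], b[1] - a[1]
            t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
            return (p[0] + t * dx1, p[1] + t * dy1)

        prev = points[-1]
        for cur in points:
            if inside(cur):
                if not inside(prev):
                    output.append(crossing(prev, cur))
                output.append(cur)
            elif inside(prev):
                output.append(crossing(prev, cur))
            prev = cur
    return output


def overlapping_fields(
    fields: Dict[str, List[Point]], min_ratio: float = 0.1
) -> List[Tuple[str, str, float]]:
    """Flag field pairs whose shared area is at least min_ratio of the smaller field."""
    flagged = []
    ids = sorted(fields)
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            shared = polygon_area(clip_convex(fields[first], fields[second]))
            smaller = min(polygon_area(fields[first]), polygon_area(fields[second]))
            if smaller > 0 and shared / smaller >= min_ratio:
                flagged.append((first, second, shared / smaller))
    return flagged
```

Real field boundaries are often non-convex and recorded as latitude and longitude, so a production check would more likely rely on a geospatial library and a suitable projection; the pairwise loop would also need spatial indexing at the scale of hundreds of thousands of fields.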
These examples show that Dataiku's integration has increased operational efficiency, made decision-making more effective, and consequently led to positive outcomes across teams and countries.
Without Dataiku it would not have been possible for country teams to manage their data and thus for One Acre Fund to maintain this dynamic and country-centric organizational structure. As the number of farmers per country escalated to around 1 million, the conventional operational boundaries were being pushed, posing considerable challenges. However, after a few hours of training, country teams underwent a seamless transition, positioning them for scalable growth while revitalizing their focus on fieldwork. According to a team member, Dataiku “made data cleaning, organizing, and analysis very quick and simple”. Dataiku allows staff members without coding or data expertise to
Where in the past the global data team was in charge of maintaining the core data, now every country team is the owner of its own data and able to share the data with colleagues within the Data Warehouse. All this is done with fewer people, while still adhering to the highest data quality standards.