Cascade Bicycle Club - Laying the Foundation for Volunteer Collaboration on Data Insights
Team members: Christopher Shainin, Technology Manager; Tom Brown, Volunteer Data Scientist; Akshay Kotha, Volunteer Data Scientist; Sindhujaa Narasimhan, Volunteer Data Scientist; Anas Patankar, Volunteer Data Scientist; Sankash Shankar, Volunteer Data Scientist; Megan Thomas, Volunteer Data Scientist
Country: United States
Organization: Cascade Bicycle Club
Description: Cascade Bicycle Club, the nation’s largest statewide bicycling nonprofit, serves bike riders of all ages, races, genders, income levels, and abilities throughout the state of Washington. We teach the joys of bicycling, advocate for safe places to ride, and produce world-class rides and events. Our signature programs include the Seattle to Portland, Free Group Rides, the Pedaling Relief Project, the Advocacy Leadership Institute, the Bike Walk Roll Summit, Let's Go, and the Major Taylor Project.
Data Science for Good
AI Democratization & Inclusivity
Cascade Bicycle Club, the nation’s largest statewide bicycle nonprofit, serves bike riders of all ages and abilities throughout the state of Washington. With a mission to improve lives through bicycling, they teach the joys of bicycling, advocate for safe places to ride, and produce world-class rides and events.
In the fall of 2020, Cascade Bicycle Club invited a team of pro bono data scientists to help them understand and re-engage riders during and after the COVID-19 pandemic. The intent was to use existing transactional data held in Salesforce to model rider segments, as well as past drivers of engagement and churn, in order to learn how Cascade could better engage with riders.
At the time of making this offer, Cascade Bicycle Club had no infrastructure appropriate for data science work. Cascade was also wary of allowing Personally Identifiable Information (PII) on infrastructure not under Cascade Bicycle Club’s direct control.
How could Cascade Bicycle Club quickly create an enterprise-class data science infrastructure that would allow a small team of volunteer data scientists from across the United States to work together?
The solution had to provide familiar data science tools such as Python, Jupyter notebooks, R, and SQL, as well as access to Salesforce data for analysis, while eliminating the need to move customer data to analysts’ computers.
As we started on this endeavor, we reached out to the Dataiku team about the Ikig.ai program. With a donated license, we were able to provide the platform to a small team of volunteer data scientists to collaborate on data analysis.
In less than a month, we built out an AWS instance, connected data from Salesforce via a standard plugin, and made it available on Dataiku for collaboration, whereas the whole setup would usually take a nonprofit several months or more to accomplish.
This was made possible by a team effort involving support from Dataiku, Cascade’s willingness to invest in some additional AWS infrastructure, and the willingness of team members to move to a new platform (and move their Jupyter notebooks!).
We were able to make a quick impact by launching several projects:
Rider segmentation, to better understand riders’ objectives and behaviors.
Rider retention and, conversely, ways to minimize churn.
CRM cleaning through deduplication, to lay the basis for further analysis.
To work on these, we were able to invite an additional five pro bono data scientists into the process. They were quickly onboarded on Dataiku because we could reuse existing Python code, notebooks, and Dataiku data flows.
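To give a sense of what the CRM cleaning project involves, here is a minimal sketch of deduplicating contact records in plain Python. It is an illustration only, not the team's actual workflow: the field names (`id`, `name`, `email`, `phone`), the choice of a normalized email as the matching key, and the merge rule are all assumptions made for this example.

```python
from collections import defaultdict


def normalize_email(email):
    """Lowercase and strip whitespace so trivially different spellings match."""
    return (email or "").strip().lower()


def deduplicate_contacts(contacts):
    """Group contact records by normalized email and merge each group.

    Records without an email are returned unchanged, because this sketch
    has no other key it can safely match on.
    """
    groups = defaultdict(list)
    unmatched = []
    for record in contacts:
        key = normalize_email(record.get("email"))
        if key:
            groups[key].append(record)
        else:
            unmatched.append(record)

    merged = []
    for key, records in groups.items():
        # Assumed merge rule: the record with the lowest id survives, and
        # its empty fields are filled in from the other duplicates.
        records.sort(key=lambda r: r["id"])
        survivor = dict(records[0])
        survivor["email"] = key
        for dup in records[1:]:
            for field, value in dup.items():
                if not survivor.get(field) and value:
                    survivor[field] = value
        # Keep an audit trail of which source records were combined.
        survivor["merged_ids"] = [r["id"] for r in records]
        merged.append(survivor)
    return merged + unmatched


# Hypothetical sample data, invented for this example.
contacts = [
    {"id": 1, "name": "Ana Rider", "email": "Ana@Example.org ", "phone": ""},
    {"id": 2, "name": "Ana Rider", "email": "ana@example.org", "phone": "555-0100"},
    {"id": 3, "name": "Bo Wheeler", "email": "", "phone": "555-0101"},
]
result = deduplicate_contacts(contacts)
```

Matching on an exact normalized key is deliberately conservative. Real CRM data usually also calls for fuzzy matching on names or addresses, plus a review step before records are merged.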
Cascade wouldn’t have been able to securely leverage data science tools and techniques without a central platform. Dataiku has provided a home for data science operations for the organization, around three main pillars:
1. Enable collaboration between team members & volunteers
Dataiku DSS provides a controlled environment in which volunteers from around the United States can collaborate on a common set of data using standard data science tools. Furthermore, thanks to its versatility, the platform allows each contributor to use the technologies and techniques they’re most familiar with, which has been pivotal in allowing volunteers to help as a side activity. This project provides a basic roadmap showing that nonprofit organizations can find creative ways to build infrastructure and leverage data science skills in order to participate in today’s data science revolution.
2. Facilitate reusability of past projects & workflows
The visual interface allows everyone to view the workflows of other participants and assess where they can contribute their time and expertise. It also makes it easy to onboard new volunteers, as we did with a second round of contributors, and enables them to quickly understand the projects already conducted and reuse parts of them for their own endeavors (thanks to copy/pasting steps of the flow and duplicating projects!).
3. Adopt a data-driven approach
As we were able to conduct our first data science projects in Dataiku in a short time, and already show an impact on the organization, we’re planting the seeds of a data science culture at Cascade Bicycle Club - and laying the foundation for further engagement by staff and future groups of volunteers. This project becomes a template that can be reproduced by others wishing to leverage data science at the scale of a nonprofit organization.