Thanks to all of you who joined us for the first event of the London User Group! Josh Cooper (Data Scientist at Dataiku) presented an image classification project. Users also had the opportunity to connect with each other and to quiz Josh on the intricacies of the project.
With today’s abundance of recyclable materials, it’s more important than ever to make sure that we’re getting rid of our waste responsibly. We took a look at a Dataiku DSS project that tries to help with that.
Josh showed how to retrain a pre-trained image classification model on your own data, and how to get this model to identify different recyclable materials on the fly, through a laptop camera, using a custom webapp.
Here is the recording for those of you who couldn't make the event!
Be sure to join the London User Group to be informed of upcoming events and chat with fellow DSS users based in London!
As you may know, Dataiku User Groups are led by volunteer users who contribute their time and communication skills to enable fellow users to learn from each other. If you’d like to run this group, please fill out this quick form and we’ll get back to you. Thanks for your interest!
Congratulations on a wonderful regional user group meeting.
I’d love to have an opportunity to play around with this project. Are you able to export this project and share the project in a way that I could reproduce this in my own environment?
How long does retraining take? I see you are running the flow presentation from a Macintosh. Many transfer learning projects work best with Nvidia GPUs, and Macintosh computers don’t play nicely with Nvidia. I’m wondering where and how you are retraining.
Hi @tgb417 - it's great to hear you found it useful!
I've attached the project below, but unfortunately I wasn't able to bundle either the models or the training images due to copyright restrictions. To reproduce it, you'd need to do the following:
- Populate the training set with images you find of the relevant materials.
- Download the image recognition model of your choice using the plugin macro.
- Deploy the API endpoint, which is written in Python in the API Designer inside the project, to an API node, and edit the webapp to point to this endpoint.
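Once the endpoint is deployed, it can help to check it from a small script before wiring up the webapp. The following is only a minimal sketch under stated assumptions: the host, port, service ID, endpoint ID, and the `image_b64` parameter name are hypothetical placeholders rather than values from the shared project, and the `/public/api/v1/<service>/<endpoint>/run` URL pattern is assumed for a Python function endpoint, so please check it against your own API node.

```python
import base64
import json
import urllib.request

# Hypothetical placeholders: replace with the values of your own API node deployment.
API_NODE_URL = "http://localhost:12000"
SERVICE_ID = "recycling"
ENDPOINT_ID = "classify"


def build_request(image_bytes, base_url=API_NODE_URL,
                  service_id=SERVICE_ID, endpoint_id=ENDPOINT_ID):
    """Build a POST request carrying one base64-encoded image."""
    # Assumed URL pattern for a Python function endpoint; verify it for your setup.
    url = f"{base_url}/public/api/v1/{service_id}/{endpoint_id}/run"
    # "image_b64" is an assumed parameter name; it must match the endpoint's function signature.
    payload = {"image_b64": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(image_bytes, timeout=10):
    """Send the image to the endpoint and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(image_bytes), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `classify(open("bottle.jpg", "rb").read())` (with a hypothetical local image) would then send one frame to the endpoint, which is roughly what the webapp does with each camera capture.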
I retrained the model for a few epochs on my MacBook, which took about 20 minutes, but there is also a GPU version of the plugin that can leverage the parallel computation you'd get with a GPU.