
NLP Used for Prediction - Watch on Demand

In this online event, Katie Gross (Lead Data Scientist, Dataiku) walked through a project built in Dataiku DSS using NLP on data from popular cooking sites, to predict whether a recipe is likely to be highly rated.


Presentation abstract:
Want to judge whether your recipe will be a hit? Or in general, what user-generated content is likely to lead to high engagement? We developed a workflow in Dataiku DSS that uses NLP to predict which recipes are likely to be highly rated. We’ll walk you through how we webscraped text recipes from popular recipe-sharing sites like Allrecipes and Epicurious, cleansed and prepared the data, and built a machine learning model to predict ratings of future recipes.
What's discussed:
  • NLP is a field of AI that enables machines to read, understand, and derive meaning from human languages.
  • Using a text featurization pipeline to convert text into features for a machine learning model. This includes a Preprocess Text step (normalize, remove stop words, stem, and tokenize), so that "I was running to the river and jumped over a log" becomes ["i", "run", "river", "jump", "log"].
  • Vectorizing the text (converting to numeric features) utilizing either Count Vectorization or Term Frequency-Inverse Document Frequency (TF-IDF).
  • Deep dive of recipe reviews in Dataiku DSS.
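
To make the preprocessing and vectorization steps above concrete, here is a minimal pure-Python sketch. It is not how Dataiku DSS implements its text handling. The stop-word list, the `naive_stem` helper, and the function names are all assumptions made for illustration, and the TF-IDF formula is the plain textbook form (term frequency times log(N / document frequency)); libraries usually add smoothing and normalization on top of it.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
# "i" is deliberately left out so the result matches the example from the talk.
STOP_WORDS = {"was", "to", "the", "and", "over", "a"}


def naive_stem(token):
    """Rough suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            stem = token[: -len(suffix)]
            # Undo consonant doubling, e.g. "runn" -> "run".
            if stem[-1] == stem[-2] and stem[-1] not in "lsz":
                stem = stem[:-1]
            return stem
    return token


def preprocess(text):
    """Normalize (lowercase), tokenize, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def count_vectors(docs):
    """Count vectorization: one column per vocabulary term, raw counts."""
    vocab = sorted({term for doc in docs for term in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf_vectors(docs):
    """Textbook TF-IDF: (count / doc length) * log(N / document frequency)."""
    vocab, rows = count_vectors(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    weighted = []
    for row in rows:
        length = max(sum(row), 1)  # guard against empty documents
        weighted.append([(c / length) * w for c, w in zip(row, idf)])
    return vocab, weighted


docs = [
    preprocess("I was running to the river and jumped over a log"),
    preprocess("The river was running fast"),
]
print(docs[0])  # ['i', 'run', 'river', 'jump', 'log']

vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)
print(vocab)
print(counts)
print(tfidf)
```

Terms that appear in every document (here "river" and "run") get a TF-IDF weight of zero, while count vectorization keeps them at their raw counts. That difference is the main reason to pick one over the other.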
Katie Gross is a Data Scientist at Dataiku, where she helps clients across industries develop AI solutions using Dataiku DSS. Previously, she worked as a data scientist at a marketing science firm, Schireson, and spent several months as a freelance data scientist. Prior to her data science life, Katie spent three years as a CPG consultant at Nielsen. Katie holds a BA in Economics from Colgate University.
Any questions on the presentation? Resources to share on NLP? Feel free to continue the discussion below! 
Dataiker Alumni

If you are interested in additional NLP resources, check these out:


Want to share any additional NLP resources? Comment below!

Level 1

Details are very good to understand.

Level 1

Helpful session