
Web Scraping with Dataiku

MichaelG
Community Manager

In this session, Matt showed how he automates web scraping. He walked through three approaches: finding an API that serves the data, downloading a web page and extracting its content, and simulating browser navigation. He also shared code samples using Python packages such as requests, BeautifulSoup, and Selenium.
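The first two approaches (calling an API directly, or downloading the page itself) might look roughly like the sketch below with requests. This is not the code shown in the session: the function names `fetch_json` and `fetch_html` and the optional `session` parameter are assumptions made for illustration, and any URL you pass is your own.

```python
import requests


def fetch_json(url, params=None, session=None, timeout=10):
    """Approach 1: call the API that feeds the page and return the parsed JSON."""
    http = session or requests.Session()
    response = http.get(url, params=params, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.json()


def fetch_html(url, session=None, timeout=10):
    """Approach 2: download the raw HTML so it can be parsed afterwards."""
    http = session or requests.Session()
    response = http.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text
```

Passing the session in makes it possible to reuse one connection (and its cookies or headers) across many calls, and to swap in a stub when testing without network access.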

Hosted by Matthieu Scordia: A data scientist at Dataiku for the last 8 years, my goal is to help our customers build advanced data science projects. Now based in Singapore, I cover the whole APAC region, implementing data science projects with Dataiku DSS.

 

3 Replies
ArielCopeland
Level 1

Any tips on data cleansing after scraping?

arjun
Level 1

Use BeautifulSoup to load the page and extract the information you want by selecting elements by their class names.
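A minimal sketch of that idea, assuming the page marks each record with a `product` class and its fields with `name` and `price` classes. The sample HTML and the `extract_products` helper are made up for illustration; real pages will use different tags and class names.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span><span class="price">$ 9.99</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">$ 19.50</span></li>
</ul>
"""


def extract_products(html):
    """Return one dict per <li> element carrying the class "product"."""
    soup = BeautifulSoup(html, "html.parser")  # parser bundled with Python
    rows = []
    for item in soup.find_all("li", class_="product"):
        name = item.find("span", class_="name")
        price = item.find("span", class_="price")
        rows.append({
            # Guard against missing fields so one odd record does not crash the run.
            "name": name.get_text(strip=True) if name else None,
            "price": price.get_text(strip=True) if price else None,
        })
    return rows


products = extract_products(SAMPLE_HTML)
```

The extracted values are still raw strings at this point, so they usually need a cleansing step before analysis.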

PeterObrien
Level 1

Ah, got it! Even though it's been a couple of years since Matt's session, I'm sure his insights are still super relevant in the ever-evolving world of data science. Thanks for sharing, and for anyone looking to dive deeper into data cleansing, this resource might still come in handy: https://www.nannostomus.com/data-wrangling/data-cleansing/. It's a game-changer!
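For a concrete starting point on cleansing scraped strings, here is a small sketch using only the Python standard library. The helpers `clean_text` and `parse_price` are hypothetical names, and the price parsing is deliberately simple: it assumes a single decimal separator and does not handle thousands separators.

```python
import html
import re
import unicodedata


def clean_text(raw):
    """Decode HTML entities, normalize Unicode, and collapse whitespace."""
    text = html.unescape(raw)
    text = unicodedata.normalize("NFKC", text)  # e.g. non-breaking space becomes a regular space
    return re.sub(r"\s+", " ", text).strip()


def parse_price(raw):
    """Extract a float from a price string; return None when no number is found.

    Assumes one decimal separator ("." or ","); "1,299.00" is NOT handled.
    """
    match = re.search(r"\d+(?:[.,]\d+)?", clean_text(raw))
    if match is None:
        return None
    return float(match.group().replace(",", "."))
```

Returning None for unparseable values keeps bad records visible, so they can be counted or filtered in a later preparation step rather than silently dropped.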
