ImportError: No module named bs4?

Herve
Herve Partner, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 58 Partner

While trying out your "Scrape the web" Python code sample, I got this error:

"ImportError: No module named bs4"

Answers

  • Mattsco
    Mattsco Dataiker, Registered Posts: 125 Dataiker

    Hi Hervé,

    You need to install the beautifulsoup4 Python package, which provides the bs4 module.
    You can do this in the code env section of the administration page.
    In your Python code env, under "packages to install", add:
    beautifulsoup4

  • Herve
    edited July 17

    Adding beautifulsoup4 to the Python code env fixed the import.

    The code sample then failed at this line:

    soup = BeautifulSoup(page, 'html5lib')

    FeatureNotFound: Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?
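
This error means BeautifulSoup was asked for a parser that is not installed in the code env. As a minimal sketch (not part of the original code sample), the parser name could be chosen from whatever is installed, falling back to Python's built-in "html.parser". The helper name pick_parser is hypothetical, and the sketch assumes a Python 3 code env. Note that the parsers can build slightly different trees for malformed HTML, so the fallback may not behave identically.

```python
import importlib.util


def pick_parser(preferred=("html5lib", "lxml")):
    """Return the first parser name whose module is installed.

    Hypothetical helper: falls back to "html.parser", which ships
    with the Python standard library and needs no extra package.
    """
    for name in preferred:
        # find_spec returns None when the top-level module cannot be found
        if importlib.util.find_spec(name) is not None:
            return name
    return "html.parser"


# Intended use in the scraping code (requires beautifulsoup4):
#   soup = BeautifulSoup(page, pick_parser())
```

Installing html5lib in the code env remains the direct fix if the sample depends on html5lib's parsing behavior.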
  • Herve

    I also added html5lib and lxml to my Python code env and verified that they import with:

    import html5lib
    import lxml
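
The same check can be done without triggering an ImportError on the first missing package. The sketch below assumes a Python 3 code env; the function name missing_modules is made up for illustration. It lists every module the current interpreter cannot find, which also helps confirm that the notebook or recipe is really running in the code env where the packages were added.

```python
import importlib.util


def missing_modules(names=("bs4", "html5lib", "lxml")):
    """Return the module names the current interpreter cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```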
