Hi, I have been trying to create a Python recipe that scrapes the web with Selenium, but I keep running into this error:
I have added the chromedriver as an input to the code and these are the libraries in my environment:
selenium
chromedriver-py
webdriver-manager
chromedriver-binary-auto
chromedriver-binary
This is the code I have been using:
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# webdriver-manager raises a permissions error here, so it is left commented out
#from webdriver_manager.chrome import ChromeDriverManager
#driver = webdriver.Chrome(ChromeDriverManager().install())
# Read recipe inputs
chrome_driver = dataiku.Folder("a4dI41U2")
chrome_driver_info = chrome_driver.get_info()
driver_path = chrome_driver.get_path() + '/chromedriver'
# Compute recipe outputs
chrome_options = Options()
chrome_options.binary_location = driver_path
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--headless')
chrome_options.add_argument("--disable-dev-shm-usage")
chrome_options.add_argument("--disable-extensions")
driver = webdriver.Chrome(executable_path=driver_path, chrome_options=chrome_options)
driver.quit()
What should I do to solve the error?
Thanks in advance!
Hi,
Can you confirm which versions of google-chrome and chromedriver are installed, and which OS you are running?
google-chrome --version && which google-chrome
cat /etc/*release
Most likely the chromedriver and google-chrome versions are not compatible, or google-chrome was not installed correctly. The error indicates a problem starting google-chrome in headless mode.
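As a rough way to check compatibility: chromedriver and Chrome generally need to share the same major version. The sketch below only compares the text printed by the two `--version` commands; the helper names and the sample strings are illustrative assumptions, not output from the machine in question.

```python
import re
from typing import Optional


def major_version(version_output: str) -> Optional[int]:
    """Return the major version found in `--version` output, or None."""
    # Matches the first dotted version number, e.g. "114.0.5735.90".
    match = re.search(r"(\d+)\.\d+\.\d+(?:\.\d+)?", version_output)
    return int(match.group(1)) if match else None


def versions_compatible(chrome_output: str, driver_output: str) -> bool:
    """True when both strings carry a version and the major numbers match."""
    chrome_major = major_version(chrome_output)
    driver_major = major_version(driver_output)
    return chrome_major is not None and chrome_major == driver_major


# Illustrative strings only (hypothetical versions).
print(versions_compatible(
    "Google Chrome 114.0.5735.90",
    "ChromeDriver 114.0.5735.90 (abc123)",
))
```

If the two major versions differ, installing a chromedriver build that matches the installed Chrome is usually the first thing to try.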
Your code seems fine and is similar to the example I've tested before here:
https://community.dataiku.com/t5/Using-Dataiku/Using-selenium-with-a-python-recipe/m-p/19661
Thanks
I think I probably did not install Google Chrome! I want to deploy the web scraper with Dataiku, so where should I install Google Chrome on the Dataiku side?
Hi,
You can install google-chrome directly on the DSS instance. The only real requirement is that it is available on the PATH for DSS.
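A quick way to see whether the browser is visible on the PATH from inside a Python recipe is `shutil.which`. This is a minimal sketch; the list of executable names is an assumption, since the actual name depends on the package that was installed.

```python
import shutil
from typing import Iterable, Optional

# Assumed executable names; adjust to whatever the installed package provides.
DEFAULT_CANDIDATES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium-browser",
    "chromium",
)


def find_browser(candidates: Iterable[str] = DEFAULT_CANDIDATES) -> Optional[str]:
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


browser_path = find_browser()
print(browser_path or "No Chrome executable found on PATH")
```

Running this inside the recipe rather than in a separate shell matters, because the recipe sees the PATH of the DSS user.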
Note that if you plan to use containerized execution, you will need to add google-chrome to the base image: https://doc.dataiku.com/dss/latest/containers/custom-base-images.html#add-a-dockerfile-fragment
The following steps should work (for CentOS/RHEL):
wget https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm
sudo yum install ./google-chrome-stable_current_*.rpm
Check the installation by running google-chrome --version as the DSS user.
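Once Chrome is installed, the driver setup from the question could be sketched as below for Selenium 4, where the driver path goes through a `Service` object and the options through `options=` (the `executable_path=` and `chrome_options=` keywords are deprecated there). One thing worth double-checking: `binary_location` refers to the Chrome browser binary, not to chromedriver, and can be left unset when google-chrome is on the PATH. The function names and the optional `browser_path` argument are illustrative assumptions.

```python
from typing import List, Optional


def chrome_flags(headless: bool = True) -> List[str]:
    """Command-line flags used in the recipe from the question."""
    flags = ["--no-sandbox", "--disable-dev-shm-usage", "--disable-extensions"]
    if headless:
        flags.append("--headless")
    return flags


def start_chrome(driver_path: str, browser_path: Optional[str] = None):
    """Start Chrome through chromedriver (sketch, assumes Selenium 4)."""
    # Imported here so chrome_flags() stays usable where Selenium is absent.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    if browser_path:
        # Path to the Chrome browser itself, not to chromedriver.
        options.binary_location = browser_path
    for flag in chrome_flags():
        options.add_argument(flag)
    return webdriver.Chrome(
        service=Service(executable_path=driver_path),
        options=options,
    )
```

In the recipe, `driver = start_chrome(driver_path)` followed by `driver.quit()` would stand in for the original `webdriver.Chrome(...)` call.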
Thanks