Spark Setup

Alwaleed
Level 1
Hi all,

I need to use SparkSQL and Spark for Python. I installed Spark, and it is shown in the administration settings.

But when I run SparkSQL, it raises this error:

 

Cannot run program "spark-submit" (in directory "/data/design/jobs/DC

 

Can anyone help, or point me to an article on the configuration to follow?

Thanks in advance.

 

1 Reply
CatalinaS
Dataiker

Hi,

 

This issue usually occurs when the Spark integration was not re-run after a recent upgrade, so DSS can't find spark-submit.

Please re-run the Spark integration using the standalone archive for your DSS version, downloaded from the Dataiku DSS download site:

https://downloads.dataiku.com/public/studio/12.2.0/dataiku-dss-spark-standalone-12.2.0-3.4.1-generic...

You can run the Spark integration with the following command, using the archive that matches your DSS version:

/data/bin/dssadmin install-spark-integration -standaloneArchive PATH_TO/dataiku-dss-spark-standalone-12.2.0-3.4.1-generic-hadoop3.tar.gz

This is explained in the Setup Spark documentation page.
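
As a quick sanity check, a minimal sketch like the one below reports whether a program can be resolved from the current shell's PATH. The helper name check_program is hypothetical, and it is an assumption that the shell's PATH is relevant at all: DSS may locate spark-submit through its own Spark integration settings rather than PATH, so a "missing" result here is only a hint, not a diagnosis.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSS): report whether a program
# can be resolved on the PATH of the current shell user.
check_program() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $(command -v "$1")"
    else
        echo "missing: $1"
        return 1
    fi
}

# Assumption: run this as the same Unix user that runs DSS.
check_program spark-submit || echo "spark-submit is not on PATH for this user"
```

If the check reports spark-submit as missing, re-running the Spark integration as described above is still the recommended fix.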
