
Spark and HDFS Integration


Hi All,  

Thanks for the great support.

I would like to install DSS on a separate server where Spark and HDFS are not installed. Is it possible to integrate DSS with a remote server where Spark and HDFS are installed? And is it possible to submit Spark jobs remotely?

 

Thanks 

Kind regards

Bader

2 Replies
Dataiker

Hi,

I'm assuming you're talking about "regular Hadoop" here (i.e. Cloudera / Hortonworks / MapR / EMR / Dataproc).

Even if your machine is not itself part of the cluster, it still needs the client libraries, binaries and configuration files installed locally in order to talk to the cluster. It is not possible to submit jobs to a completely separate cluster without anything installed locally.
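As a rough illustration of what "installed locally" can mean in practice, here is a minimal Python sketch that checks a machine for the usual Hadoop and Spark client pieces. The binary names (`hadoop`, `spark-submit`), the `HADOOP_CONF_DIR` variable and the three `*-site.xml` files are the conventional ones for a YARN-based Hadoop client; they are assumptions for illustration, not an official DSS requirements list, and your distribution may need more (or different) files. The `check_client_setup` helper is hypothetical.

```python
import os
import shutil
from pathlib import Path

# Assumption: conventional Hadoop/Spark client binaries and config files.
# This is not a list taken from the DSS documentation.
REQUIRED_BINARIES = ("hadoop", "spark-submit")
REQUIRED_CONFIG_FILES = ("core-site.xml", "hdfs-site.xml", "yarn-site.xml")


def check_client_setup(env=None, which=shutil.which):
    """Return a list of problems found; an empty list means nothing obvious is missing."""
    env = os.environ if env is None else env
    problems = []

    # The client binaries must be reachable on PATH.
    for binary in REQUIRED_BINARIES:
        if which(binary) is None:
            problems.append(f"binary not on PATH: {binary}")

    # The cluster configuration must be available locally.
    conf_dir = env.get("HADOOP_CONF_DIR")
    if not conf_dir:
        problems.append("HADOOP_CONF_DIR is not set")
    else:
        for name in REQUIRED_CONFIG_FILES:
            path = Path(conf_dir) / name
            if not path.is_file():
                problems.append(f"missing config file: {path}")

    return problems


if __name__ == "__main__":
    issues = check_client_setup()
    for issue in issues:
        print(issue)
    if not issues:
        print("Hadoop/Spark client binaries and configuration files were found.")
```

A clean result only means the files and binaries are present. It does not prove that the machine can actually reach the cluster or that the versions match.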

Author

Could you please list the configuration and libraries required to run the DSS server separately from the cluster?
