Dataiku + Spark on Blob Datasets
yashpuranik
Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Neuron 2023 Posts: 69 Neuron
Hi Folks,
I have a question about this documentation page: https://doc.dataiku.com/dss/latest/spark/datasets.html#other. It lists HDFS and S3 as better suited for Spark computation, but it does not mention Azure Blob Storage. Is this a case of incomplete documentation, or is Dataiku still working on Spark support for Azure Blob Storage?
Yash
Best Answer
Alexandru
Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,352 Dataiker
Hi,
The documentation has not been updated. Spark in Dataiku does work with Azure Blob Storage.
You will need to set the HDFS interface in the connection settings of the Azure Blob connection.
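For context, an HDFS interface means Spark reads the container directly through a Hadoop filesystem URI. Below is a minimal sketch of what such URIs look like with the standard hadoop-azure schemes (wasb/wasbs on the blob endpoint, abfs/abfss on the dfs endpoint). The helper name, container, and account are made up for illustration, and this is not necessarily how DSS builds paths internally.

```python
# Hypothetical helper: builds a Hadoop-style URI for an Azure storage path.
# The endpoint suffixes follow the hadoop-azure conventions.
ENDPOINTS = {
    "wasb": "blob",   # Azure Blob Storage driver
    "wasbs": "blob",  # same, over TLS
    "abfs": "dfs",    # ADLS Gen2 driver
    "abfss": "dfs",   # same, over TLS
}


def azure_hadoop_uri(container, account, path, scheme="abfs"):
    """Return the URI Spark would use to reach `path` in `container`."""
    if scheme not in ENDPOINTS:
        raise ValueError(f"unsupported scheme: {scheme}")
    host = f"{account}.{ENDPOINTS[scheme]}.core.windows.net"
    return f"{scheme}://{container}@{host}/{path.lstrip('/')}"


# Placeholder names, not a real account
uri = azure_hadoop_uri("mycontainer", "myaccount", "/data/events")
print(uri)
```

In a Spark job outside DSS, a URI like this is what you would pass to something like `spark.read.parquet(uri)`, assuming the hadoop-azure jars and storage credentials are configured. Inside DSS, the connection setting should take care of this for you.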
Thanks,