Dataiku + Spark on Blob Datasets
yashpuranik
Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Neuron 2023 Posts: 69 Neuron
Hi Folks,
I have a question about this page: https://doc.dataiku.com/dss/latest/spark/datasets.html#other. It mentions HDFS and S3 as better suited for Spark computation, but Azure Blob Storage is not listed. Is this a case of incomplete documentation, or is Dataiku still working on implementing support for Spark + Azure Blob Storage?
Yash
Best Answer
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi,
The documentation is out of date. Dataiku does work with Spark on Azure Blob Storage.
You will need to set the HDFS interface in the connection settings of the Azure Blob connection.
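For context, the HDFS interface is what lets Spark reach the container through a Hadoop filesystem driver rather than through a plain blob client. As a rough sketch only (this is not Dataiku code; the helper name `azure_hadoop_uri` and the account and container names are hypothetical), the two Hadoop URI forms commonly used for Azure storage look like this:

```python
def azure_hadoop_uri(scheme, container, account, path=""):
    """Build a Hadoop-style URI for an Azure storage container.

    Illustrative helper, not part of Dataiku or Spark.
    "wasbs" is the Hadoop WASB driver over TLS, "abfss" the ABFS driver over TLS.
    """
    hosts = {
        "wasbs": "blob.core.windows.net",
        "abfss": "dfs.core.windows.net",
    }
    if scheme not in hosts:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    # Drop any leading slash so the result has exactly one slash after the host.
    return f"{scheme}://{container}@{account}.{hosts[scheme]}/{path.lstrip('/')}"


# Hypothetical names, for illustration only.
print(azure_hadoop_uri("abfss", "mycontainer", "myaccount", "data/events"))
print(azure_hadoop_uri("wasbs", "mycontainer", "myaccount", "data/events"))
```

With the interface set on the connection, these paths should not need to be built by hand; the sketch only shows the kind of location Spark ends up reading from.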
Thanks,