Saving a Spark dataframe as Parquet using standalone Spark on a local server

Hi Team,
My Dataiku server has not been integrated with a Hadoop cluster, but I have standalone Spark installed on the DSS server. When creating a new dataset, the only file format available to me is CSV. I would like to know whether it is possible to save my datasets as Parquet on my local DSS server.
Best Answer
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,329 Neuron
My bad, this is supported, but only on HDFS, S3, GCS and Azure Blob Storage connections. You are using a File System connection, which is not supported.
Answers
-
Turribeach
You can save datasets as Parquet, but you will need to handle them manually using Dataiku managed folders. In other words, if you want to use Dataiku datasets in your flow, you are stuck with the Dataiku format.
-
Hey Turribeach, thanks for your response. But I am not talking about the Dataiku view; it is about the file format in which I can save the dataframe on my local server. Right now I can see only CSV as an option (screenshot provided), but I believe Parquet can also be used.
Let me know if we can enable Parquet in this option as well.