Hi
Currently, when we write to the Dataiku file system, we only have the CSV and Avro formats available.
How can I enable the Parquet format in Dataiku DSS running on Linux on an EC2 instance?
I need the steps for that. Also, we don't have any HDFS connection set up.
Regards,
Ankur.
Hi Ankur,
To support Parquet files on a non-Hadoop install, you will need to install the Hadoop integration with the standalone libraries. Please review https://doc.dataiku.com/dss/latest/connecting/formats/parquet.html#applicability to see the restrictions that apply to Parquet.
The steps to install are described here:
https://doc.dataiku.com/dss/latest/containers/setup-k8s.html#optional-setup-spark
Download the standalone libs (if you are on a different version, change the version in the URL): https://downloads.dataiku.com/public/studio/9.0.5/dataiku-dss-hadoop-standalone-libs-generic-hadoop3...
Then run the following from the DSS data directory:
./bin/dssadmin install-hadoop-integration -standaloneArchive /PATH/TO/dataiku-dss-hadoop3-standalone-libs-generic...tar.gz
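For reference, here is a minimal sketch of how the whole sequence might look as a script. It is not an official procedure: `DSS_DATADIR`, `ARCHIVE`, `DRY_RUN`, and the helper names `run` and `install_parquet_support` are placeholders introduced for illustration, and stopping and restarting DSS around the install is an assumption you should check against the documentation for your version. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: install the standalone Hadoop libs into an existing DSS data directory.
# DSS_DATADIR and ARCHIVE are placeholders (assumptions); point them at your own paths.
set -eu

DSS_DATADIR="${DSS_DATADIR:-/PATH/TO/DSS_DATADIR}"
ARCHIVE="${ARCHIVE:-/PATH/TO/standalone-libs-archive.tar.gz}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

install_parquet_support() {
  run cd "$DSS_DATADIR"
  # Assumption: DSS is stopped while the integration is installed.
  run ./bin/dss stop
  # Same command as in the step above, with the archive path filled in.
  run ./bin/dssadmin install-hadoop-integration -standaloneArchive "$ARCHIVE"
  run ./bin/dss start
}

install_parquet_support
```

Run it once as-is to review the printed commands, then set `DRY_RUN=0` along with your real `DSS_DATADIR` and `ARCHIVE` values to execute them. After DSS is back up, Parquet should be selectable as a format, subject to the restrictions in the applicability link above.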
Let me know if you have any issues.
Thanks @AlexT for the prompt response. I will follow the steps you mentioned and accept this as the solution once I am able to configure the Parquet format.
Thank you.
Regards,
Ankur.