Enabling Parquet format in Dataiku DSS

Ankur30 Partner, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer Posts: 40 Partner

Hi

Currently, when we write to the Dataiku file system, we only have the CSV and Avro formats available.

How can I enable the Parquet format in Dataiku DSS running on Linux on an EC2 instance?

I need the steps for that. Also, we don't have any HDFS connection set up.

Regards,

Ankur.

Best Answer

Answers

  • Ankur30
    Ankur30 Partner, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer Posts: 40 Partner

    Thanks @AlexT for the prompt response. I will follow the steps you mentioned above and accept it as the solution once I am able to configure the Parquet format.

    Thank you.

    Regards,

    Ankur.

  • somepunter
    somepunter Registered Posts: 20 ✭✭✭

    Thanks for this.

    @Ankur30 what did you use as the storage option? S3?

    The documentation mentions:

    Parquet datasets can be stored on the following cloud storage and Hadoop connections: HDFS, S3, GCS, Azure Blob storage

    but @AlexT, I'm curious whether it can be written to local / network filesystems.
