HDFS - Force Parquet as default settings for recipe output

Solved!
Charly
Level 2
Greetings!

I'm currently on a platform running Dataiku 11.3.1 and writing datasets to HDFS. IT requires all datasets to be written in Parquet, but the default format for recipe outputs is CSV (Hive), which can cause errors.

Is there a way to configure the connection so that the default format is Parquet?

Best regards,

AlexT
Dataiker

Hi @Charly ,
You can configure the instance-level preferred format under Administration -> "Prefered storage formats" by placing PARQUET_HIVE as the first option.

This can also be controlled at the project level by overriding the global dataset creation settings.
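
Changing the preferred format affects newly created datasets, so it may help to check which existing HDFS datasets are still stored as CSV. Below is a minimal Python sketch of such a check. It assumes each dataset definition is available as a plain dict with `name`, `type`, and `formatType` keys, mirroring the raw dataset JSON; the helper name `non_parquet_hdfs_datasets` and the sample data are hypothetical and not part of Dataiku's API.

```python
def non_parquet_hdfs_datasets(definitions):
    """Return the names of HDFS datasets whose format is not Parquet.

    `definitions` is an iterable of dicts. The "name", "type" and
    "formatType" keys are an assumption about the raw dataset JSON.
    """
    return [
        d.get("name")
        for d in definitions
        if d.get("type") == "HDFS" and d.get("formatType") != "parquet"
    ]


# Hypothetical sample data, for illustration only
sample = [
    {"name": "orders", "type": "HDFS", "formatType": "csv"},
    {"name": "customers", "type": "HDFS", "formatType": "parquet"},
    {"name": "uploads", "type": "UploadedFiles", "formatType": "csv"},
]

print(non_parquet_hdfs_datasets(sample))
```

On a real instance the dicts would have to come from the dataset definitions themselves, for example through the Dataiku public API client; check the API documentation for the exact calls and key names before relying on this.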

Charly
Level 2
Author

Thanks @AlexT, I was searching in the individual connection settings.

Have a nice day!
