
Need more dataset file formats, such as Parquet, when exporting datasets to the local system

To unit test the final and intermediate datasets, we need more file formats, such as Parquet, Avro, sas7bdat, and ORC, when exporting large datasets to the local system, as the CSV format can't handle more than 1 million records.

5 Comments
natejgardner
Level 6

It would also be great if more file formats were allowed for import.

Microsoft Access, SQLite, and edb come to mind as most frequently needed.  

PANKAJ
Level 3

@natejgardner 

Yes, I agree with you. More file format options for importing datasets would also be very helpful.

AshleyW
Dataiker

Hi @natejgardner ,

FYI, Microsoft Access and SQLite are supported file formats for importing data into DSS. I've provided links to the relevant reference documentation and community articles. If there are file formats that we don't support yet that you'd like to see made available in DSS, feel free to add that as a separate post on the Product Ideas board.

Best,

Ashley

CoreyS
Community Manager
Status changed to: Needs Info
 
natejgardner
Level 6

Thanks @AshleyW. Unfortunately, these approaches require the files to already be exposed on the network or manually uploaded to the Dataiku server. Most teams I've worked with that generate these files just send them as attachments. Ideally, they could be uploaded and processed as true flat files, the same way Excel and CSV files are. Even when teams do upload their flat-file databases to a network location, if it's a Windows file share and the Dataiku instance doesn't have SAML authentication configured, there's no way to authenticate. It would be a big time saver if the drivers that convert flat-file databases into SQL connections were also embedded directly into the file parsing system, so that Access and SQLite become supported as file formats as well.
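
To illustrate what treating SQLite as a flat file could look like, here is a minimal sketch using only the Python standard library. It assumes the .sqlite file has already landed somewhere the code can read it (for example, an uploaded attachment). `read_sqlite_tables` is a hypothetical helper name, not part of DSS or its API, and this is not how DSS parses files internally.

```python
import sqlite3
from pathlib import Path


def read_sqlite_tables(db_path):
    """Hypothetical helper: return {table_name: [row dicts]} for each user table."""
    path = Path(db_path)
    if not path.is_file():
        raise FileNotFoundError(db_path)

    # Open the file read-only so the received attachment is never modified.
    uri = path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    conn.row_factory = sqlite3.Row
    try:
        cursor = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        # Skip SQLite's internal bookkeeping tables (e.g. sqlite_sequence).
        names = [r["name"] for r in cursor if not r["name"].startswith("sqlite_")]

        tables = {}
        for name in names:
            # Quote the identifier so unusual table names do not break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            rows = conn.execute(f"SELECT * FROM {quoted}").fetchall()
            tables[name] = [dict(row) for row in rows]
        return tables
    finally:
        conn.close()
```

Each table comes back as a list of plain dictionaries, which could then be written out like any other tabular file. For large files this loads everything into memory, so a real implementation would probably stream rows instead.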