
Comma Delimited Files issue

nshapir2
Level 2

I have a .txt file in an S3 connection that is being parsed in Excel style. There are 7 columns. DSS is creating extra columns based on the data rather than on the column names, because some entries contain commas. This is odd because I have other data where this is NOT happening. How can I fix this?

 

 

Turribeach

A CSV file is a comma-separated values file. If your fields contain commas, it’s normal for a process loading the file to assume each comma starts another field. So you either remove the commas from all your data columns or enclose the fields in double quotes, which will prevent DSS from treating the commas inside a field as field separators.
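To illustrate the double-quote option, here is a minimal sketch using Python's standard `csv` module, outside of DSS. The column names and sample values are made up for the example. It writes rows so that any field containing a comma is enclosed in double quotes, then reads the text back to confirm the column count stays the same.

```python
import csv
import io

# Hypothetical sample data: the "name" field of the first row contains a comma.
header = ["id", "name", "city"]
rows = [
    ["1", "Smith, John", "Boston"],
    ["2", "Jane Doe", "Chicago"],
]

# QUOTE_MINIMAL encloses a field in double quotes only when it needs it,
# for example when it contains the delimiter.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
quoted_text = buffer.getvalue()
print(quoted_text)

# Reading the quoted text back keeps "Smith, John" as a single field.
parsed = list(csv.reader(io.StringIO(quoted_text)))
print(parsed[1])
```

If the file comes from another system, the same idea applies at the export step: ask for fields to be quoted, or pick a delimiter that never appears in the data.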

nshapir2
Level 2
Author

Isn't it typical for DSS to parse the header row, use that to determine the columns, and then leave the body alone?


Different programs parse CSVs in different ways. What is not typical is to have commas in CSV fields when the fields are not enclosed in double quotes (or some other quote character).
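To see which rows are affected, a rough check outside of DSS is to count the fields on each line and compare with the header. This is only a sketch with Python's standard `csv` module; the sample text and the `find_bad_rows` helper are hypothetical, not part of any Dataiku API.

```python
import csv
import io

# Hypothetical file content: line 2 has an unquoted comma inside the name field.
raw_text = "id,name,city\n1,Smith, John,Boston\n2,Jane Doe,Chicago\n"


def find_bad_rows(text, delimiter=","):
    """Return (line_number, field_count) for rows whose field count differs from the header."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    expected = len(header)
    bad = []
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        if len(row) != expected:
            bad.append((line_number, len(row)))
    return bad


bad_rows = find_bad_rows(raw_text)
print(bad_rows)
```

Rows reported here are the ones where an unquoted comma adds an extra field. Files where this does not happen either have no commas in the data or already quote such fields. Note that line numbers are approximate if quoted fields contain line breaks.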
