Comma Delimited Files issue

Noah
Noah Registered Posts: 43 ✭✭✭✭

I have a .txt file on an S3 connection that is being parsed with the Excel-style CSV setting. It has 7 columns. DSS is creating extra columns from the data rows, beyond those named in the header, because some entries contain commas. This is odd because I have other data where this is NOT happening. How can I fix this?

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,024 Neuron

    A CSV file is a comma-separated values file. If your fields contain commas, it's normal for a process loading the file to treat each comma as the start of another field. So you either remove the commas from all your data columns or enclose the affected fields in double quotes, which will prevent DSS from treating the commas inside a field as field separators.

  • Noah
    Noah Registered Posts: 43 ✭✭✭✭

    Isn't it typical for DSS to parse the header row, use that to determine the columns, and then leave the body alone?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,024 Neuron

    Different programs parse CSVs in different ways. What is not typical is to have commas in CSV fields when the fields don't use double quotes (or any other character) to enclose the data.
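
To illustrate the point about quoting, here is a minimal sketch that uses Python's standard `csv` module as a stand-in for an Excel-style CSV parser. It is not DSS's own parsing code, and the sample rows and column names are made up for illustration. It shows the same record parsed without and with double quotes around the field that contains a comma, then writes the record back out so that only fields that need quoting are quoted.

```python
import csv
import io

# Hypothetical 7-column sample; names and values are illustrative only.
header = "id,name,city,state,zip,amount,notes"
unquoted_row = "1,Acme, Inc.,Austin,TX,78701,10.50,ok"
quoted_row = '1,"Acme, Inc.",Austin,TX,78701,10.50,ok'


def parse_line(line):
    """Parse one CSV line using the csv module's default (Excel) dialect."""
    return next(csv.reader(io.StringIO(line)))


def to_csv_line(fields):
    """Write one record, quoting only fields that contain special characters."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(fields)
    return buf.getvalue().rstrip("\n")


header_fields = parse_line(header)
unquoted_fields = parse_line(unquoted_row)  # the bare comma splits the name
quoted_fields = parse_line(quoted_row)      # the quoted comma stays in the field

# Re-serialise the correctly parsed record; the name field is quoted again.
rewritten = to_csv_line(quoted_fields)

print(len(header_fields), len(unquoted_fields), len(quoted_fields))
print(rewritten)
```

If the file is produced by a process you control, writing it with a CSV library like this (rather than joining values with commas by hand) is usually the simplest way to get the quoting right before the file reaches S3.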
