How do I preserve Chinese text format during CSV to Dataiku load?

Niladri Registered Posts: 10 ✭✭

I'm using Dataiku version 13.1. I have a text dataset with around 2400 rows. Most of it is in English, but around 100 rows contain Chinese characters. My data is in CSV format. I need to perform a GenAI task on my dataset and then write the result back to CSV.

The Chinese characters are being converted to English characters when the data is loaded from CSV into Dataiku.

Can you suggest steps to keep the data in its original format, as it appears in the CSV, throughout the whole process?

Best Answer

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,248 Neuron
    Answer ✓

    When you upload a CSV file (I am assuming that is what you are doing, since you don't say how you are loading this dataset), you can click on configure format and then on "Show advanced options", where you can specify the charset (aka encoding) of the file. You will need to check with the file producer to find out which character encoding they used to create the file.
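If the file producer cannot be reached, a rough way to narrow down the charset outside of Dataiku is to try decoding the raw bytes with a few likely encodings. The snippet below is only a minimal sketch using the Python standard library: the `guess_encoding` helper and the candidate list are hypothetical, not part of Dataiku, and a successful decode does not prove the encoding is the right one. It also shows how UTF-8 bytes read with the wrong charset turn into Latin-looking characters, which may be what is happening here.

```python
from typing import Optional, Sequence

# Hypothetical candidate list (an assumption): adjust it to the encodings
# the file producer is likely to have used.
CANDIDATES = ("utf-8", "gb18030", "big5")


def guess_encoding(raw: bytes, candidates: Sequence[str] = CANDIDATES) -> Optional[str]:
    """Return the first candidate that decodes `raw` without error, else None."""
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue
        return enc
    return None


# Made-up sample content standing in for the real CSV.
sample = "id,text\n1,中文\n"
utf8_bytes = sample.encode("utf-8")
gb_bytes = sample.encode("gb18030")

# Decoding UTF-8 bytes with the wrong charset does not fail;
# it silently produces unrelated Latin characters instead.
wrong = utf8_bytes.decode("latin-1")

print(guess_encoding(utf8_bytes))
print(guess_encoding(gb_bytes))
print("中文" in wrong)
```

For a real file, the bytes would come from opening it in binary mode (`open(path, "rb")`), reading the whole file rather than a slice so that no multi-byte character is cut in half. The encoding that decodes cleanly and shows readable Chinese text is the one to enter as the charset in the advanced format options.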
