Spanish accent in column names & records
Hello,
I have an input folder containing 29 CSV files. Some columns in those files have Spanish accents in their names, and some columns contain records with Spanish accents. When we read the CSV files, we get a record count mismatch. How can we read the records with Spanish accents correctly?
Operating system used: Windows
Answers
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,601 Neuron
I know that you can import non-ASCII characters in data records from CSV files.
However, I've never tried using characters like ñ and Ñ for column headers.
If it is not working, I'd put in a support ticket. The support team may know of some workarounds. This may also be considered a defect that could be fixed. You could also suggest this as a product idea: https://community.dataiku.com/t5/Product-Ideas/idb-p/Product_Ideas
If you have to work fast, I might try to import the data without headers and just drop the first row of each file. You will get default column names, but you may be able to get the data into DSS to start working on it.
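For anyone trying this workaround in a plain Python script rather than through the DSS import screen, here is a minimal sketch using only the standard library. The function name `read_without_header`, the `col_N` naming scheme, and the default `utf-8-sig` encoding are assumptions for illustration; if the files were saved on Windows in a legacy code page, `cp1252` may be the right value instead.

```python
import csv

def read_without_header(path, encoding="utf-8-sig"):
    """Read a CSV, drop its first (header) row, and return default column names plus data rows.

    Hypothetical helper: the encoding default is an assumption about how the files were saved.
    """
    # newline="" lets the csv module handle line endings inside quoted fields itself
    with open(path, newline="", encoding=encoding) as f:
        rows = list(csv.reader(f))
    data = rows[1:]  # skip the header row with the accented names
    width = max((len(r) for r in data), default=0)
    names = [f"col_{i}" for i in range(width)]  # placeholder column names
    return names, data
```

The accented values in the records are kept as they are; only the header row is discarded.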
-
LouisDHulst Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Neuron, Registered, Neuron 2023 Posts: 54 Neuron
Hi,
Could you show an example of the data loss? I was able to import a CSV file with both ñ and á in the column headers and in the records, using the visual interface and also a Python script reading from a folder, without any data loss.
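To produce such an example, one option is to count the records per file in plain Python and compare the numbers with what DSS reports. The sketch below is only an assumption about how one might check this: the function name `count_rows` and the candidate encodings (`utf-8-sig`, then `cp1252`) are illustrative, and files that decode under neither encoding are simply left out of the report.

```python
import csv
from pathlib import Path

def count_rows(folder, encodings=("utf-8-sig", "cp1252")):
    """Return {filename: (encoding_used, header, n_records)} for each CSV in folder.

    Hypothetical helper: tries each candidate encoding in order and keeps the
    first one that decodes the whole file without error.
    """
    report = {}
    for path in sorted(Path(folder).glob("*.csv")):
        for enc in encodings:
            try:
                with open(path, newline="", encoding=enc) as f:
                    rows = list(csv.reader(f))
            except UnicodeDecodeError:
                continue  # wrong guess, try the next candidate encoding
            header = rows[0] if rows else []
            report[path.name] = (enc, header, max(len(rows) - 1, 0))
            break
    return report
```

If the files in one folder come back with different encodings, that would be worth mentioning when posting the example, since a single encoding setting applied to all 29 files could then misread some of them.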
-
I have the same case: I'm reading a JSON file whose keys have accents.
I used the command below, but it didn't work.
locale.setlocale(locale.LC_TIME, 'pt_BR.utf8')
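A note on the command above: the `LC_TIME` locale category only affects date and time formatting, so it would not be expected to change how a file's bytes are decoded. Something to try instead is passing the encoding explicitly when opening the file. This is a minimal sketch that assumes the JSON file is saved as UTF-8; the function name `load_json_utf8` is hypothetical.

```python
import json

def load_json_utf8(path):
    """Load a JSON file, decoding it explicitly as UTF-8.

    Assumption: the file is UTF-8 encoded. If it was saved in another
    encoding (for example cp1252), pass that to open() instead.
    """
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

With the encoding stated explicitly, accented keys such as `descrição` should come back as ordinary Python strings, independent of the system locale.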