exporting dataset from Dataiku to Excel changes the cells format from Text to Number
When I export a dataset to Excel or CSV, for the cells which are numbers, in Excel file, the cell format is changed to "Number", even if in the dataset the value type was Text (string).
How can be stopped this change to happen when exporting data to Excel?
Operating system used: Windows
Operating system used: Windows
Answers
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron
Welcome to the Dataiku Community.
There are a number of different ways to save MS Excel files from Dataiku. Can you clarify which method you are using?
In the meantime. This thread might be of interest.
https://community.dataiku.com/t5/General-Discussion/Fill-Pattern/m-p/21112
Let us know how you are getting along.
--Tom
-
Well, first, I used a scenario which sends an email having attachment the dataset converted to Excel.
Secondly, I did the same, but converted to CSV. I got the same result.
Thirdly I download it manually from the flow as Excel then as csv.
Every time Dataiku change the Numbers stored as text to Number Format.
The reason to keep them as text is that the numbers have 0 in front of the number and it will dissapear, which I dont want to.
Many Thanks for your help
-
Hi,
This issue seems to be related to the way Excel imports files. I tested on my side on a column with ids and leading 0s, and 0s are preserved at export. However, excel auto-converts such columns to integers at import.
You can export the data as csv and use Excel's import wizard to prevent it from converting the column to number types automatically. You can have a look at Microsoft's documentation here: https://support.microsoft.com/en-us/office/text-import-wizard-c5b02af6-fda1-4440-899f-f78bafe41857#:~:text=Go%20to%20the%20Data%20tab,Import%20Wizard%20dialog%20will%20open.
Hope this will help fixing your issue.
Best,
-
Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭
Hi,
Like Alexander says, Excel behaves like this.
Within Excel you can prefix values with ' (for example '0787432432), to assume the value as string instead of number.
Within DSS, perhaps you can concatenate this character to the values, immediately before creating the export file.
I hope this helps.
Best regards
-
As I mentioned initially in the issue description, I already testes also to export as CSV and I got the same problem.
It is strange that for you this issue is not reproduced when export to excel.
As I see is the Dataiku is the one who change the format of data, not the Excel.
-
Yes, this solution with adding prefix values ' (for example '0787432432) which assume the value as string instead of number it is a possible solution which implies further receipe to be added.Thanks
-
I made a check using a wordpad to open the csv file received from Dataiku and the 0 was in front of the number.
So, you are right, Excel is changing the format of the cells.
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron
It seems like we are making some progress. Thanks for getting back with further details. Thanks to @Manuel
and @AlexandreL
for jumping in on this one.One additional point that may help with opening your CSV files in MS Excel. There are actually two ways to open a CSV file in MS Excel
1. By double clicking in windows file explorer, or the Macintosh finder. When opening in this way MS Excel does it’s best to interpret the .csv and we don’t want that to happen
2. By opening MS Excel first and going to File Open and selecting your CSV file you are put through a 4 or 5 step process to import your file. On the second or third page of the dialog box you are offered the opportunity to declare what kind of data you have in each column. “general” is the default. In this case that is not want for these columns with pretended 0s. We want to choose text.
-
Hi Tom
Thanks for the tips. I'll test them.
All the best to you
-
I agree that, this if you want to avoid losing leading "0" you need to apply Excel workaround as you mentioned.
Thank you for your help