exporting dataset from Dataiku to Excel changes the cells format from Text to Number

mirel29
mirel29 Registered Posts: 7 ✭✭✭

When I export a dataset to Excel or CSV, for the cells which are numbers, in Excel file, the cell format is changed to "Number", even if in the dataset the value type was Text (string).

How can be stopped this change to happen when exporting data to Excel?

New Bitmap Image.jpg


Operating system used: Windows


Operating system used: Windows

Answers

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron

    @mirel29

    Welcome to the Dataiku Community.

    There are a number of different ways to save MS Excel files from Dataiku. Can you clarify which method you are using?

    In the meantime. This thread might be of interest.

    https://community.dataiku.com/t5/Using-Dataiku/Schema-changes-from-double-to-integer-all-the-time/m-p/14932/highlight/true

    https://community.dataiku.com/t5/General-Discussion/Fill-Pattern/m-p/21112

    Let us know how you are getting along.

    --Tom

  • mirel29
    mirel29 Registered Posts: 7 ✭✭✭

    Well, first, I used a scenario which sends an email having attachment the dataset converted to Excel.

    Secondly, I did the same, but converted to CSV. I got the same result.

    Thirdly I download it manually from the flow as Excel then as csv.

    Every time Dataiku change the Numbers stored as text to Number Format.

    The reason to keep them as text is that the numbers have 0 in front of the number and it will dissapear, which I dont want to.

    Many Thanks for your help

  • AlexandreL
    AlexandreL Dataiker, Registered Posts: 36 Dataiker

    Hi,

    This issue seems to be related to the way Excel imports files. I tested on my side on a column with ids and leading 0s, and 0s are preserved at export. However, excel auto-converts such columns to integers at import.

    You can export the data as csv and use Excel's import wizard to prevent it from converting the column to number types automatically. You can have a look at Microsoft's documentation here: https://support.microsoft.com/en-us/office/text-import-wizard-c5b02af6-fda1-4440-899f-f78bafe41857#:~:text=Go%20to%20the%20Data%20tab,Import%20Wizard%20dialog%20will%20open.

    Hope this will help fixing your issue.

    Best,

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    Hi,

    Like Alexander says, Excel behaves like this.

    Within Excel you can prefix values with ' (for example '0787432432), to assume the value as string instead of number.

    Within DSS, perhaps you can concatenate this character to the values, immediately before creating the export file.

    I hope this helps.

    Best regards

  • mirel29
    mirel29 Registered Posts: 7 ✭✭✭

    As I mentioned initially in the issue description, I already testes also to export as CSV and I got the same problem.

    It is strange that for you this issue is not reproduced when export to excel.

    As I see is the Dataiku is the one who change the format of data, not the Excel.

  • mirel29
    mirel29 Registered Posts: 7 ✭✭✭

    Yes, this solution with adding prefix values ' (for example '0787432432) which assume the value as string instead of number it is a possible solution which implies further receipe to be added.Thanks

  • mirel29
    mirel29 Registered Posts: 7 ✭✭✭

    I made a check using a wordpad to open the csv file received from Dataiku and the 0 was in front of the number.

    So, you are right, Excel is changing the format of the cells.

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron

    @mirel29

    It seems like we are making some progress. Thanks for getting back with further details. Thanks to @Manuel
    and @AlexandreL
    for jumping in on this one.

    One additional point that may help with opening your CSV files in MS Excel. There are actually two ways to open a CSV file in MS Excel

    1. By double clicking in windows file explorer, or the Macintosh finder. When opening in this way MS Excel does it’s best to interpret the .csv and we don’t want that to happen

    2. By opening MS Excel first and going to File Open and selecting your CSV file you are put through a 4 or 5 step process to import your file. On the second or third page of the dialog box you are offered the opportunity to declare what kind of data you have in each column. “general” is the default. In this case that is not want for these columns with pretended 0s. We want to choose text.

  • mirel29
    mirel29 Registered Posts: 7 ✭✭✭

    Hi Tom

    Thanks for the tips. I'll test them.

    All the best to you

  • mirel29
    mirel29 Registered Posts: 7 ✭✭✭

    I agree that, this if you want to avoid losing leading "0" you need to apply Excel workaround as you mentioned.

    Thank you for your help

Setup Info
    Tags
      Help me…