Dates imported from Excel as days since 01 JAN 1900
Hi, I am a new user, and I am doing a proof-of-concept trial with Dataiku 7.0.0. I am importing data containing dates from Excel. The dates in the Excel cells are formatted to appear like this:
2019-01-01 00:37:55 UTC
The text that is entered to be displayed in the cell looks like this:
1/1/2019 12:37:55 AM
When I try to import the data into Dataiku, it interprets that date column as text and returns 43466.02633101852. I understand that is Excel's way of storing the date as the number of days since January 1, 1900. I can go into Excel, change the format of the date column to the custom format of mm/dd/yyyy hh:mm, save a new Excel file, and import that into Dataiku, and then Dataiku recognizes the date.
Questions:
1. Is there a way to coax Dataiku to read the date-time value as a date that can then be parsed, starting from the original Excel file?
2. If not, is there a function to create a date value from the "Date" column that Dataiku interpreted as the value 43466.02633101852?
Thank you for your help.
Best Answers
-
Hi,
Thanks for providing the sample file.
Indeed DSS won't interpret the custom format "2019-01-01 00:37:55 UTC" as a valid date format (we are relying on the Apache POI librairies for that). So the available options I would suggest are the following:
- either change the custom format in excel to be the default one for date-time as you already tried: i.e. "1/1/2019 0:37:55"
- or if your excel document is only made of a single sheet, export the corresponding sheet in CSV instead and after import in DSS parse the date string thanks to "Parse date" step in a "Prepare recipe" where you'll be able to define the custom format
- or convert, thanks to a "Formula" step in a Prepare recipe, the number to the corresponding date. Based on Dates And Times In Excel, the formula, in case the corresponding column is entitled "Date", would be:
inc(inc(asDate("1900-01-01", "yyyy-MM-dd"), floor(Date)-2, "days"),floor(60*60*24*(Date-floor(Date))), "seconds")
I hope this helps.
Regards,
-
Thanks MIckael! For my application, it would be easy to change the format in Excel before trying to import the data. However, I can imagine a situation in which I would not be able to change an Excel date format because, for example, that would break the code of another user who was using the same Excel file for a different purpose. However, the INC function solved that problem. Thank you!
Answers
-
Hi,
Would that be possible for you to upload a sample of the Excel file that you want to load into DSS ?
Thanks in advance.
Regards,
-
Hi Mickael. I tried to upload an Excel file to use as an example, but this board does not allow .xlsx files to be uploaded. A .txt or .csv file would not preserve the type of formatting that originally prompted the question.
-
Hi @JMH829
- just added the capability to upload .xlsx files if you can try again. Let us know if you encounter any other issues - thanks for posting on Community! -
Thanks, Lisa! An example Excel file is now attached to this message. Column A has the dates in UTC format as described previously.
-
See my response to Lisa. I uploaded an Excel file with UTC dates as an example.