Dataiku doesn't recognize Excel General format correctly
Hi,
I'm uploading Excel sheets to Dataiku and all my fields are in the General format. The dataset is formatted as a table.
When I upload it to Dataiku, all the data is set as string; however, there are suggestions of what the correct datatype should be (decimal, integer, date, etc.).
I tried:
- changing the data format in Excel --> Dataiku still doesn't recognize them as the correct datatype
- manually setting the datatype in Dataiku (this works there, but when I export to Redshift and try to use the dataset in Tableau, it still doesn't recognize the date as a date; it is set as string)
Is there any way to make Dataiku recognize data properly?
Operating system used: Windows 10
Answers
-
CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭
Hi @annabekesi
Welcome to the Dataiku Community and thank you for your question. While you wait for a more detailed response, I wanted to provide you with some resources.
- Excel to Dataiku DSS Quick Start (Dataiku Academy)
- The Excel to Dataiku Playbook (ebook)
- Moving From Spreadsheets to Dataiku for Financial Modeling (Dataiku Blog)
- The Challenges and (Awesome) Benefits of Switching From Spreadsheets to Dataiku (Dataiku Blog)
You may have already found these, but if not, I hope this helps!
-
Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭
Hi,
From your description it is unclear what exact problem you are facing. Perhaps you can add an example.
Your text seems to imply you know this already, but just to be certain: when you are uploading the file, you can navigate to the schema tab and manually set the column types.
On dates, if the source column uses a format other than ISO 8601 or RFC 822, it needs to be converted to a standard, unambiguous format after you upload. For this, you use the Parse Date processor in a Prepare recipe. This should fix any problems handling the date downstream. Check this page for more information.
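To illustrate what "converting to a standard non-ambiguous format" means, here is a minimal sketch in plain Python. It is not Dataiku's API or the Parse Date processor itself. It only shows the same idea: you state the source format explicitly and get an ISO 8601 date back. The sample values, the `to_iso_date` helper and the day/month/year source format are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical sample values, as they might look when a date column is read as text.
raw_dates = ["03/04/2023", "15/11/2022", "not a date"]


def to_iso_date(value, source_format="%d/%m/%Y"):
    """Parse a text date with an explicit source format.

    Returns an ISO 8601 date string (YYYY-MM-DD), or None when the
    value does not match the given format.
    """
    try:
        return datetime.strptime(value, source_format).date().isoformat()
    except ValueError:
        return None


# Assumes the source uses day/month/year; "03/04/2023" is read as 3 April 2023.
iso_dates = [to_iso_date(value) for value in raw_dates]
print(iso_dates)  # ['2023-04-03', '2022-11-15', None]
```

Note that "03/04/2023" would become 2023-03-04 if the source format were month/day/year instead, which is why the format has to be stated explicitly rather than guessed.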
I hope this helps.