Parse Date

one column has the date format like "2022-11-04 07:44:32.634000+00:00". select "Parse date…" in prepare recipe and show "Aww ...Smart Date could not find any format matching your data. Please try a custom format". Open Formula editor panel, input "concat(datePart(START_DATE,"year"),"_",datePart(START_DATE,"month")), and preview show "
| ||||
how can I parse the date correctly? |
Best Answer
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,531 Neuron
That can be done too. If the string has a dot you could use the substring() function to get the part up to the dot and concat it to the part after the plus.
Answers
-
Hi,
I hope that you are doing well.
So to clarify, this error occurs when attempting to to use the date parsing processor on : 2022-11-04 07:44:32.634000+00:00 ?
?
Was this error showing on the original dataset or occurred later on after data manipulation?
Would you mind providing a screenshot of the prepare recipe steps as well as a screen shot of the behavior at run?
You mentioned both parse date and formula steps, during which step the error is occurring?
Best,
Yasmine
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,531 Neuron
The format that you want to use in the Parse Date processor is: yyyy-MM-dd HH:mm:ss.SSSSSSZ
-
thanks for your date format. but some date could not find the correct format as above. my purpose is to parse the date as yyyy-mm-dd, and remove HH:mm:ss.SSSSSSZ to save the size, the other purpose is to add one column to show the date as yyyy_Q1(Q2~4). how can I do it?
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,531 Neuron
You have date/times which are in different formats. Some have the microseconds (the .999999 part) and some don't. You should really ask the data provider to give you the data in a consistent date/time format. If you can't do that you could try to inject .000000 in the rows that don't have the microseconds. For instance in your screen shot sample you should convert 2023-10-17 09:33:50+00:00 to 2023-10-17 09:33:50.000000+00:00 so that it can be parsed with the same date/time format. You can use the contains() formula combined with the if() formula to check if the value has a dot and if doesn't use the replace() formula to replace "+" by ".000000+" and that should do the trick.
-
thanks for your inputs. it's a good idea to inject the missing part in the date without the microseconds. if I like to remove the part of time and microsecond because I don't need this part of data, how can do it?