Survey banner
Switching to Dataiku - a new area to help users who are transitioning from other tools and diving into Dataiku! CHECK IT OUT

Convert time stored as six digit number to time stamp and find difference in hrs of two such time

Solved!
RAKS
Level 1
Convert time stored as six digit number to time stamp and find difference in hrs of two such time

I have a dataset where date stored in format yyyyMMdd and time stores as hhmmss (see image)

 i need to calculate the difference in hrs between Datetime combined of both type of columns , can someone help, already tried through various methods regex, python code no success.


Operating system used: windows 11

0 Kudos
1 Solution
RoyE
Dataiker

Hi RAKS,

 

Assuming that there are no issues with the rows in your dataset, this can be accomplished through a prepare recipe. Please see below for more details.

Step 1: Concatenate columns

Screen Shot 2023-06-11 at 13.30.51.png

This will output a string with the format of yyyyMMddHHmmss. You will use this in the next step.

Step 2: Format the string into a proper timestamp

Screen Shot 2023-06-11 at 13.32.03.png

This will output a new column, timestamp_start_formatted in a proper date format. 

Do the above for your start time and stop time before continuing to calculate the time between columns.

Step 3: Calculate the time between two columns.

Screen Shot 2023-06-11 at 13.33.15.png

As you can see, this step will calculate the time in minutes from column timestamp_start_formatted until column timestamp_stop_formatted. 

You can change the column names to whatever your column's names are/want to be.

A sample output from the dataset that you provided:

Screen Shot 2023-06-11 at 13.34.57.png

Hope this helps!

View solution in original post

2 Replies
RoyE
Dataiker

Hi RAKS,

 

Assuming that there are no issues with the rows in your dataset, this can be accomplished through a prepare recipe. Please see below for more details.

Step 1: Concatenate columns

Screen Shot 2023-06-11 at 13.30.51.png

This will output a string with the format of yyyyMMddHHmmss. You will use this in the next step.

Step 2: Format the string into a proper timestamp

Screen Shot 2023-06-11 at 13.32.03.png

This will output a new column, timestamp_start_formatted in a proper date format. 

Do the above for your start time and stop time before continuing to calculate the time between columns.

Step 3: Calculate the time between two columns.

Screen Shot 2023-06-11 at 13.33.15.png

As you can see, this step will calculate the time in minutes from column timestamp_start_formatted until column timestamp_stop_formatted. 

You can change the column names to whatever your column's names are/want to be.

A sample output from the dataset that you provided:

Screen Shot 2023-06-11 at 13.34.57.png

Hope this helps!

RAKS
Level 1
Author

Thanks RoyE, it works 

0 Kudos