Survey banner
The Dataiku Community is moving to a new home! We are temporary in read only mode: LEARN MORE

Convert time stored as six digit number to time stamp and find difference in hrs of two such time

Solved!
RAKS
Level 1
Convert time stored as six digit number to time stamp and find difference in hrs of two such time

I have a dataset where date stored in format yyyyMMdd and time stores as hhmmss (see image)

 i need to calculate the difference in hrs between Datetime combined of both type of columns , can someone help, already tried through various methods regex, python code no success.


Operating system used: windows 11

0 Kudos
1 Solution
RoyE
Dataiker

Hi RAKS,

 

Assuming that there are no issues with the rows in your dataset, this can be accomplished through a prepare recipe. Please see below for more details.

Step 1: Concatenate columns

Screen Shot 2023-06-11 at 13.30.51.png

This will output a string with the format of yyyyMMddHHmmss. You will use this in the next step.

Step 2: Format the string into a proper timestamp

Screen Shot 2023-06-11 at 13.32.03.png

This will output a new column, timestamp_start_formatted in a proper date format. 

Do the above for your start time and stop time before continuing to calculate the time between columns.

Step 3: Calculate the time between two columns.

Screen Shot 2023-06-11 at 13.33.15.png

As you can see, this step will calculate the time in minutes from column timestamp_start_formatted until column timestamp_stop_formatted. 

You can change the column names to whatever your column's names are/want to be.

A sample output from the dataset that you provided:

Screen Shot 2023-06-11 at 13.34.57.png

Hope this helps!

View solution in original post

2 Replies
RoyE
Dataiker

Hi RAKS,

 

Assuming that there are no issues with the rows in your dataset, this can be accomplished through a prepare recipe. Please see below for more details.

Step 1: Concatenate columns

Screen Shot 2023-06-11 at 13.30.51.png

This will output a string with the format of yyyyMMddHHmmss. You will use this in the next step.

Step 2: Format the string into a proper timestamp

Screen Shot 2023-06-11 at 13.32.03.png

This will output a new column, timestamp_start_formatted in a proper date format. 

Do the above for your start time and stop time before continuing to calculate the time between columns.

Step 3: Calculate the time between two columns.

Screen Shot 2023-06-11 at 13.33.15.png

As you can see, this step will calculate the time in minutes from column timestamp_start_formatted until column timestamp_stop_formatted. 

You can change the column names to whatever your column's names are/want to be.

A sample output from the dataset that you provided:

Screen Shot 2023-06-11 at 13.34.57.png

Hope this helps!

RAKS
Level 1
Author

Thanks RoyE, it works 

0 Kudos