min between 2 dates

Solved!
EdBerth
Level 2
min between 2 dates

Hello,


I'm trying to find the 1st date between 2 date fields.

I was thinking of using a min formula but there may be missing values โ€‹โ€‹in these fields and the formula doesn't seem to work in this case.

Is there another solution other than an "if then" formula ?

0 Kudos
1 Solution
konathan
Level 3

Hi @EdBerth,

 

If I understood correctly, based on your latest comment, you need an automated way (minimize user input in case new date columns are added in the dataset) to find the minimum date among multiple date values (not just between two). One way to achieve this is through the Python function step in a Prepare recipe which applies a Python function to each row in the dataset. What the function in the attached screenshot does, is that, per row, it takes the values of all the columns, sorts them, and finally, returns the second element of this list.

 

Note: The first element of this list will be None because this is the value that the outcome column originally has, since it is a placeholder column and it is empty at the beginning.

 

I hope this helps!

 

-Konstantina

View solution in original post

0 Kudos
3 Replies
Turribeach

Is there another solution other than an "if then" formula ? => Not sure why you think the if approach is not suitable?

min(if(isNull(val("date_col1")), val("date_col2"), val("date_col1")), if(isNull(val("date_col2")), val("date_col1"), val("date_col2")))

 

0 Kudos
EdBerth
Level 2
Author
I am not against using the if approach but it can quickly become complex if there are more than 2 fields in date format.
That's why I was looking for another approach.

0 Kudos
konathan
Level 3

Hi @EdBerth,

 

If I understood correctly, based on your latest comment, you need an automated way (minimize user input in case new date columns are added in the dataset) to find the minimum date among multiple date values (not just between two). One way to achieve this is through the Python function step in a Prepare recipe which applies a Python function to each row in the dataset. What the function in the attached screenshot does, is that, per row, it takes the values of all the columns, sorts them, and finally, returns the second element of this list.

 

Note: The first element of this list will be None because this is the value that the outcome column originally has, since it is a placeholder column and it is empty at the beginning.

 

I hope this helps!

 

-Konstantina

0 Kudos

Labels

?
Labels (1)
A banner prompting to get Dataiku