min between 2 dates

EdBerth
EdBerth Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 15 ✭✭

Hello,


I'm trying to find the 1st date between 2 date fields.

I was thinking of using a min formula but there may be missing values ​​in these fields and the formula doesn't seem to work in this case.

Is there another solution other than an "if then" formula ?

Tagged:

Best Answer

  • Konstantina
    Konstantina Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 25 ✭✭✭✭✭
    Answer ✓

    Hi @EdBerth
    ,

    If I understood correctly, based on your latest comment, you need an automated way (minimize user input in case new date columns are added in the dataset) to find the minimum date among multiple date values (not just between two). One way to achieve this is through the Python function step in a Prepare recipe which applies a Python function to each row in the dataset. What the function in the attached screenshot does, is that, per row, it takes the values of all the columns, sorts them, and finally, returns the second element of this list.

    Note: The first element of this list will be None because this is the value that the outcome column originally has, since it is a placeholder column and it is empty at the beginning.

    I hope this helps!

    -Konstantina

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,002 Neuron
    edited July 17

    Is there another solution other than an "if then" formula ? => Not sure why you think the if approach is not suitable?

    min(if(isNull(val("date_col1")), val("date_col2"), val("date_col1")), if(isNull(val("date_col2")), val("date_col1"), val("date_col2")))

  • EdBerth
    EdBerth Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 15 ✭✭
    edited July 17
    I am not against using the if approach but it can quickly become complex if there are more than 2 fields in date format.
    That's why I was looking for another approach.

  • JBaesen
    JBaesen Registered Posts: 3 ✭✭✭

    Hello !

    I had the same issue with 5 columns containing dates (the 5 columns might have blanks but there was always at least two filled with dates).

    It's not very pretty but here's how I managed it :

    min(
    if(isBlank(DATE1),asDate("2099-01-01", "yyyy-MM-dd"),asDate(DATE1)),
    if(isBlank(DATE2),asDate("2099-01-01", "yyyy-MM-dd"),asDate(DATE2)),
    if(isBlank(DATE3),asDate("2099-01-01", "yyyy-MM-dd"),asDate(DATE3)),
    if(isBlank(DATE4),asDate("2099-01-01", "yyyy-MM-dd"),asDate(DATE4)),
    if(isBlank(DATE5),asDate("2099-01-01", "yyyy-MM-dd"),asDate(DATE5)))

    I also need the max so I guess I'll use the same trick but with the date "1900-01-01" as default.

    Hope this helps !

Setup Info
    Tags
      Help me…