How to merge 2 tables, by dates comparison et precedent constraints

philibert
philibert Registered Posts: 2

hello.

I have 2 tables, the first one shows vehicules consumption records

|ID_VHL| date | Type_Of_Record| Value_Of_Record|
| A | 05/01/2024 | hour | 10 |
| A | 15/01/2024 | km | 2 |
| A | 05/08/2024 | hour | 20 |
| A | 15/09/2024 | km | 28 |

and the second one represents the distribution of these vehicles, variable over time, with the start of the allocation period

|ID_VHL| date | Owner|
| A | 01/01/2024 | x |
| A | 15/07/2024 | y |

I want to add for each record in the first table the owner, i.e, the owner with the most recent date in the second table before the date record , regarding this vehicle.

|ID_VHL| date | Type_Of_Record| Value_Of_Record|Owner
| A | 05/01/2024 | hour | 10 | x
| A | 15/01/2024 | km | 2 |x
| A | 05/08/2024 | hour | 20 |y
| A | 15/09/2024 | km | 28 |y

I suppose I must use window recipe, but ….

Thanks a lot

Operating system used: windows

Answers

  • philibert
    philibert Registered Posts: 2

    Of course, with a python recipe, would be easy but maybe there is a smarter solution

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,161 Neuron
    edited November 15

    A join Recipe and a Window recipe will do it. First join the two datasets as follows:

    You will get some duplicates since there could be many records on the allocation table with the same consumption date or before it:

    Now create a window recipe to get rid of the duplicates:

    In aggregations select row number and in Post-filter filter for rownumber == 1:

    Results:

Setup Info
    Tags
      Help me…