Join 2 datasets matching a date into an interval

Dataiker, Alpha Tester Posts: 535 Dataiker
Hello

I have to datasets that I want to join. The first dataset as a DATE column, but the second dataset as a BEGIN_DATE and END_DATE column.

I want to join rows from dataset2 into dataset1 analysis.

Is there a way to join the 2 datasets directly or a way to enrich dataset2 and add rows matching BEGIN_DATE en END_DATE?

Best regards

Geoff.

Answers

  • Dataiker, Alpha Tester Posts: 535 Dataiker
    edited July 2024

    I succeeded in doing the "join" (actually it is not, but for my data it is). The goal is to tag each record from DATASET1 for being/not being in a specific time period. Time periods are defined in DATASET2.

    I did it using Python.

    For those interested, here is the script (As i'm not a Python expert, more elegant code can be proposed):


    # -*- coding: utf-8 -*-
    import dataiku
    import pandas as pd, numpy as np
    from dataiku import pandasutils as pdu

    # Input datasets
    DATASET1 = dataiku.Dataset("DATASET1")
    DATASET1_df = histo_MAG_JF_ZV.get_dataframe()
    DATASET2 = dataiku.Dataset("DATASET2")
    DATASET2_df = soldes_prepared.get_dataframe()

    # Output datasets
    DATASET1_outuput = dataiku.Dataset("DATASET1_outuput")

    # Pour chaque ligne du tableau de Soldes
    for d in DATASET2.iter_rows():
    # FOR DEBUG
    #print d

    begin_date = d['begin']
    end_date = d['end']

    cond1 = (DATASET1_df['Date'] >= begin_date)
    cond2 = (DATASET1_df['Date'] <= end_date)

    matching_records = DATASET1_df[cond1][cond2]

    # Add begin/end date to output dataset
    matching_records['begin_date'] = begin_date
    matching_records['end_date'] = end_date

    DATASET1_outuput.write_with_schema(matching_records)

    Next I left join DATASET1 with DATASET1_output to recover non mathing records, as the following analysis needs both matching and non matching records.

    Geoff

Welcome!

It looks like you're new here. Sign in or register to get started.

Welcome!

It looks like you're new here. Sign in or register to get started.