How to get max date from the date column when define variable in scenario

Witw
Witw Registered Posts: 2

Hi,

I try to automate the flow to pull the data from the max date I have in the databased and back Xday.

not sure how to use MAX with 'my date column' in the defining variable in the scenario.

I tried

max('my date column')

doesn't seem to be correct

Tagged:

Answers

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker
    edited July 17

    Hi @Witw
    ,

    Indeed formulas are row-wise transformations, and max within a formula will simply return the max of 2 passed values, i.e.:

    <span class="pre">max(-1,</span><span> </span><span class="pre">3)</span> returns <span class="pre">3</span>

    What I think you want instead is to store the max value of a column in a scenario variable. You can use dataset metrics for this, so that the maximum value for your column will be computed every time the dataset gets built:

    Screenshot 2023-07-13 at 4.51.57 PM.png

    Then, in your scenario you would use a Python step to read in the metric and store it into a scenario variable:

    import dataiku 
    from dataiku.scenario import Scenario
    
    client = dataiku.api_client()
    
    # object for this scenario
    scenario = Scenario()
    
    project = client.get_default_project()
    dataset = project.get_dataset('<YOUR_DATASET_ID>')
    
    # use dataset.get_last_metric_values().get_all_ids() in a notebook to see the different metric names to use in place of 
    # "col_stats:MAX:order_date"
    max_value_date = dataset.get_metric_history('col_stats:MAX:order_date')['lastValue']['value']
    
    scenario.set_scenario_variables(max_value = max_value_date)
    


    You can run the following in a notebook to determine the value for your metric that you'll fill into get_metric_history:

    Screenshot 2023-07-13 at 5.08.22 PM.png
    Thanks,
    Sarina

Setup Info
    Tags
      Help me…