How to get max date from the date column when define variable in scenario

Witw
Level 1
How to get max date from the date column when define variable in scenario

Hi,

I try to automate the flow to pull the data from the max date I have in the databased and back Xday. 

not sure how to use MAX with 'my date column' in the defining variable in the scenario.

I tried

max('my date column')

doesn't seem to be correct

0 Kudos
1 Reply
SarinaS
Dataiker

Hi @Witw,

Indeed formulas are row-wise transformations, and max within a formula will simply return the max of 2 passed values, i.e.:

max(-1, 3) returns 3

What I think you want instead is to store the max value of a column in a scenario variable. You can use dataset metrics for this, so that the maximum value for your column will be computed every time the dataset gets built:

Screenshot 2023-07-13 at 4.51.57 PM.png

Then, in your scenario you would use a Python step to read in the metric and store it into a scenario variable:

import dataiku 
from dataiku.scenario import Scenario

client = dataiku.api_client()

# object for this scenario
scenario = Scenario()

project = client.get_default_project()
dataset = project.get_dataset('<YOUR_DATASET_ID>')

# use dataset.get_last_metric_values().get_all_ids() in a notebook to see the different metric names to use in place of 
# "col_stats:MAX:order_date"
max_value_date = dataset.get_metric_history('col_stats:MAX:order_date')['lastValue']['value']

scenario.set_scenario_variables(max_value = max_value_date)


You can run the following in a notebook to determine the value for your metric that you'll fill into get_metric_history:

Screenshot 2023-07-13 at 5.08.22 PM.png
Thanks,
Sarina 

Labels

?
Labels (1)

Setup info

?
Tags (2)
A banner prompting to get Dataiku