Survey banner
Switching to Dataiku - a new area to help users who are transitioning from other tools and diving into Dataiku! CHECK IT OUT

How to get max date from the date column when define variable in scenario

Level 1
How to get max date from the date column when define variable in scenario


I try to automate the flow to pull the data from the max date I have in the databased and back Xday. 

not sure how to use MAX with 'my date column' in the defining variable in the scenario.

I tried

max('my date column')

doesn't seem to be correct

0 Kudos
1 Reply

Hi @Witw,

Indeed formulas are row-wise transformations, and max within a formula will simply return the max of 2 passed values, i.e.:

max(-1, 3) returns 3

What I think you want instead is to store the max value of a column in a scenario variable. You can use dataset metrics for this, so that the maximum value for your column will be computed every time the dataset gets built:

Screenshot 2023-07-13 at 4.51.57 PM.png

Then, in your scenario you would use a Python step to read in the metric and store it into a scenario variable:

import dataiku 
from dataiku.scenario import Scenario

client = dataiku.api_client()

# object for this scenario
scenario = Scenario()

project = client.get_default_project()
dataset = project.get_dataset('<YOUR_DATASET_ID>')

# use dataset.get_last_metric_values().get_all_ids() in a notebook to see the different metric names to use in place of 
# "col_stats:MAX:order_date"
max_value_date = dataset.get_metric_history('col_stats:MAX:order_date')['lastValue']['value']

scenario.set_scenario_variables(max_value = max_value_date)

You can run the following in a notebook to determine the value for your metric that you'll fill into get_metric_history:

Screenshot 2023-07-13 at 5.08.22 PM.png



Setup info

A banner prompting to get Dataiku