Access of the partition value in visual recipes

tomas Registered, Neuron 2022 Posts: 120 ✭✭✭✭✭

Hi,

I would like to use the partition value (the one the recipe is run for) in the recipe's filter.

[Screenshot: image.png]

But I get the following error:

An invalid argument has been encountered : Unknown DSS variable: DKU_DST_DATE

Answers

  • Ignacio_Toledo
    Ignacio_Toledo Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 411 Neuron

    Hi @tomas. What exactly is your use case?

    If you are using partitions, when running a recipe over a partitioned dataset, the data included in the recipe (to get distinct rows in your case) will have already been 'filtered' to only keep the date of the partition that is being run.

    Or in other words, when you run a recipe over a partitioned dataset, the process will always include a filter 'date_partition = ${DKU_DST_DATE}' for all the dates included in your partition selection.
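To make the `${DKU_DST_DATE}` placeholder mentioned above concrete, here is a minimal sketch of placeholder expansion using Python's standard `string.Template`. It is only an illustration, not how DSS implements substitution: the `expand_filter` helper and the `run_variables` dictionary are hypothetical. It shows that the expression can only be expanded when the variable is actually defined for the run, which is the situation behind an "Unknown DSS variable" style error.

```python
from string import Template


def expand_filter(expression: str, variables: dict) -> str:
    """Replace ${NAME} placeholders; fail loudly if a name is not defined."""
    try:
        return Template(expression).substitute(variables)
    except KeyError as missing:
        # Hypothetical error message, modelled loosely on the one in the question.
        raise ValueError(f"Unknown variable: {missing.args[0]}") from None


# Hypothetical variables for a run that builds the 2022-10-06 partition.
run_variables = {"DKU_DST_DATE": "2022-10-06"}

expanded = expand_filter("date_partition = '${DKU_DST_DATE}'", run_variables)
print(expanded)
```

Calling `expand_filter` with an empty dictionary raises `ValueError`, because nothing defines `DKU_DST_DATE` in that context.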

    So I wonder what exactly the use case is that you want to solve.

  • tomas
    tomas Registered, Neuron 2022 Posts: 120 ✭✭✭✭✭

    Yes, I know. But imagine a dataset where a string column (YYYY-MM-DD format) containing date values is part of the data. The dataset is NOT partitioned: the data is in one or more Parquet files, with no partition structure in folders. The visual recipe does a group-by and aggregation into a partitioned table. So you want to process every single partition value (DAY) in such a way that Dataiku takes only the given day, aggregates it, and stores the result in the corresponding partition.
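The use case described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the visual recipe itself: the column names `event_date`, `customer` and `amount` and the `aggregate_day` helper are hypothetical. The target day is passed in as a parameter so the sketch runs outside DSS; inside a DSS Python recipe, the target partition's date should be available as `dataiku.dku_flow_variables["DKU_DST_DATE"]`.

```python
from collections import defaultdict


def aggregate_day(rows, partition_date,
                  date_col="event_date", key_col="customer", value_col="amount"):
    """Keep only rows whose date string equals partition_date, then sum per key."""
    totals = defaultdict(float)
    for row in rows:
        # The input is not partitioned, so the day is selected by comparing
        # the YYYY-MM-DD string column with the target partition value.
        if row[date_col] == partition_date:
            totals[row[key_col]] += row[value_col]
    return dict(totals)


# Hypothetical sample of the non-partitioned input.
rows = [
    {"event_date": "2022-10-05", "customer": "a", "amount": 10.0},
    {"event_date": "2022-10-06", "customer": "a", "amount": 2.5},
    {"event_date": "2022-10-06", "customer": "b", "amount": 4.0},
    {"event_date": "2022-10-06", "customer": "a", "amount": 1.5},
]

# Result that would be written to the 2022-10-06 output partition.
result = aggregate_day(rows, "2022-10-06")
print(result)
```

Each run handles exactly one day, so building several output partitions means calling the same logic once per partition value.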

  • Ignacio_Toledo
    Ignacio_Toledo Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 411 Neuron

    Then I was in fact making a wrong assumption, as I thought the input dataset was also partitioned.

    However, even if the source data is not partitioned into folders, you can create the dataset and manually set the partitioning in Dataiku. Maybe that could help?

    Cheers!

  • Paw
    Paw Registered Posts: 2 ✭✭✭

    Hey,

    Same problem as Tomas: my input is a Hive database that is not partitioned, and I would like to use the partition value (used in the recipe) in the filter.
    Is there any update on this topic?

    PS: My input table is large and I cannot partition it.
    PS2: In an Impala recipe, it works.

  • Paw
    Paw Registered Posts: 2 ✭✭✭

    Update: it's working.

    [Screenshot: 2022-10-06 10_46_16-Window.jpg]

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭

    Thank you for sharing this update with the rest of the Community, @Paw!
