Accessing the partition value in visual recipes
Hi,
I would like to use the partition value (the one used in the recipe) in the recipe's filter.
But I get this error:
An invalid argument has been encountered : Unknown DSS variable: DKU_DST_DATE
Answers
-
Ignacio_Toledo
Hi @tomas. What exactly is your use case? If you are using partitions, then when a recipe runs over a partitioned dataset, the data read by the recipe (to get distinct rows, in your case) has already been 'filtered' to keep only the date of the partition being run.
In other words, when you run a recipe over a partitioned dataset, the process always includes a filter 'date_partition = ${DKU_DST_DATE}' for every date included in your partition selection.
So I wonder what exactly the use case is that you want to solve.
-
Yes, I know, but imagine a dataset where a string column (YYYY-MM-DD format) containing date values is part of the data. The dataset is NOT partitioned: the data is in one or more Parquet files, with no partition structure in folders. The visual recipe does a group by and aggregation into a partitioned table. So you want to process every single partition value (DAY) in such a way that Dataiku takes only the given day, aggregates it, and stores the result in that particular partition.
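To make the intended behaviour concrete, here is a minimal plain-Python sketch of what a single per-partition run should do: keep only the rows for the requested day, then aggregate them. It does not use any Dataiku API. The function name `aggregate_day`, the column names (`event_date`, `customer`, `amount`), and the sample rows are all hypothetical, and the `partition_date` string simply stands in for the value that DKU_DST_DATE would carry for the target partition.

```python
from collections import defaultdict


def aggregate_day(rows, partition_date,
                  date_col="event_date", key_col="customer", value_col="amount"):
    """Keep only the rows of one day, then sum value_col per key_col.

    All column names are hypothetical placeholders for illustration.
    """
    totals = defaultdict(float)
    for row in rows:
        # The date is a plain string column (YYYY-MM-DD), so filtering
        # is a simple string comparison against the partition value.
        if row[date_col] == partition_date:
            totals[row[key_col]] += row[value_col]
    return dict(totals)


# Hypothetical, non-partitioned input: several days mixed in one dataset.
rows = [
    {"event_date": "2021-03-01", "customer": "a", "amount": 10.0},
    {"event_date": "2021-03-01", "customer": "a", "amount": 5.0},
    {"event_date": "2021-03-02", "customer": "a", "amount": 7.0},
    {"event_date": "2021-03-01", "customer": "b", "amount": 1.0},
]

# Stand-in for the target partition value (what DKU_DST_DATE would hold).
partition_date = "2021-03-01"
result = aggregate_day(rows, partition_date)
print(result)
```

Each output partition would be produced by one such call with a different day, which is why the filter needs access to the partition value.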
-
Ignacio_Toledo
Then I was in fact making a wrong assumption, as I thought the input dataset was also partitioned.
However, even if the source data is not partitioned into folders, you can create the dataset and manually set the partitioning in Dataiku. Maybe that could help?
Cheers!
-
Hey,
Same problem as Tomas: my input is a Hive database that is not partitioned, and I would like to use the partition value (used in the recipe) in the filter.
Is there any update on this topic?
PS: My input table is large and I cannot partition it.
PS2: In an Impala recipe, it works.
-
Update, it's working.