Get access to partition target within Python Recipe

s-cordo · ‎01-11-2021

Hi,

I would like to access, from within my Python recipe, the partition that is about to be built, so that I can adapt my code to the targeted partition.

The recipe takes a non-partitioned HDFS dataset as input and writes to a file-based partitioned HDFS dataset whose partitioning dimension is categorical.

I tried to use what is described there, but the function

dataset.get_write_partition()

didn't work for me.

Is there a way to accomplish this?

Thanks 🙂

PS: I'm using DSS 7.0

dimitri · ‎01-11-2021

Hi @s-cordo

You can read the name of the partition being built from the dku_flow_variables Python dictionary, which is available as dataiku.dku_flow_variables.
In your example, since your partitioning dimension is named thematique_name, you should be able to read its value with

dataiku.dku_flow_variables["DKU_DST_thematique_name"]

dataset.get_write_partition() is deprecated. We'll update the page you linked; thanks for the heads up!
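
For illustration, here is a minimal sketch of reading that key and branching on its value. The helper name target_partition and the sample value "sports" are hypothetical. The dictionary is passed in as an argument so the snippet also runs outside DSS; inside a recipe you would pass dataiku.dku_flow_variables instead of the stand-in.

```python
def target_partition(flow_variables, dimension_name):
    """Return the value of the output partition dimension being built.

    flow_variables: a dict such as dataiku.dku_flow_variables (recipe only).
    dimension_name: the partitioning dimension, e.g. "thematique_name".
    """
    # Follows the key shown above: "DKU_DST_" + the dimension name.
    return flow_variables["DKU_DST_" + dimension_name]


# Hypothetical stand-in for dataiku.dku_flow_variables, for illustration only.
sample_variables = {"DKU_DST_thematique_name": "sports"}

partition = target_partition(sample_variables, "thematique_name")
if partition == "sports":
    message = "building the sports partition"
else:
    message = "building partition " + partition
print(message)
```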

Have a great day!

s-cordo · ‎01-11-2021

Hi @dimitri ,

Thanks for your answer.

However, I was not able to apply your solution.

Maybe

dataiku.dku_flow_variables["DKU_DST_thematique_name"]

doesn't exist in DSS 7.0?

I tried

dataiku.get_flow_variables()

which looked equivalent, but I got None even though my partition seems to be well defined in the Flow.

Thanks for your help

dimitri · ‎01-11-2021

This is because you are running the script from a notebook. The partition identifiers to build are configured on the recipe, so they cannot be accessed from a notebook, and the dku_flow_variables dictionary is only defined when the code runs as the recipe.

Note that both dataiku.dku_flow_variables and dataiku.get_flow_variables() will work and return the same result from the recipe, even with DSS 7.0.
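
If you want the same code to remain usable while prototyping in a notebook, one option is to fall back to a hand-picked partition value when no flow variables are available. This is only a sketch under that assumption: resolve_partition and the fallback value "sports" are hypothetical, and the import is guarded so the snippet also runs where the dataiku package is not installed.

```python
try:
    import dataiku  # available inside DSS
except ImportError:
    dataiku = None  # lets the sketch run outside DSS


def resolve_partition(flow_variables, key, fallback):
    """Return flow_variables[key], or fallback when no flow variables exist."""
    if not flow_variables:
        # Notebook case: nothing is configured, so use the hand-picked value.
        return fallback
    # Recipe case: a missing key raises KeyError rather than being masked.
    return flow_variables[key]


# getattr covers the case where the attribute is undefined outside a recipe.
flow_variables = getattr(dataiku, "dku_flow_variables", None)
partition = resolve_partition(flow_variables, "DKU_DST_thematique_name", "sports")
```

The fallback is used only when there are no flow variables at all, so a mistyped dimension name still fails loudly inside the recipe.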

Hope it helps!

s-cordo · ‎01-11-2021

Indeed, it worked like a charm inside my Python recipe 🙂

Thanks a lot !
