How to calculate number of datasets used in merge recipe

shak99
Level 1
Hi all,

Is it possible to determine how many datasets are being ingested by a stack recipe and then use that value later on in a prepare recipe? 

Thanks!


Operating system used: Windows

Jurre
Level 5

Hi @shak99 , welcome!

In a visual Stack recipe, an option is available to include an origin column that indicates which dataset each row came from. See the attached screenshot (merge.jpg). Hope this helps!

shak99
Level 1
Author

Thanks @Jurre. I understand how to accomplish this visually, but I would like to do it programmatically. For instance, can I use a Python recipe to determine how many datasets are being used as inputs to a 'stack' recipe?

 

edit: In the post I wrote 'merge recipe', but I actually meant 'stack recipe'.

JuanE
Dataiker

Hello @shak99

You can do this by leveraging the DSS Python API. Here's a sample code snippet to get you started:

import dataiku

client = dataiku.api_client()
project = client.get_project("PROJECTKEY")  # replace with your project key
stack_recipe = project.get_recipe("the_stack_recipe_name")  # replace with the name of your stack recipe
stack_recipe_settings = stack_recipe.get_settings()
# The recipe inputs are grouped by role; the stacked datasets are listed under 'main' -> 'items'
number_of_inputs = len(stack_recipe_settings.get_recipe_inputs()['main']['items'])
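
For the second half of the question (using the value later in a prepare recipe), one possible approach is to save the count as a project variable. The sketch below is only an illustration under some assumptions: the helper names `count_recipe_inputs` and `store_input_count`, the variable name `stack_input_count`, and the sample dataset names are all hypothetical, and the `"ref"` key in the sample payload is assumed rather than taken from the snippet above. The counting helper is kept separate from the DSS calls so it can be tried on a plain dict.

```python
def count_recipe_inputs(recipe_inputs, role="main"):
    """Count the items listed under one input role of a recipe's inputs dict."""
    return len(recipe_inputs.get(role, {}).get("items", []))


def store_input_count(project_key, recipe_name, variable_name="stack_input_count"):
    """Save a recipe's input count as a project variable (only works inside DSS)."""
    import dataiku  # imported here so count_recipe_inputs stays usable outside DSS

    project = dataiku.api_client().get_project(project_key)
    settings = project.get_recipe(recipe_name).get_settings()
    count = count_recipe_inputs(settings.get_recipe_inputs())

    # Project variables come back as a dict with "standard" and "local" sections
    variables = project.get_variables()
    variables["standard"][variable_name] = count
    project.set_variables(variables)
    return count


# Hypothetical payload mimicking the ['main']['items'] structure used above
sample_inputs = {
    "main": {"items": [{"ref": "sales_2021"}, {"ref": "sales_2022"}, {"ref": "sales_2023"}]}
}
print(count_recipe_inputs(sample_inputs))  # prints 3
```

If `store_input_count("PROJECTKEY", "the_stack_recipe_name")` runs before the prepare recipe is built (for example from a scenario step), the prepare recipe should be able to reference the value in a formula as `${stack_input_count}`, assuming the variable name above is kept.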

 

I hope that helps.

Jurre
Level 5

Hi @shak99 ,

I'm not familiar enough with Python to fully answer your question, but possibly something with adding MultiIndex keys would do the trick. However, I strongly suggest that you wait a little for the Pythonistas to join the conversation here. Cheers, Jurre
