Join us on Wednesday, June 3rd for a deep dive into Customer Predictive Analytics Learn more

Recipe get_json_payload does not describe the input datasets

Level 4
Recipe get_json_payload does not describe the input datasets

Hi, 


I would like to read the recipe (join recipe) inputs and columns, using the .get_json_payload() method. I have an issue with identifying the input datasets. For example the definition is:




{'table2': 1,
'table1': 0,
'conditionsMode': 'AND',
'rightLimit': {'decisionColumn': {}},
'type': 'LEFT',
'outerJoinOnTheLeft': True,
'on': [{'column1': {'name': 'subscriber_id', 'table': 0},
'column2': {'name': 'subscriber_id', 'table': 1},
'maxDistance': 1,
'type': 'EQ'},
{'column1': {'name': 'churn_date', 'table': 0},
'column2': {'name': 'effective_date', 'table': 1},
'type': 'GTE'}]}],


So I see the table 0 is the main table for the lef join (with churn date) and table 1 is the joined table. But in the recipe inputs the order is different, the first item is the joined table and the second item is the main table:




.get_definition_and_payload().get_recipe_inputs()['main']['items']


So how can I get the name of the datasets/tables when I have the index (0,1,2 etc)?



Thanks    



 

0 Kudos
2 Replies
Dataiker
Dataiker

Hello,



In the join definition you pasted, the first index refer to the virtual inputs (a virtual way of representing the input tables before the joins).



The following code would help you get the proper table names:




recipe = <the_recipe_object>

payload_def = recipe.get_definition_and_payload()

# get the definitions
payload = payload_def.get_json_payload()
join_def = payload["joins"][0] # get the first join definition
virtual_inputs = payload["virtualInputs"]

# get the virtual input indexes
main_vi_idx = join_def["table1"]
sec_vi_idx = join_def["table2"]

# get the table input indexes
main_table_idx = virtual_inputs[main_vi_idx]["index"]
sec_table_idx = virtual_inputs[sec_vi_idx]["index"]

# get all the table inputs
inputs = payload_def.get_recipe_inputs()

# retrieve the table input names
main_table_name = inputs["main"]["items"][main_table_idx]["ref"]
sec_table_name = inputs["main"]["items"][sec_table_idx]["ref"]


 



Regards,



Sam



 

Level 4
Author
Quite complicated but works 🙂
Thanks
0 Kudos
Labels (2)