Recipe get_json_payload does not describe the input datasets

tomas Registered, Neuron 2022 Posts: 120 ✭✭✭✭✭

Hi,

I would like to read the recipe (join recipe) inputs and columns, using the .get_json_payload() method. I have an issue with identifying the input datasets. For example the definition is:


{'table2': 1,
'table1': 0,
'conditionsMode': 'AND',
'rightLimit': {'decisionColumn': {}},
'type': 'LEFT',
'outerJoinOnTheLeft': True,
'on': [{'column1': {'name': 'subscriber_id', 'table': 0},
'column2': {'name': 'subscriber_id', 'table': 1},
'maxDistance': 1,
'type': 'EQ'},
{'column1': {'name': 'churn_date', 'table': 0},
'column2': {'name': 'effective_date', 'table': 1},
'type': 'GTE'}]}],

So I see that table 0 is the main table for the left join (the one with churn_date) and table 1 is the joined table. But in the recipe inputs the order is different: the first item is the joined table and the second item is the main table:


.get_definition_and_payload().get_recipe_inputs()['main']['items']

So how can I get the name of the datasets/tables when I have the index (0,1,2 etc)?

Thanks


Best Answer

  • spicquenot
    spicquenot Dataiker Posts: 4 Dataiker
    Answer ✓

    Hello,

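    In the join definition you pasted, the indexes (table1, table2) refer to virtual inputs (a virtual way of representing the input tables before the joins), not directly to positions in the recipe inputs. Each virtual input carries an "index" field that points to the matching item in the recipe inputs.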

    The following code would help you get the proper table names:


    recipe = <the_recipe_object>

    payload_def = recipe.get_definition_and_payload()

    # get the definitions
    payload = payload_def.get_json_payload()
    join_def = payload["joins"][0]  # get the first join definition
    virtual_inputs = payload["virtualInputs"]

    # get the virtual input indexes
    main_vi_idx = join_def["table1"]
    sec_vi_idx = join_def["table2"]

    # get the table input indexes
    main_table_idx = virtual_inputs[main_vi_idx]["index"]
    sec_table_idx = virtual_inputs[sec_vi_idx]["index"]

    # get all the table inputs
    inputs = payload_def.get_recipe_inputs()

    # retrieve the table input names
    main_table_name = inputs["main"]["items"][main_table_idx]["ref"]
    sec_table_name = inputs["main"]["items"][sec_table_idx]["ref"]
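If the recipe has several joins, the same lookup can be wrapped in a small helper and checked offline. The following is only a sketch: resolve_join_inputs is a hypothetical helper name, and the sample payload and dataset names (subscribers, effective_dates) are made up. It assumes the payload has the same shape as above, with "joins", "virtualInputs" and an "index" field on each virtual input.

```python
def resolve_join_inputs(payload, recipe_inputs, role="main"):
    """Return one (table1_ref, table2_ref) pair per join in the payload.

    payload       -- dict as returned by get_json_payload()
    recipe_inputs -- dict as returned by get_recipe_inputs()
    """
    items = recipe_inputs[role]["items"]
    virtual_inputs = payload["virtualInputs"]

    def ref_of(vi_idx):
        # virtual input index -> recipe input index -> dataset reference
        return items[virtual_inputs[vi_idx]["index"]]["ref"]

    return [(ref_of(j["table1"]), ref_of(j["table2"])) for j in payload["joins"]]


# Hand-made sample data (hypothetical names), mimicking the case in the
# question: the recipe inputs are listed in the opposite order to the join.
sample_payload = {
    "virtualInputs": [{"index": 1}, {"index": 0}],
    "joins": [{"table1": 0, "table2": 1, "type": "LEFT"}],
}
sample_inputs = {
    "main": {"items": [{"ref": "effective_dates"}, {"ref": "subscribers"}]}
}

pairs = resolve_join_inputs(sample_payload, sample_inputs)
print(pairs)  # [('subscribers', 'effective_dates')]
```

With a real recipe you would pass payload and inputs from the snippet above instead of the sample dictionaries.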

    Regards,

    Sam
