Accessing root node

weaam7
weaam7 Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 11 ✭✭✭

Hello,

Is there a way to access the root node of a specific (selected) node in the project flow?

(I mean the root, not the direct predecessor(s).)


Operating system used: Windows

Answers

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker
    edited July 17

    Hi @weaam7,

    Do you mean the initial upstream datasets for a specific node in the flow? In this example, if you selected the dataset "Orders_by_Country_Category", would you want the "Orders" and "Customers" folders returned?

    Screen Shot 2023-03-06 at 4.28.41 PM.png

    You can recurse through the flow using the flow API to get all source items for any given "search node". Here is an example:

    import dataiku

    def get_roots(node, graph, roots):
        """Recursively collect the refs of all upstream nodes that have no predecessors."""
        if len(node['predecessors']) < 1:
            # no predecessors: this node is a root (source) of the flow
            if node['ref'] not in roots:
                roots.append(node['ref'])
            return
        for node_key in node['predecessors']:
            parent_node = graph.nodes[node_key]
            get_roots(parent_node, graph, roots)

    client = dataiku.api_client()
    # gets our current project graph
    project = client.get_default_project()
    flow = project.get_flow()
    graph = flow.get_graph()

    # replace with your search dataset/recipe ID
    search_item = 'Orders_by_Country_Category'
    search_node = graph.nodes[search_item]

    # recurse through the graph and collect all roots of the given search node
    roots = []
    get_roots(search_node, graph, roots)
    print(roots)


    For the example flow above, this returns the folder IDs for the two source folders in my flow:

    Screen Shot 2023-03-06 at 5.07.40 PM.png

    Let me know if you have any questions about this!

    Thanks,
    Sarina 
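
For anyone who wants to try the traversal logic without a running DSS instance, here is a minimal sketch of the same idea written iteratively. It assumes each node is a dict with 'ref' and 'predecessors' keys, as in the answer above. The function name find_roots and the sample nodes dictionary (including the recipe names in it) are hypothetical stand-ins for graph.nodes, not part of the Dataiku API. A visited set keeps each node from being processed more than once when branches of the flow share an ancestor.

```python
def find_roots(nodes, start_key):
    """Return the refs of all nodes upstream of start_key that have no predecessors."""
    roots = []
    seen = set()
    stack = [start_key]
    while stack:
        key = stack.pop()
        if key in seen:
            # already handled via another branch of the flow
            continue
        seen.add(key)
        node = nodes[key]
        predecessors = node.get('predecessors', [])
        if not predecessors:
            # no predecessors: this node is a root of the flow
            roots.append(node['ref'])
        else:
            stack.extend(predecessors)
    return roots


# Hypothetical stand-in for graph.nodes, shaped like the example flow
nodes = {
    'Orders': {'ref': 'Orders', 'predecessors': []},
    'Customers': {'ref': 'Customers', 'predecessors': []},
    'prepare_orders': {'ref': 'prepare_orders', 'predecessors': ['Orders']},
    'Orders_prepared': {'ref': 'Orders_prepared', 'predecessors': ['prepare_orders']},
    'join_orders': {'ref': 'join_orders', 'predecessors': ['Orders_prepared', 'Customers']},
    'Orders_by_Country_Category': {'ref': 'Orders_by_Country_Category',
                                   'predecessors': ['join_orders']},
}

print(sorted(find_roots(nodes, 'Orders_by_Country_Category')))
# prints ['Customers', 'Orders']
```

With a real project you would pass graph.nodes in place of the sample dictionary. Note that a node with no predecessors is returned as its own root.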
