
Accessing root node

weaam7
Level 2

Hello,

I am wondering if there is a way to access the root node of any specific/selected node in the project flow? 

(Root, not the direct predecessor/s) 


Operating system used: Windows

SarinaS
Dataiker

Hi @weaam7,

Do you mean the initial upstream datasets for a specific node in the flow? For example, if you selected the dataset "Orders_by_Country_Category" in the example flow below, would you want the "Orders" and "Customers" folders returned?

Screen Shot 2023-03-06 at 4.28.41 PM.png

You can recurse through the flow using the flow API to get all source items for any given "search node". Here is an example:

import dataiku

def get_roots(node_key, graph, roots, visited):
    # skip nodes already seen, so shared upstream branches are only walked once
    if node_key in visited:
        return
    visited.add(node_key)
    node = graph.nodes[node_key]
    if len(node['predecessors']) < 1:
        # no predecessors: this node is a root (source) of the flow
        roots.append(node['ref'])
        return
    for pred_key in node['predecessors']:
        get_roots(pred_key, graph, roots, visited)

client = dataiku.api_client()
# gets our current project graph
project = client.get_default_project()
flow = project.get_flow()
graph = flow.get_graph()

# replace with your search dataset/recipe ID
search_item = 'Orders_by_Country_Category'

# recurse through the graph and collect all roots of the given search node
roots = []
get_roots(search_item, graph, roots, set())
print(roots)

 
For the example flow above, this returns the folder IDs of the two source folders in my flow:

Screen Shot 2023-03-06 at 5.07.40 PM.png
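
If you want to try the traversal logic without a DSS instance, here is a minimal sketch of the same idea run against a plain dictionary. It assumes the graph nodes look like the ones used above (a dict keyed by node ID, where each node has a 'ref' and a list of 'predecessors'). The toy flow, the intermediate node names "prepare_orders" and "join_orders_customers", and the function name find_roots are made up for illustration and are not part of the Dataiku API. It uses an explicit stack instead of recursion, which may be preferable for very deep flows.

```python
def find_roots(nodes, start_key):
    """Return the sorted refs of all upstream nodes that have no predecessors."""
    roots = []
    visited = set()
    stack = [start_key]
    while stack:
        key = stack.pop()
        if key in visited:
            # already handled via another branch of the flow
            continue
        visited.add(key)
        node = nodes[key]
        if not node["predecessors"]:
            # no predecessors: this is a root (source) node
            roots.append(node["ref"])
        else:
            stack.extend(node["predecessors"])
    return sorted(roots)


# Hypothetical toy flow; only "Orders", "Customers" and
# "Orders_by_Country_Category" come from the example in this thread.
toy_nodes = {
    "Orders": {"ref": "Orders", "predecessors": []},
    "Customers": {"ref": "Customers", "predecessors": []},
    "prepare_orders": {"ref": "prepare_orders", "predecessors": ["Orders"]},
    "join_orders_customers": {
        "ref": "join_orders_customers",
        "predecessors": ["prepare_orders", "Customers"],
    },
    "Orders_by_Country_Category": {
        "ref": "Orders_by_Country_Category",
        "predecessors": ["join_orders_customers", "prepare_orders"],
    },
}

print(find_roots(toy_nodes, "Orders_by_Country_Category"))
# prints ['Customers', 'Orders']
```

Note that "prepare_orders" is reachable through two paths here; the visited set keeps "Orders" from being reported twice.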

Let me know if you have any questions about this!

Thanks,
Sarina 
