How to find shared-in datasets of a project with Python API?

Haoran

Thanks in advance for your time.

I have a project, and I want to use the Python API to find out which of its datasets are shared in from other projects (shown with black icons).


I looked through previous Q&As but only found a way to list the datasets that are shared out to other projects:

def find_exposed_datasets(project):
    """Return the datasets this project exposes (shares out) to other projects."""
    result = []
    raw = project.get_settings().get_raw()
    exposed_objects = raw['exposedObjects']['objects']
    for obj in exposed_objects:
        if obj['type'] == 'DATASET':
            result.append(obj)
            # Optionally, collect the target projects instead:
            # for item in obj['rules']:
            #     result.append(item['targetProject'])
    return result


Thanks for your help!

Operating system used: Win11 enterprise

Answers

  • Turribeach

    I don't believe it's possible to get this at the instance level, so you will need to call client.list_project_keys() and loop through every project's exposed objects to collect the whole list. You can then do a lookup per project.
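The loop described above could look roughly like the sketch below. It relies only on the calls already mentioned in this thread (client.list_project_keys(), get_settings().get_raw(), and the 'exposedObjects' / 'rules' / 'targetProject' keys from the question), plus client.get_project(). The function name find_shared_in_datasets and the 'localName' key used for the dataset name are assumptions, so check them against the raw settings of your own instance.

```python
def find_shared_in_datasets(client, target_project_key):
    """Collect datasets that other projects expose to target_project_key.

    `client` is expected to behave like a DSS API client: it must offer
    list_project_keys() and get_project(key).get_settings().get_raw().
    """
    shared_in = []
    for source_key in client.list_project_keys():
        if source_key == target_project_key:
            continue  # a project does not share datasets into itself
        raw = client.get_project(source_key).get_settings().get_raw()
        exposed = raw.get('exposedObjects', {}).get('objects', [])
        for obj in exposed:
            if obj.get('type') != 'DATASET':
                continue
            targets = [rule.get('targetProject') for rule in obj.get('rules', [])]
            if target_project_key in targets:
                shared_in.append({
                    'source_project': source_key,
                    # 'localName' is an assumed key for the dataset name
                    'dataset': obj.get('localName'),
                })
    return shared_in
```

Inside DSS, the client would typically come from dataiku.api_client(). Note that this reports what other projects expose to the target project, which may be a superset of what actually appears in its flow.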

  • Turribeach

    Indeed, that does what I suggested, which is to loop through every project. Personally, I wouldn't run this scan inside a function that is called on demand, since it can take some time on large instances. I would build a small flow that refreshes a table every hour or so. Your function or report can then query that metadata instead.
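One way to separate the slow scan from the fast lookup, as suggested above, is sketched here. The names build_sharing_index and lookup_shared_in are hypothetical, and the 'localName' key is an assumption about the raw settings layout. In a scheduled Python recipe the rows could be written to an output dataset (for example through pandas and dataiku.Dataset(...).write_with_schema(...)); that step is left out so the sketch stays runnable outside DSS.

```python
def build_sharing_index(client):
    """Scan every project once and return one row per (dataset, target project).

    This is the slow part, meant to run on a schedule rather than on demand.
    """
    rows = []
    for source_key in client.list_project_keys():
        raw = client.get_project(source_key).get_settings().get_raw()
        for obj in raw.get('exposedObjects', {}).get('objects', []):
            if obj.get('type') != 'DATASET':
                continue
            for rule in obj.get('rules', []):
                rows.append({
                    'source_project': source_key,
                    # 'localName' is an assumed key for the dataset name
                    'dataset': obj.get('localName'),
                    'target_project': rule.get('targetProject'),
                })
    return rows


def lookup_shared_in(rows, target_project_key):
    """Fast lookup against the cached rows: datasets shared into one project."""
    return [row for row in rows if row['target_project'] == target_project_key]
```

The on-demand function or report then only calls lookup_shared_in on the stored rows, so its cost no longer depends on the number of projects on the instance.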
