Printing Dataset Size

Mrcello89
Mrcello89 Registered Posts: 5

Good morning,

We would like to print the size of every dataset in a specific flow zone using Python code, in order to monitor the disk space already in use.

Do you know if there is a way to accomplish this task?


Operating system used: Linux RedHat

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,090 Neuron
    edited July 17
    import dataiku

    # Connect to DSS and get a handle on the current project
    dataiku_client = dataiku.api_client()
    project_handle = dataiku_client.get_default_project()
    dataset_list = project_handle.list_datasets()

    for dataset in dataset_list:
        dataset_handle = project_handle.get_dataset(dataset['name'])
        zone_name = dataset_handle.get_zone().name
        # Fetch the dataset info once instead of once per nested lookup
        info = dataset_handle.get_info().info
        # 'status', 'size' or 'totalValue' can be missing; fall back to 0
        dataset_size = ((info.get('status') or {}).get('size') or {}).get('totalValue') or 0
        print("Dataset: " + dataset['name'] + " - Zone: " + zone_name + " - Size: " + str(dataset_size))
  • Mrcello89
    Mrcello89 Registered Posts: 5

    Hi!

    Great, it works perfectly. Thank you so much!
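
The original question asked about one specific flow zone, while the snippet above prints every dataset with its zone. A minimal sketch of how the output could be narrowed to one zone and totalled is shown below. The helper names (`extract_size`, `sizes_in_zone`, `collect_records`) and the zone name "Staging" are hypothetical, chosen only for illustration. `collect_records` uses only the calls already shown in the answer. The sample records stand in for real DSS data so the filtering logic can be run anywhere.

```python
def extract_size(info):
    """Return info['status']['size']['totalValue'], or 0 if any level is missing."""
    return ((info.get("status") or {}).get("size") or {}).get("totalValue") or 0


def sizes_in_zone(records, zone_name):
    """Keep only (name, zone, size) records in zone_name; return rows and total size."""
    rows = [(name, size) for name, zone, size in records if zone == zone_name]
    return rows, sum(size for _, size in rows)


def collect_records(project_handle):
    """Build (name, zone, size) records with the same API calls as the answer above."""
    records = []
    for dataset in project_handle.list_datasets():
        handle = project_handle.get_dataset(dataset["name"])
        size = extract_size(handle.get_info().info)
        records.append((dataset["name"], handle.get_zone().name, size))
    return records


# Hypothetical sample data standing in for collect_records(project_handle)
sample_records = [
    ("orders", "Default", 1024),
    ("customers", "Staging", 2048),
    ("events", "Staging", 0),
]

rows, total = sizes_in_zone(sample_records, "Staging")
for name, size in rows:
    print("Dataset: " + name + " - Size: " + str(size))
print("Total for zone: " + str(total))
```

Inside DSS, the sample list would presumably be replaced with `collect_records(dataiku.api_client().get_default_project())`. The sizes are reported in whatever unit DSS stores in `totalValue`, and a dataset whose status has not been computed shows up as 0.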
