Printing Dataset Size

Mrcello89

Good morning, 

We would like to print the size of every dataset in a specific flow zone using Python code, in order to monitor the disk space already in use.

Do you know if there is a way to accomplish this task?


Operating system used: Linux RedHat

Turribeach
import dataiku

# Connect to the DSS instance and open the current project
dataiku_client = dataiku.api_client()
project_handle = dataiku_client.get_default_project()
dataset_list = project_handle.list_datasets()

for dataset in dataset_list:
    dataset_handle = project_handle.get_dataset(dataset['name'])
    zone_name = dataset_handle.get_zone().name
    # Fetch the dataset info once; 'status', 'size' or 'totalValue' may be missing
    dataset_info = dataset_handle.get_info().info
    size_info = (dataset_info.get('status') or {}).get('size') or {}
    # Fall back to 0 when no size has been recorded for the dataset
    dataset_size = size_info.get('totalValue') or 0
    print("Dataset: " + dataset['name'] + " - Zone: " + zone_name + " - Size: " + str(dataset_size))
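If you only need one zone, or a total per zone, you could collect the values from the loop above into tuples and summarise them afterwards. This is only a sketch: `total_size_by_zone` and `format_size` are hypothetical helper names, the sample rows are invented, and it assumes the size value is a byte count, which you should verify on your instance.

```python
from collections import defaultdict


def total_size_by_zone(records):
    """Sum sizes per zone from (dataset_name, zone_name, size) tuples."""
    totals = defaultdict(int)
    for _name, zone_name, size in records:
        # Treat a missing size (None) as 0, like the loop above does
        totals[zone_name] += size or 0
    return dict(totals)


def format_size(num_bytes):
    """Render a size with binary units (assumes the value is in bytes)."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if value < 1024 or unit == "TiB":
            return f"{value:.1f} {unit}"
        value /= 1024


# Hypothetical sample rows; in practice, append one tuple per dataset
# inside the loop instead of printing.
records = [
    ("orders", "Ingestion", 2048),
    ("customers", "Ingestion", 1024),
    ("scores", "Reporting", None),
]

totals = total_size_by_zone(records)
for zone_name, size in sorted(totals.items()):
    print(zone_name + ": " + format_size(size))
```

To report a single zone, look up its name in `totals`, or filter `records` on the zone name before summing.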
Mrcello89
Author

Hi!

Great, it works perfectly. Thank you so much!


You are welcome. Please mark the reply as the accepted solution.
