Good morning,
We would like to print the size of every dataset in a specific flow zone using Python code, in order to monitor the disk space already used.
Do you know if there is a way to accomplish this task?
Operating system used: Linux RedHat
import dataiku

# Connect to the DSS instance and open the current project
dataiku_client = dataiku.api_client()
project_handle = dataiku_client.get_default_project()
dataset_list = project_handle.list_datasets()

for dataset in dataset_list:
    dataset_size = 0  # stays 0 when no size has been computed for the dataset
    dataset_handle = project_handle.get_dataset(dataset['name'])
    zone_name = dataset_handle.get_zone().name
    # The size is nested under status -> size -> totalValue and may be absent
    status = dataset_handle.get_info().info.get('status') or {}
    size = status.get('size') or {}
    if size.get('totalValue'):
        dataset_size = size['totalValue']
    print("Dataset: " + dataset['name'] + " - Zone: " + zone_name + " - Size: " + str(dataset_size))
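Since the question was about one specific zone, here is a minimal sketch of how the per-dataset values could be totalled per zone and printed in readable units. It is plain Python and does not call the Dataiku API: the helper names `total_size_by_zone` and `human_readable`, the sample records, and the zone name "Ingestion" are all hypothetical, and it assumes `totalValue` is a size in bytes.

```python
from collections import defaultdict


def total_size_by_zone(records):
    """Sum dataset sizes per flow zone.

    `records` is an iterable of (dataset_name, zone_name, size) tuples,
    for example collected in the loop instead of being printed.
    """
    totals = defaultdict(int)
    for _name, zone_name, size in records:
        totals[zone_name] += size or 0  # treat a missing size as 0
    return dict(totals)


def human_readable(num_bytes):
    """Format a byte count with binary units (assumes the input is in bytes)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


# Hypothetical sample data; in DSS these tuples would come from the dataset loop.
records = [
    ("orders", "Ingestion", 2048),
    ("customers", "Ingestion", 1024),
    ("report", "Reporting", 0),
]
totals = total_size_by_zone(records)
target_zone = "Ingestion"  # hypothetical zone name
print(target_zone + ": " + human_readable(totals.get(target_zone, 0)))
```

To use it with real data, append `(dataset['name'], zone_name, dataset_size)` to a list inside the loop and pass that list to `total_size_by_zone`.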
Hi!
Great, it works perfectly. Thank you so much!
You are welcome, please mark the reply as accepted solution.