delete dataset without clearing the metrics values
Tomas
Hi,
I am using the Python public API to clear a dataset. This deletes the data, but also the history of the metric values.
Is it possible to delete just the data? One more related question: when the dataset is linked to a folder, the clear() method does not remove the data. Do I have to look up the folder and call clear() on the folder object?
prj = client.get_project( pkey )
ds = prj.get_dataset( dsname )
ds.clear()
Thanks
Best Answer
Hi,
* Clearing a dataset does not clear its metrics history. On a partitioned dataset, most metrics become temporarily invisible, since they are per-partition and there are no partitions to show. However, if you export a metrics dataset, you'll see the history.
* A "files on folder" dataset is only a view, so by design clearing it does not remove files. You may indeed clear the folder instead.
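For the second point, here is a minimal sketch of emptying the underlying folder through the public API. It assumes the folder handle (for example one obtained with prj.get_managed_folder(folder_id), where folder_id is a placeholder for your folder's id) exposes list_contents() returning a dict with an "items" list of {"path": ...} entries, and delete_file(path). The helper name clear_folder is made up for illustration; check these calls against your dataikuapi version before relying on them.

```python
def clear_folder(folder):
    """Delete every file listed in a managed-folder handle.

    Assumes `folder` offers list_contents() -> {"items": [{"path": ...}, ...]}
    and delete_file(path); returns the list of paths that were deleted.
    """
    listing = folder.list_contents()
    # Collect the paths first so we do not iterate while deleting.
    paths = [item["path"] for item in listing.get("items", [])]
    for path in paths:
        folder.delete_file(path)
    return paths


# Hypothetical usage (folder_id is a placeholder, not taken from the thread):
# folder = prj.get_managed_folder(folder_id)
# deleted = clear_folder(folder)
```

The "files on folder" dataset itself is left untouched; it simply has nothing to show once the folder is empty.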
Answers
This is a non-partitioned dataset. After the clear() action the metrics are still visible in the UI, but they are not accessible via the Python API:
ds.get_metric_history( 'records:COUNT_RECORDS' )['values']
# returns ERROR: com.dataiku.dip.server.controllers.NotFoundException: Metric: 'records:COUNT_RECORDS' does not exist on dataset
After rebuilding the dataset, the history is available again via the API.
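Given that behaviour, one possible workaround is to read the history before clearing and to tolerate the error afterwards. The sketch below is not an official recipe: the helper names are made up for illustration, and it assumes that get_metric_history() returns a dict with a 'values' key (as in the call above) and that the client raises an ordinary Python exception when the server reports NotFoundException.

```python
def metric_history_or_empty(ds, metric_id):
    """Return the metric's history values, or [] if the API cannot find the metric."""
    try:
        return ds.get_metric_history(metric_id)["values"]
    except Exception:
        # Assumption: the server-side NotFoundException surfaces as a Python
        # exception. Narrow this to your client's exception type if you know it.
        return []


def clear_keeping_history(ds, metric_id):
    """Snapshot the metric history, then clear the dataset; return the snapshot."""
    saved = metric_history_or_empty(ds, metric_id)
    ds.clear()
    return saved


# Hypothetical usage with the objects from the question:
# history = clear_keeping_history(ds, 'records:COUNT_RECORDS')
```

The snapshot lives only in memory, so persist it yourself if you need it before the dataset is rebuilt.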