API Returns Only Enabled Metrics After First Dataset Upload
When I upload a dataset for the first time, dataset.get_settings().get_raw().get('metrics') returns enabled metrics only by default. After toggling any metric in the UI (enable/disable), the API then returns ALL metrics, including disabled ones.
Is there a way to programmatically initialize all metrics without UI interaction, so the API returns all available metrics immediately after upload?
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,630 Neuron

I've seen this in a few DSS API objects: Dataiku fakes the state of the object when you call the API, based on some hardcoded default, and only actually persists the object when you make a change (DSS User Profiles behave that way). One way around it is to make a change to the object and save it via the API. Try adding a dummy metric and saving the object:
Make sure you get a new object handle after the save.
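As a sketch of that approach, the helper below appends a dummy probe to the probes list if one of the same type isn't there yet. The probe payload mirrors the 'col_stats' probe discussed later in this thread, but its exact fields are an assumption, not an official schema; in DSS you would pull the probes list from dataset.get_settings().get_raw()['metrics']['probes'], call save(), and then re-fetch the settings handle.

```python
# Assumed probe payload, modelled on the 'col_stats' probe shown elsewhere
# in this thread -- not an official Dataiku schema.
DUMMY_PROBE = {
    'type': 'col_stats',
    'enabled': True,
    'computeOnBuildMode': 'NO',
    'meta': {'name': 'Columns statistics', 'level': 2},
    'configuration': {'aggregates': []},
}

def add_dummy_probe(probes):
    """Append the dummy probe unless a probe of the same type already exists.

    Returns True when a probe was added, i.e. when the settings need saving.
    """
    if any(p['type'] == DUMMY_PROBE['type'] for p in probes):
        return False
    probes.append(dict(DUMMY_PROBE))
    return True

# Stand-in for dataset.get_settings().get_raw()['metrics']['probes']:
probes = [{'type': 'basic', 'enabled': True,
           'meta': {'name': 'Basic data', 'level': 0}}]
if add_dummy_probe(probes):
    # In DSS you would now call dataset_settings.save() and then
    # dataset.get_settings() again to pick up the persisted state.
    print('probe added, settings need saving')
```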
-
Thanks for the insight! I tried adding a dummy metric and refreshing the object handle, but it still only returns enabled metrics.
The only thing that works is manually toggling metrics in the UI. Is there a specific API call that replicates what the UI does when you click the metrics toggle?
I also tried modifying the existing default metrics (changing their configuration or enabled state), but that didn't force the full metrics list to appear.
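For reference, the in-place toggle I tried amounts to something like the pure-Python sketch below, which flips the 'enabled' flag on probes of a given type. In DSS the probes list would come from get_settings() and you would call save() afterwards; the helper name is mine, not a Dataiku API.

```python
def toggle_probes(probes, probe_type, enabled):
    """Set the 'enabled' flag on every probe of the given type.

    Returns the number of probes whose flag actually changed.
    """
    changed = 0
    for probe in probes:
        if probe['type'] == probe_type and probe.get('enabled') != enabled:
            probe['enabled'] = enabled
            changed += 1
    return changed

# Stand-in for dataset.get_settings().get_raw()['metrics']['probes']:
probes = [{'type': 'basic', 'enabled': True,
           'meta': {'name': 'Basic data', 'level': 0}}]
toggle_probes(probes, 'basic', False)  # disable the default 'basic' probe
```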
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,630 Neuron

Can you share your API code please? Use a Code Block (the </> icon).
-
```python
col_stats_probe = {
    'type': 'col_stats',
    'enabled': True,
    'computeOnBuildMode': 'NO',
    'meta': {'name': 'Columns statistics', 'level': 2},
    'configuration': {'aggregates': []}
}

dataset_definition = dataset('test').get_definition()
ds_metrics_probes = dataset_definition['metrics']['probes']

if not any(p['type'] == 'col_stats' for p in ds_metrics_probes):
    ds_metrics_probes.append(col_stats_probe)

for probe in ds_metrics_probes:
    if probe['type'] == 'basic':
        probe['enabled'] = False

dataset('test').set_definition(dataset_definition)
```
I tried the same thing via get_settings() and save(), with the same result.

-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,630 Neuron

So I tested on my side as well, and even if you modify an existing metric via the API, the disabled ones don't get added. Looking at the browser console on the dataset metrics screen, I can see they get pushed by the GUI itself when you save: the screen calls an internal, non-public API passing the 4 missing disabled metrics. So if you want them added, you will need to add them via the API yourself. Here is a code snippet that adds the 4 missing disabled metrics for a newly uploaded dataset:
```python
import dataiku

client = dataiku.api_client()
project = client.get_default_project()
dataset = project.get_dataset('test')

dataset_settings = dataset.get_settings()
probes = dataset_settings.get_raw().get('metrics')['probes']
metric_names = [probe['meta']['name'] for probe in probes]

# The 4 metric probes the GUI adds via its internal API on save
default_probes = [
    {'type': 'col_stats', 'enabled': True, 'computeOnBuildMode': 'NO',
     'meta': {'name': 'Columns statistics', 'level': 2},
     'configuration': {'aggregates': []}},
    {'type': 'adv_col_stats', 'enabled': True, 'computeOnBuildMode': 'NO',
     'meta': {'name': 'Most frequent values', 'level': 3},
     'configuration': {'aggregates': [], 'numberTopValues': 10}},
    {'type': 'percentile_stats', 'enabled': True, 'computeOnBuildMode': 'NO',
     'meta': {'name': 'Columns percentiles', 'level': 4},
     'configuration': {'aggregates': []}},
    {'type': 'verify_col', 'enabled': True, 'computeOnBuildMode': 'NO',
     'meta': {'name': 'Data validity', 'level': 4},
     'configuration': {'aggregates': []}},
]

metric_added = False
for default_probe in default_probes:
    if default_probe['meta']['name'] not in metric_names:
        probes.append(default_probe)
        metric_added = True

if metric_added:
    dataset_settings.save()
```

Note I added them all as enabled, but that was just to test that they show on the GUI on refresh. I suspect the metrics may vary depending on the dataset type. If that's the case you will need to inspect the metric probes for the different dataset types with:
```python
dataset_settings.get_raw().get('metrics')['probes']
```

and see how they get defined. Then add them manually for that specific dataset type.
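One way to compare probes across dataset types is to reduce each probes list to (type, name, enabled) triples and diff the results between datasets. The summarize helper below is mine, not a Dataiku API, and the sample list stands in for what get_settings() would return in DSS:

```python
def summarize_probes(probes):
    """Return (type, name, enabled) triples for a probes list,
    so lists from different dataset types are easy to diff."""
    return [(p['type'], p['meta']['name'], p.get('enabled', False))
            for p in probes]

# Stand-in for dataset_settings.get_raw().get('metrics')['probes']:
probes = [
    {'type': 'basic', 'enabled': True,
     'meta': {'name': 'Basic data', 'level': 0}},
    {'type': 'col_stats', 'enabled': False,
     'meta': {'name': 'Columns statistics', 'level': 2}},
]
for row in summarize_probes(probes):
    print(row)
```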