Auditing projects by tags and statuses
Once we have properly tagged projects (using global tags), what options do we have as admins to audit which projects have certain tags or certain project statuses? As users, we can search by these on the project list page, but is there any way to export, share, or archive the results? Thanks!
Best Answer
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,166 Neuron
This should do:
import datetime
import dataiku
import pandas as pd

client = dataiku.api_client()
project_keys = client.list_project_keys()

df_project_data = pd.DataFrame(columns=['project_key', 'project_name', 'project_status', 'project_tags', 'has_active_scenarios', 'project_owner', 'project_created_by', 'project_created_on', 'project_last_modified_on'])

for project_key in project_keys:
    project = client.get_project(project_key)
    all_scenarios = project.list_scenarios()
    project_status = project.get_settings().get_raw().get('projectStatus')
    project_summary = project.get_summary()
    project_name = project_summary['name']
    project_owner = project_summary['ownerLogin']

    # Very old projects may have no creationTag, so fall back to empty values
    if project_summary.get('creationTag', ''):
        project_created_by = project_summary['creationTag']['lastModifiedBy']['login']
        project_created_on = datetime.datetime.utcfromtimestamp(int(project_summary['creationTag']['lastModifiedOn']) / 1000).strftime("%d-%b-%Y %H:%M:%S")
    else:
        project_created_by = ''
        project_created_on = ''

    # Timestamps are in epoch milliseconds
    project_last_modified_on = datetime.datetime.utcfromtimestamp(int(project_summary['versionTag']['lastModifiedOn']) / 1000).strftime("%d-%b-%Y %H:%M:%S")
    project_tags = list(project.get_tags()['tags'].keys())

    # Flag the project as soon as one active scenario with an active trigger is found
    has_active_scenarios = ''
    for scenario in all_scenarios:
        scn_settings = project.get_scenario(scenario.get('id')).get_settings()
        scn_settings_raw = scn_settings.get_raw()
        if scn_settings_raw['active']:
            active_triggers = [x for x in scn_settings_raw.get('triggers') if x['active']]
            if active_triggers:
                has_active_scenarios = 'Yes'
                break

    # Append one row per project
    data_record = pd.DataFrame.from_dict({'project_key': [project_key], 'project_name': [project_name], 'project_status': [project_status], 'project_tags': [project_tags], 'has_active_scenarios': [has_active_scenarios], 'project_owner': [project_owner], 'project_created_by': [project_created_by], 'project_created_on': [project_created_on], 'project_last_modified_on': [project_last_modified_on]})
    df_project_data = pd.concat([df_project_data, data_record], ignore_index=True)
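To get from that DataFrame to something you can export or share, a filter plus a CSV dump may be enough. The following is only a sketch: the sample rows, the status values, and the helper name filter_projects are made up for illustration and stand in for the real df_project_data built by the script above.

```python
import pandas as pd

# Hypothetical sample rows shaped like df_project_data from the script above.
df_project_data = pd.DataFrame({
    'project_key': ['SALES', 'CHURN', 'SANDBOX'],
    'project_status': ['In production', 'Draft', 'Draft'],
    'project_tags': [['finance', 'prod'], ['marketing'], []],
})

def filter_projects(df, tag=None, status=None):
    """Return the rows that carry `tag` and/or have `status` (hypothetical helper)."""
    mask = pd.Series(True, index=df.index)
    if tag is not None:
        # project_tags holds a list per row, so test membership row by row
        mask &= df['project_tags'].apply(lambda tags: tag in tags)
    if status is not None:
        mask &= df['project_status'] == status
    return df[mask]

draft_projects = filter_projects(df_project_data, status='Draft')
finance_projects = filter_projects(df_project_data, tag='finance')

# Join the tag lists into plain strings so the column stays readable in a CSV
export = draft_projects.assign(project_tags=draft_projects['project_tags'].str.join(', '))
csv_text = export.to_csv(index=False)  # pass a file path instead to write a file
```

The resulting CSV can then be archived or shared like any other file.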
Answers
-
Turribeach
Project tags can be searched using the Search DSS box at the top right. Search for a tag, click the DSS Items tab, and you can then filter projects by that tag. I am not aware of any way to search by project status. In any case, both project tags and statuses are available in the Python API, so it's fairly easy to build a dataset with all of them and then use Explore, dashboards, or an Excel export to do any filtering you need.
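One thing that may help when filtering such a dataset in Explore or Excel is to store one row per project and tag instead of a list of tags per project. Here is a rough pandas sketch of that reshaping; the sample rows, the status values, and the '(untagged)' placeholder are assumptions for illustration, not anything DSS produces.

```python
import pandas as pd

# Hypothetical sample rows: one row per project, tags stored as a list.
df_project_data = pd.DataFrame({
    'project_key': ['SALES', 'CHURN', 'SANDBOX'],
    'project_status': ['In production', 'Draft', 'Draft'],
    'project_tags': [['finance', 'prod'], ['finance'], []],
})

# One row per (project, tag); an empty tag list becomes NaN after explode,
# so give untagged projects an explicit placeholder value.
df_tags = df_project_data.explode('project_tags').rename(columns={'project_tags': 'project_tag'})
df_tags['project_tag'] = df_tags['project_tag'].fillna('(untagged)')

# Simple audit summaries: number of projects per tag and per status
tag_counts = df_tags.groupby('project_tag')['project_key'].nunique()
status_counts = df_project_data['project_status'].value_counts()
```

Inside DSS, a DataFrame like df_tags could then be written to an output dataset (for example with write_with_schema on a dataiku.Dataset) and filtered from there.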
-
Thanks for the reply. Are there any example scripts available to pull the metadata for the projects on an instance using the Python API? Project name, tags, status, project creator, create date, last update date?
-
THANK YOU! This worked almost flawlessly for me. The only minor issue I ran into was that the project_summary['creationTag'] didn't exist for at least one project on our instance. I got around that with a try/except and gave it a default value on the exception.
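For anyone who prefers the try/except route, it could look roughly like this. This is only a sketch: the helper name creation_info and the two sample summaries are hypothetical, and the dictionaries just mimic the keys the script reads from project.get_summary().

```python
import datetime

def creation_info(project_summary, default=''):
    """Return (created_by, created_on), or defaults when creationTag is missing."""
    try:
        tag = project_summary['creationTag']
        created_by = tag['lastModifiedBy']['login']
        # lastModifiedOn is in epoch milliseconds
        created_on = datetime.datetime.fromtimestamp(
            int(tag['lastModifiedOn']) / 1000, tz=datetime.timezone.utc
        ).strftime("%d-%b-%Y %H:%M:%S")
    except (KeyError, TypeError):
        # Missing or malformed creationTag: fall back to the default value
        return default, default
    return created_by, created_on

# Hypothetical summaries: an old project without creationTag and a newer one with it
old_project = {'name': 'Legacy', 'ownerLogin': 'admin'}
new_project = {
    'name': 'Recent',
    'ownerLogin': 'admin',
    'creationTag': {'lastModifiedBy': {'login': 'jdoe'}, 'lastModifiedOn': 0},
}

old_info = creation_info(old_project)
new_info = creation_info(new_project)
```

Catching only KeyError and TypeError keeps unrelated failures, such as API errors, visible instead of silently replacing them with the default.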
-
Turribeach
You are welcome. I fixed the creationTag issue with an if check. It happens for very old projects that don't have a creationTag; it looks like Dataiku did not bother to create the tag when those projects were migrated.