Auditing projects by tags and statuses

gskoff
gskoff Partner, Registered Posts: 8 Partner

Once we properly tag projects (using global tags), what options do we have as an admin to audit which projects have certain tags or certain project statuses? As a user, we can search by these on the project list page, but is there any way to export, share, or archive the results? Thanks!


Best Answer

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,166 Neuron
    edited July 17 Answer ✓

    This should do:

    import datetime

    import dataiku
    import pandas as pd

    DATE_FORMAT = "%d-%b-%Y %H:%M:%S"
    COLUMNS = ['project_key', 'project_name', 'project_status', 'project_tags', 'has_active_scenarios',
               'project_owner', 'project_created_by', 'project_created_on', 'project_last_modified_on']


    def format_timestamp(epoch_ms):
        # DSS stores timestamps as epoch milliseconds; format them in UTC
        return datetime.datetime.fromtimestamp(int(epoch_ms) / 1000, tz=datetime.timezone.utc).strftime(DATE_FORMAT)


    client = dataiku.api_client()
    records = []

    for project_key in client.list_project_keys():
        project = client.get_project(project_key)
        project_status = project.get_settings().get_raw().get('projectStatus')
        project_summary = project.get_summary()

        # Very old projects may lack creationTag, so fall back to empty values
        creation_tag = project_summary.get('creationTag')
        if creation_tag:
            project_created_by = creation_tag['lastModifiedBy']['login']
            project_created_on = format_timestamp(creation_tag['lastModifiedOn'])
        else:
            project_created_by = ''
            project_created_on = ''

        # Flag the project if any active scenario has at least one active trigger
        has_active_scenarios = ''
        for scenario in project.list_scenarios():
            scn_settings_raw = project.get_scenario(scenario['id']).get_settings().get_raw()
            if scn_settings_raw['active'] and any(t['active'] for t in scn_settings_raw.get('triggers') or []):
                has_active_scenarios = 'Yes'
                break

        records.append({
            'project_key': project_key,
            'project_name': project_summary['name'],
            'project_status': project_status,
            'project_tags': list(project.get_tags()['tags'].keys()),
            'has_active_scenarios': has_active_scenarios,
            'project_owner': project_summary['ownerLogin'],
            'project_created_by': project_created_by,
            'project_created_on': project_created_on,
            'project_last_modified_on': format_timestamp(project_summary['versionTag']['lastModifiedOn']),
        })

    # Build the DataFrame once instead of concatenating inside the loop
    df_project_data = pd.DataFrame(records, columns=COLUMNS)
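
Because project_tags holds a list per project, auditing by tag is easier once the frame is flattened to one row per project and tag. The following is only a sketch of that idea in plain pandas; the sample rows, the status values, and the names df_tags, projects_per_tag and untagged are hypothetical stand-ins, not output from a real instance.

```python
import pandas as pd

# Hypothetical sample standing in for the df_project_data built by the script above
df_project_data = pd.DataFrame({
    'project_key': ['SALES', 'CHURN', 'SANDBOX1'],
    'project_status': ['In production', 'Draft', 'Draft'],
    'project_tags': [['finance', 'prod'], ['prod'], []],
})

# One row per (project, tag); projects without tags explode to NaN and are dropped
df_tags = (
    df_project_data[['project_key', 'project_status', 'project_tags']]
    .explode('project_tags')
    .dropna(subset=['project_tags'])
    .rename(columns={'project_tags': 'tag'})
    .reset_index(drop=True)
)

# Number of distinct projects carrying each tag
projects_per_tag = df_tags.groupby('tag')['project_key'].nunique()

# Projects with no tags at all, which an audit may want to flag
untagged = df_project_data.loc[
    df_project_data['project_tags'].str.len() == 0, 'project_key'
].tolist()
```

The flattened frame also tends to be easier to filter after export than a column of Python lists.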

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,166 Neuron

    Project tags can be searched using the Search DSS box at the top right. Search for a tag, then click on the DSS Items tag, and you can then search projects by tag (see below). I am not aware of any way to search by project status. In any case, both project tags and status are available in the Python API, so it's fairly easy to build a dataset with all of them and then use Explore, dashboards, or the Excel export to do any filtering you want.

    Screenshot 2024-01-22 at 19.06.59.png
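
As a rough illustration of the "build a dataset, then filter and export" idea, the sketch below filters a project metadata frame by tag and status and renders the result as CSV text. It assumes a frame with project_key, project_status and project_tags columns; the sample rows, the status values and the helper filter_projects are hypothetical.

```python
import pandas as pd

# Hypothetical project metadata; a real frame would come from the Python API
df_project_data = pd.DataFrame({
    'project_key': ['SALES', 'CHURN', 'SANDBOX1'],
    'project_status': ['In production', 'Draft', 'Draft'],
    'project_tags': [['finance', 'prod'], ['prod'], []],
})


def filter_projects(df, tag=None, status=None):
    """Return rows matching the given tag and/or status (None means no filter)."""
    mask = pd.Series(True, index=df.index)
    if tag is not None:
        mask &= df['project_tags'].apply(lambda tags: tag in tags)
    if status is not None:
        mask &= df['project_status'] == status
    return df[mask]


prod_drafts = filter_projects(df_project_data, tag='prod', status='Draft')

# Join each tag list into one string so the column stays readable in a spreadsheet
export = prod_drafts.assign(project_tags=prod_drafts['project_tags'].str.join(', '))
csv_text = export.to_csv(index=False)
```

Passing a file path to to_csv instead of leaving it empty writes the same content to disk, which gives a simple artifact to share or archive.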

  • gskoff
    gskoff Partner, Registered Posts: 8 Partner

    Thanks for the reply. Are there any example scripts available to pull the metadata for the projects on an instance using the Python API? Project name, tags, status, project creator, create date, last update date?

  • gskoff
    gskoff Partner, Registered Posts: 8 Partner

    THANK YOU! This worked almost flawlessly for me. The only minor issue I ran into was that the project_summary['creationTag'] didn't exist for at least one project on our instance. I got around that with a try/except and gave it a default value on the exception.
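
For reference, the try/except workaround could look something like this minimal sketch. The key names follow the script in the accepted answer; the helper creation_info and the two sample summaries are hypothetical, and plain dicts stand in for what project.get_summary() returns.

```python
def creation_info(project_summary, default=''):
    """Return (created_by, created_on_ms), or defaults when creationTag is missing."""
    try:
        tag = project_summary['creationTag']
        return tag['lastModifiedBy']['login'], tag['lastModifiedOn']
    except (KeyError, TypeError):
        # Missing or malformed creationTag: fall back to the default value
        return default, default


# Hypothetical summaries: one recent project and one without a creationTag
recent = {
    'name': 'Sales',
    'creationTag': {'lastModifiedBy': {'login': 'alice'}, 'lastModifiedOn': 1705950000000},
}
legacy = {'name': 'Old project'}

recent_info = creation_info(recent)
legacy_info = creation_info(legacy)
```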

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,166 Neuron

    You are welcome. I fixed the creationTag issue with an if check. It happens for very old projects that don't have a creationTag; it looks like Dataiku did not create the tag when those projects were migrated.
