New Tag for New Plugins

tgb417
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron

User Story:

As an Administrator of several Dataiku instances that make extensive use of the public plugins, I would like to know which plugins are "New" and which ones have been around for a while. This would make my periodic review of available plugins easier and more efficient now that there are nearly 150 plugins.

COS:

  • Some sort of tag on New-ish Plugins showing that they are New-ish
  • The New Tag is cleared on a published periodic basis. (Maybe once a quarter.)
  • This is different from the Upgrade Notices on Plugins. This new tag is designed for folks to find fresh new plugins that may help their business use cases.

Notes:

This could simply be a new plugin tag, "New", that the Dataiku team reviews every time a version update is done.


Comments

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,978 Neuron

    What version of Dataiku are you on, Tom? I will be able to share something around this soon, but I wanted to be sure you could use it too.

  • tgb417

    I’m using version 12.2.2

  • Turribeach
    edited July 17

    Hi Tom, when you posted this idea I immediately knew I wanted something like this, as it is indeed not easy to keep track of which plugins are available, when they get updated, and when new ones come out. Product Ideas have a low probability of making it into the product (only 30 of ~450 reported here have been delivered), and I wanted this solved, so I started looking for possible solutions. I knew DSS fetched the list of plugins from somewhere. I could have asked Dataiku Support whether they would tell me, but instead I armed myself with the excellent Proxyman, intercepted the SSL traffic, and caught the URL that DSS uses to fetch the plugins:

    https://update.dataiku.com/dss/11/plugins/list.json

    The Plugins URL is version specific, so for v12 it would use 12 in the URL. It doesn't work on v10 or below, so I suspect this is a new URL used by v11 and above only. The URL returned a nice JSON document, from which a little bit of Python produced two datasets: plugins and plugin_releases.


    [Screenshots: plugin_releases.PNG, plugins.PNG]

    I was wondering how to share this project with others, but I just realised it will be much easier to share the Python recipe, which will be here forever and will not depend on any file sharing tools, so here it goes:

    # -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
    from IPython.display import display, HTML
    display(HTML("<style>.container { width:100% !important; }</style>"))
    
    # -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
    import dataiku
    from dataiku import pandasutils as pdu
    import pandas as pd
    import os
    import requests, json
    from sys import platform
    
    # -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
    # Specify your CA bundle if your OS supports it
    if platform == "linux" or platform == "linux2":
        # linux  
        os.environ["REQUESTS_CA_BUNDLE"] = '/etc/ssl/certs/ca-bundle.crt'
    elif platform == "darwin":
        # OS X
        print("Use pip to install certifi")
    elif platform == "win32":
        # Windows
        print("Use pip to install certifi")
    
    # -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
    client = dataiku.api_client()
    dataiku_version = client.get_instance_info().raw['dssVersion'].split(".")[0]
    
    if int(dataiku_version) < 11:
        raise Exception('This only works for Dataiku v11 and above')
    
    # -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
    dataiku_plugins_url = f'https://update.dataiku.com/dss/{dataiku_version}/plugins/list.json'
    headers = {'Content-Type': 'application/json'}
    
    # -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
    status_code = 0
    status_reason = ''
    status_text = ''
    error_text = ''

    try:
        response = requests.get(dataiku_plugins_url, headers=headers, verify=True, timeout=(1, 3))

        status_code = response.status_code
        status_reason = response.reason
        status_text = str(status_code) + ' - ' + str(status_reason)

        # Raise an exception if the response status code is not successful
        response.raise_for_status()

    except requests.exceptions.BaseHTTPError:
        error_text = "Base HTTP Error: " + status_text
    except requests.exceptions.HTTPError:
        error_text = "HTTP Error: " + status_text
    except requests.exceptions.Timeout:
        error_text = "The request timed out"
    except requests.exceptions.ConnectionError:
        error_text = "Connection Error"
    except requests.exceptions.RequestException:
        error_text = "Unknown error occurred: " + str(status_code)

    # If the request was successful
    if status_code == 200:
        # Parse the response as JSON
        json_object = response.json()

        # Debug: Print the whole JSON object
        # print(json.dumps(json_object, indent=4))
    else:
        # Stop here: the cells below need json_object, which only exists after a successful request
        raise Exception('Could not fetch the plugin list: ' + (error_text or status_text))
    
    # -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
    df_plugins = pd.DataFrame(columns=['ID', 'Label', 'Description', 'Author', 'Icon', 'Size', 'Store_Version', 'URL', 'Download_URL', 'Support_Level', 'License_Info', 'Downloadable', 'tutorials', 'sampleProjects', 'javaPreparationProcessors', 'javaFormulaFunctions', 'customDatasets', 
                                       'customCodeRecipes', 'customPythonProbes', 'customPythonChecks', 'customSQLProbes', 'customFormats', 'customExporters', 'customPythonSteps', 'customPythonTriggers', 'customRunnables', 'customWebApps', 'customFSProviders', 'customDialects', 
                                       'customJythonProcessors', 'customPythonClusters', 'customParameterSets', 'customFields', 'customJavaPolicyHooks', 'customWebAppExpositions', 'customPythonPredictionAlgos', 'customStandardWebAppTemplates', 'customBokehWebAppTemplates', 
                                       'customShinyWebAppTemplates', 'customRMarkdownReportTemplates', 'customPreBuiltNotebookTemplates', 'customPythonNotebookTemplates', 'customRNotebookTemplates', 'customScalaNotebookTemplates', 'customPreBuiltDatasetNotebookTemplates', 
                                       'customPythonDatasetNotebookTemplates', 'customRDatasetNotebookTemplates', 'customScalaDatasetNotebookTemplates'])
    
    df_plugin_releases = pd.DataFrame(columns=['ID', 'Label', 'Version', 'Release_Date_Time', 'Release_Notes'])
    
    for item in json_object['items']:
    
        for release in item['revisions']:
            plugin_release_record = pd.DataFrame.from_dict({'ID': [item['id']], 'Label': [item['meta'].get('label', '')], 'Version': [release.get('version', '')], 'Release_Date_Time': [release.get('releaseTime', '')], 'Release_Notes': [release.get('releaseNotes', '')]})
            df_plugin_releases = pd.concat([df_plugin_releases, plugin_release_record], ignore_index=True, sort=False)
        
        plugin_record = pd.DataFrame.from_dict({'ID': [item['id']], 'Label': [item['meta'].get('label', '')], 'Description': [item['meta'].get('description', '')], 'Author': [item['meta'].get('author', '')], 'Icon': [item['meta'].get('icon', '')], 'Size': [item['size']],
                                             'Store_Version': [item['storeVersion']], 'URL': [item['meta'].get('url', '')], 'Download_URL': [item['downloadURL']], 'Support_Level': [item['meta'].get('supportLevel', '')], 'License_Info': [item['meta'].get('licenseInfo', '')],
                                             'Downloadable': [item['storeFlags'].get('downloadable', '')], 'tutorials': [len(item['content']['tutorials'])], 'sampleProjects': [len(item['content']['sampleProjects'])], 'javaPreparationProcessors': [len(item['content']['javaPreparationProcessors'])],
                                             'javaFormulaFunctions': [len(item['content']['javaFormulaFunctions'])], 'customDatasets': [len(item['content']['customDatasets'])], 'customCodeRecipes': [len(item['content']['customCodeRecipes'])],
                                             'customPythonProbes': [len(item['content']['customPythonProbes'])], 'customPythonChecks': [len(item['content']['customPythonChecks'])], 'customSQLProbes': [len(item['content']['customSQLProbes'])], 'customFormats': [len(item['content']['customFormats'])],
                                             'customExporters': [len(item['content']['customExporters'])], 'customPythonSteps': [len(item['content']['customPythonSteps'])], 'customPythonTriggers': [len(item['content']['customPythonTriggers'])], 'customRunnables': [len(item['content']['customRunnables'])],
                                             'customWebApps': [len(item['content']['customWebApps'])], 'customFSProviders': [len(item['content']['customFSProviders'])], 'customDialects': [len(item['content']['customDialects'])], 'customJythonProcessors': [len(item['content']['customJythonProcessors'])],
                                             'customPythonClusters': [len(item['content']['customPythonClusters'])], 'customParameterSets': [len(item['content']['customParameterSets'])], 'customFields': [len(item['content']['customFields'])],
                                             'customJavaPolicyHooks': [len(item['content']['customJavaPolicyHooks'])], 'customWebAppExpositions': [len(item['content']['customWebAppExpositions'])], 'customPythonPredictionAlgos': [len(item['content']['customPythonPredictionAlgos'])],
                                             'customStandardWebAppTemplates': [len(item['content']['customStandardWebAppTemplates'])], 'customBokehWebAppTemplates': [len(item['content']['customBokehWebAppTemplates'])], 'customShinyWebAppTemplates': [len(item['content']['customShinyWebAppTemplates'])],
                                             'customRMarkdownReportTemplates': [len(item['content']['customRMarkdownReportTemplates'])], 'customPreBuiltNotebookTemplates': [len(item['content']['customPreBuiltNotebookTemplates'])],
                                             'customPythonNotebookTemplates': [len(item['content']['customPythonNotebookTemplates'])], 'customRNotebookTemplates': [len(item['content']['customRNotebookTemplates'])], 'customScalaNotebookTemplates': [len(item['content']['customScalaNotebookTemplates'])],
                                             'customPreBuiltDatasetNotebookTemplates': [len(item['content']['customPreBuiltDatasetNotebookTemplates'])], 'customPythonDatasetNotebookTemplates': [len(item['content']['customPythonDatasetNotebookTemplates'])],
                                             'customRDatasetNotebookTemplates': [len(item['content']['customRDatasetNotebookTemplates'])], 'customScalaDatasetNotebookTemplates': [len(item['content']['customScalaDatasetNotebookTemplates'])]})
    
        df_plugins = pd.concat([df_plugins, plugin_record], ignore_index=True, sort=False)
    
    # -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
    df_plugin_releases['Release_Date_Time'] = pd.to_datetime(df_plugin_releases['Release_Date_Time'],unit='ms')
    
    # -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
    
    
    # -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
    # Recipe outputs
    plugins = dataiku.Dataset("plugins")
    plugins.write_with_schema(df_plugins)
    plugin_releases = dataiku.Dataset("plugin_releases")
    plugin_releases.write_with_schema(df_plugin_releases)

    So to add this to a project, add a Python recipe, set two outputs named plugins and plugin_releases, and click Create Recipe. Run it and you will have the two new datasets populated. Now you have an easy way to explore Dataiku plugins and see when they get changed or released. Obviously the Plugins URL has not been formally published by Dataiku, but considering every DSS v11 and v12 is using this URL I would think it's pretty safe to use, even if unsupported. Also, if this project breaks it is not the end of the world: we are not trying to predict anything here, it's an information tool.
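
    For anyone adapting the recipe, here is a minimal sketch of a more compact way to build the same two frames: collect plain dicts in lists and create each DataFrame once, rather than calling pd.concat per row. It assumes the JSON layout used by the recipe above (items, meta, content, revisions). flatten_plugin_list is a hypothetical helper name, and only a few of the metadata columns are shown.

```python
import pandas as pd


def flatten_plugin_list(json_object):
    """Build plugins and plugin_releases frames from the parsed list.json.

    Hypothetical helper: the keys used here are the ones the recipe above reads.
    """
    plugin_rows, release_rows = [], []
    for item in json_object.get('items', []):
        meta = item.get('meta', {})
        row = {
            'ID': item.get('id', ''),
            'Label': meta.get('label', ''),
            'Description': meta.get('description', ''),
            'Author': meta.get('author', ''),
            'Support_Level': meta.get('supportLevel', ''),
        }
        # One count column per component type found under 'content'
        # (assumes each entry is a list, as the recipe's len() calls do)
        for component, entries in item.get('content', {}).items():
            row[component] = len(entries)
        plugin_rows.append(row)

        for release in item.get('revisions', []):
            release_rows.append({
                'ID': item.get('id', ''),
                'Label': meta.get('label', ''),
                'Version': release.get('version', ''),
                'Release_Date_Time': release.get('releaseTime', ''),
                'Release_Notes': release.get('releaseNotes', ''),
            })

    df_plugins = pd.DataFrame(plugin_rows)
    df_releases = pd.DataFrame(
        release_rows,
        columns=['ID', 'Label', 'Version', 'Release_Date_Time', 'Release_Notes'],
    )
    return df_plugins, df_releases
```

    Component types missing from a plugin would come out as NaN rather than 0 here, so a fillna(0) on the count columns may be needed before writing the dataset.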

    In our case I think I am going to build a scenario to check for new plugin releases daily or weekly, and then post a notification on a Teams channel so our users and I get notified when new plugin versions are released.
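
    The scenario idea above could start from something like the following sketch, which compares the current plugin_releases data against a previously saved copy and keeps only the (ID, Version) pairs that were not seen before. How the previous snapshot is stored and how the Teams message is sent are left out. find_new_releases is a hypothetical helper, not part of any Dataiku API.

```python
import pandas as pd


def find_new_releases(current, previous):
    """Return rows of `current` whose (ID, Version) pair is absent from `previous`.

    Both arguments are DataFrames shaped like plugin_releases (assumption).
    """
    keys = ['ID', 'Version']
    seen = previous[keys].drop_duplicates()
    # A left merge with indicator=True marks rows that exist only in `current`
    merged = current.merge(seen, on=keys, how='left', indicator=True)
    new_rows = merged[merged['_merge'] == 'left_only']
    return new_rows.drop(columns='_merge').reset_index(drop=True)
```

    An empty result means nothing changed since the last run, so the notification step can simply be skipped.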

    Hope it helps!

  • tgb417

    @Turribeach

    Thanks for sharing. I’ll try to reproduce this over the coming few days.

  • tgb417

    One thing about this data is that the release dates for a single plugin are sometimes exactly the same for all versions of the plugin.

    It appears that for releases prior to 2/21/2023 all of the version dates are the same.

  • tgb417

    On Mac OS I did not seem to need to run:

    Use pip to install certifi

  • Turribeach

    One thing about this data is that the release dates for a single plugin are sometimes exactly the same for all versions of the plugin.

    It appears that for releases prior to 2/21/2023 all of the version dates are the same.

    >> My educated guess is that this data wasn't being collected previously and that the data structure was defined and created this year with v11. So they didn't bother to correct the dates of anything historic, but going forward it looks like the releases have the correct dates.
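
    A quick way to see which plugins are affected, as a sketch: group plugin_releases by ID and flag plugins that have more than one release but only one distinct date. It assumes Release_Date_Time has already been converted with pd.to_datetime as in the recipe. plugins_with_identical_dates is a hypothetical helper name.

```python
import pandas as pd


def plugins_with_identical_dates(df_releases):
    """Return IDs of plugins whose releases (more than one) all share one date."""
    stats = df_releases.groupby('ID')['Release_Date_Time'].agg(['count', 'nunique'])
    # Several releases but a single distinct timestamp suggests the dates
    # do not reflect the real release history
    suspect = stats[(stats['count'] > 1) & (stats['nunique'] == 1)]
    return sorted(suspect.index)
```

    Plugins with a single release cannot be judged this way, so they are deliberately left out of the result.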

    On Mac OS I did not seem to need to run:

    Use pip to install certifi

    >> MacOS does not have an OS certificate store that Python can use. Mac apps use the Keychain, but I believe this is not available to Python. There are a number of Python packages that provide a root certificate store bundle for Python packages to use. I guess you already have certifi installed, or have one of the other packages as part of another package requirement; check with "pip list".
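
    As far as I know, requests itself lists certifi as a dependency, so it is often already present wherever requests is installed, which may explain the Mac OS observation. A minimal sketch of picking a CA bundle without hard-coding the platform follows. resolve_ca_bundle is a hypothetical helper, and the Linux path is simply the one used in the recipe above.

```python
import os


def resolve_ca_bundle(linux_bundle='/etc/ssl/certs/ca-bundle.crt'):
    """Return a value suitable for the `verify` argument of requests.get.

    Hypothetical helper: prefers the OS bundle file if it exists,
    then certifi's bundle, then requests' default verification.
    """
    if os.path.isfile(linux_bundle):
        return linux_bundle
    try:
        import certifi
        return certifi.where()
    except ImportError:
        # verify=True lets requests use its own default behaviour
        return True
```

    The return value can be passed straight through, for example requests.get(url, verify=resolve_ca_bundle()).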

  • tgb417

    @Turribeach,

    Thanks for the wonderful workaround. I've already found a plugin that I did not know about that may be useful to me.

    That said, it would be nice if the Dataiku team would consider upgrades to the UI that make new plugin goodness clearer to users, particularly for plugins that we don't already have installed.
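
    In the meantime, the "New-ish" tag from the original idea can be roughly approximated from the plugin_releases data. This is only a sketch: it treats a plugin as new when its earliest release date falls within the last N days, and it assumes the caller supplies the IDs of installed plugins. newish_plugins is a hypothetical helper, and the identical release dates noted earlier in this thread make the result unreliable for older plugins.

```python
import pandas as pd


def newish_plugins(df_releases, as_of, days=90, installed_ids=()):
    """Return IDs of plugins first released within `days` of `as_of`.

    Assumes Release_Date_Time is a datetime column; installed_ids is an
    optional collection of plugin IDs to exclude (supplied by the caller).
    """
    first_release = df_releases.groupby('ID')['Release_Date_Time'].min()
    cutoff = pd.Timestamp(as_of) - pd.Timedelta(days=days)
    recent = first_release[first_release >= cutoff]
    return sorted(set(recent.index) - set(installed_ids))
```

    A window of 90 days roughly matches the "once a quarter" clearing suggested in the COS.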
