New Tag for New Plugins

User Story:

As an Administrator of several Dataiku instances that make extensive use of the public plugins, I would like to know which plugins are "New" and which ones have been around for a while. This would make my periodic review of available plugins easier and more efficient now that there are nearly 150 plugins.

COS:

  • Some sort of tag on new-ish plugins showing that they are new-ish
  • The "New" tag is cleared on a published periodic basis. (Maybe once a quarter.)
  • This is different from the upgrade notices on plugins. This new tag is designed for folks to find fresh new plugins that may help their business use cases.

Notes:

This could simply be a new plugin tag, "New", that gets reviewed by the Dataiku team every time a version update is done.

--Tom
8 Comments

What version of Dataiku are you on, Tom? I will be able to share something around this soon but wanted to be sure you could use it too.

I’m using version 12.2.2

 

--Tom

Hi Tom, when you posted this idea I immediately knew I wanted something like this, as indeed it is not easy to keep track of what plugins are available, when they get updated, and when new ones come out. Given that Product Ideas have a low probability of making it into the product (only 30 of ~450 reported here have been delivered) and I wanted this solved, I started to look for possible solutions. I knew DSS fetched the list of plugins somewhere, so while I could have asked Dataiku Support to see if they would tell me, I armed myself with the excellent Proxyman, intercepted the SSL traffic, and caught the URL that DSS uses to fetch the plugins:

https://update.dataiku.com/dss/11/plugins/list.json

The Plugins URL is version specific, so for v12 it would use 12 in the URL. It doesn't work on v10 or below, so I suspect this is a new URL used by v11 and above only. The URL returned a nice JSON payload from which, with a little bit of Python, I produced two datasets: plugins and plugin_releases.


[Screenshots: the plugin_releases and plugins datasets]

 

I was wondering how to share this project with others, but I just realised it will be much easier for me to share the Python recipe, which will be here forever and will not depend on any file sharing tools, so here it goes:

 

 

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
import dataiku
from dataiku import pandasutils as pdu
import os
import pandas as pd
import requests, json
from sys import platform

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
# Specify your CA bundle if your OS supports it
if platform == "linux" or platform == "linux2":
    # linux  
    os.environ["REQUESTS_CA_BUNDLE"] = '/etc/ssl/certs/ca-bundle.crt'
elif platform == "darwin":
    # OS X
    print("Use pip to install certifi")
elif platform == "win32":
    # Windows
    print("Use pip to install certifi")

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
client = dataiku.api_client()
dataiku_version = client.get_instance_info().raw['dssVersion'].split(".")[0]

if int(dataiku_version) < 11:
    raise Exception('This only works for Dataiku v11 and above')

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
dataiku_plugins_url = f'https://update.dataiku.com/dss/{dataiku_version}/plugins/list.json'
headers = {'Content-Type': 'application/json'}

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
status_code = 0
status_reason = ''
status_text = ''
error_text = ''

try:
    response = requests.get(dataiku_plugins_url, headers=headers, verify=True, timeout=(1, 3))

    status_code = response.status_code
    status_reason = response.reason
    status_text = str(status_code) + ' - ' + str(status_reason)

    # Raise an exception if the response status code is not successful
    response.raise_for_status()

except requests.exceptions.HTTPError:
    error_text = "HTTP Error: " + status_text
except requests.exceptions.Timeout:
    error_text = "The request timed out"
except requests.exceptions.ConnectionError:
    error_text = "Connection Error"
except requests.exceptions.RequestException:
    error_text = "Unknown error occurred: " + str(status_code)

# Stop here on failure, otherwise the cells below fail with a NameError
if status_code != 200:
    raise Exception("Failed to fetch the plugin list: " + error_text)

# Parse the response as JSON
json_object = response.json()

# Debug: Print the whole JSON object
# print(json.dumps(json_object, indent=4))

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
df_plugins = pd.DataFrame(columns=['ID', 'Label', 'Description', 'Author', 'Icon', 'Size', 'Store_Version', 'URL', 'Download_URL', 'Support_Level', 'License_Info', 'Downloadable', 'tutorials', 'sampleProjects', 'javaPreparationProcessors', 'javaFormulaFunctions', 'customDatasets', 
                                   'customCodeRecipes', 'customPythonProbes', 'customPythonChecks', 'customSQLProbes', 'customFormats', 'customExporters', 'customPythonSteps', 'customPythonTriggers', 'customRunnables', 'customWebApps', 'customFSProviders', 'customDialects', 
                                   'customJythonProcessors', 'customPythonClusters', 'customParameterSets', 'customFields', 'customJavaPolicyHooks', 'customWebAppExpositions', 'customPythonPredictionAlgos', 'customStandardWebAppTemplates', 'customBokehWebAppTemplates', 
                                   'customShinyWebAppTemplates', 'customRMarkdownReportTemplates', 'customPreBuiltNotebookTemplates', 'customPythonNotebookTemplates', 'customRNotebookTemplates', 'customScalaNotebookTemplates', 'customPreBuiltDatasetNotebookTemplates', 
                                   'customPythonDatasetNotebookTemplates', 'customRDatasetNotebookTemplates', 'customScalaDatasetNotebookTemplates'])

df_plugin_releases = pd.DataFrame(columns=['ID', 'Label', 'Version', 'Release_Date_Time', 'Release_Notes'])

for item in json_object['items']:

    for release in item['revisions']:
        plugin_release_record = pd.DataFrame.from_dict({'ID': [item['id']], 'Label': [item['meta'].get('label', '')], 'Version': [release.get('version', '')], 'Release_Date_Time': [release.get('releaseTime', '')], 'Release_Notes': [release.get('releaseNotes', '')]})
        df_plugin_releases = pd.concat([df_plugin_releases, plugin_release_record], ignore_index=True, sort=False)
    
    plugin_record = pd.DataFrame.from_dict({'ID': [item['id']], 'Label': [item['meta'].get('label', '')], 'Description': [item['meta'].get('description', '')], 'Author': [item['meta'].get('author', '')], 'Icon': [item['meta'].get('icon', '')], 'Size': [item['size']],
                                         'Store_Version': [item['storeVersion']], 'URL': [item['meta'].get('url', '')], 'Download_URL': [item['downloadURL']], 'Support_Level': [item['meta'].get('supportLevel', '')], 'License_Info': [item['meta'].get('licenseInfo', '')],
                                         'Downloadable': [item['storeFlags'].get('downloadable', '')], 'tutorials': [len(item['content']['tutorials'])], 'sampleProjects': [len(item['content']['sampleProjects'])], 'javaPreparationProcessors': [len(item['content']['javaPreparationProcessors'])],
                                         'javaFormulaFunctions': [len(item['content']['javaFormulaFunctions'])], 'customDatasets': [len(item['content']['customDatasets'])], 'customCodeRecipes': [len(item['content']['customCodeRecipes'])],
                                         'customPythonProbes': [len(item['content']['customPythonProbes'])], 'customPythonChecks': [len(item['content']['customPythonChecks'])], 'customSQLProbes': [len(item['content']['customSQLProbes'])], 'customFormats': [len(item['content']['customFormats'])],
                                         'customExporters': [len(item['content']['customExporters'])], 'customPythonSteps': [len(item['content']['customPythonSteps'])], 'customPythonTriggers': [len(item['content']['customPythonTriggers'])], 'customRunnables': [len(item['content']['customRunnables'])],
                                         'customWebApps': [len(item['content']['customWebApps'])], 'customFSProviders': [len(item['content']['customFSProviders'])], 'customDialects': [len(item['content']['customDialects'])], 'customJythonProcessors': [len(item['content']['customJythonProcessors'])],
                                         'customPythonClusters': [len(item['content']['customPythonClusters'])], 'customParameterSets': [len(item['content']['customParameterSets'])], 'customFields': [len(item['content']['customFields'])],
                                         'customJavaPolicyHooks': [len(item['content']['customJavaPolicyHooks'])], 'customWebAppExpositions': [len(item['content']['customWebAppExpositions'])], 'customPythonPredictionAlgos': [len(item['content']['customPythonPredictionAlgos'])],
                                         'customStandardWebAppTemplates': [len(item['content']['customStandardWebAppTemplates'])], 'customBokehWebAppTemplates': [len(item['content']['customBokehWebAppTemplates'])], 'customShinyWebAppTemplates': [len(item['content']['customShinyWebAppTemplates'])],
                                         'customRMarkdownReportTemplates': [len(item['content']['customRMarkdownReportTemplates'])], 'customPreBuiltNotebookTemplates': [len(item['content']['customPreBuiltNotebookTemplates'])],
                                         'customPythonNotebookTemplates': [len(item['content']['customPythonNotebookTemplates'])], 'customRNotebookTemplates': [len(item['content']['customRNotebookTemplates'])], 'customScalaNotebookTemplates': [len(item['content']['customScalaNotebookTemplates'])],
                                         'customPreBuiltDatasetNotebookTemplates': [len(item['content']['customPreBuiltDatasetNotebookTemplates'])], 'customPythonDatasetNotebookTemplates': [len(item['content']['customPythonDatasetNotebookTemplates'])],
                                         'customRDatasetNotebookTemplates': [len(item['content']['customRDatasetNotebookTemplates'])], 'customScalaDatasetNotebookTemplates': [len(item['content']['customScalaDatasetNotebookTemplates'])]})

    df_plugins = pd.concat([df_plugins, plugin_record], ignore_index=True, sort=False)

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
df_plugin_releases['Release_Date_Time'] = pd.to_datetime(df_plugin_releases['Release_Date_Time'],unit='ms')

# -------------------------------------------------------------------------------- NOTEBOOK-CELL: CODE
# Recipe outputs
plugins = dataiku.Dataset("plugins")
plugins.write_with_schema(df_plugins)
plugin_releases = dataiku.Dataset("plugin_releases")
plugin_releases.write_with_schema(df_plugin_releases)

 

 

So to add this to a project, add a Python recipe, set two outputs named plugins and plugin_releases, and click Create Recipe. Run it and you will have the two new datasets populated. Now you have an easy way to explore Dataiku plugins and see when they get changed/released. Obviously the Plugins URL has not been formally published by Dataiku, but considering every DSS v11 and v12 instance is using this URL I would think it's pretty safe to use, even if unsupported. Also, if this project breaks it is not the end of the world; we are not trying to predict anything here, it's an information tool.
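If you specifically want the "New" flag from the original idea rather than the full release history, one possible follow-on cell could derive it from the plugin_releases dataset. This is just a sketch: the 90-day window is arbitrary, and the tiny inline DataFrame stands in for the real df_plugin_releases built above.

```python
import pandas as pd

# Stand-in for the real df_plugin_releases built by the recipe above
df = pd.DataFrame({
    'ID': ['plugin-a', 'plugin-a', 'plugin-b'],
    'Version': ['1.0.0', '1.1.0', '0.9.0'],
    'Release_Date_Time': pd.to_datetime(['2023-01-10', '2023-11-01', '2023-11-20']),
})

# A plugin counts as "New" if its *first* release falls inside the window;
# a recent release of an old plugin is an update, not a new plugin
now = pd.Timestamp('2023-12-01')
cutoff = now - pd.Timedelta(days=90)
first_release = df.groupby('ID')['Release_Date_Time'].min()
new_plugins = sorted(first_release[first_release >= cutoff].index)
print(new_plugins)  # ['plugin-b']
```

In a real scenario you would replace `now` with `pd.Timestamp.now()` and feed in the actual dataset.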

In our case I think I am going to build a scenario to check for new plugin releases daily or weekly, and then post a notification on a Teams channel so our users and I get notified when new plugin versions get released.
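For that scenario, the "new since last run" check could be as simple as diffing the current (ID, version) pairs against a snapshot kept from the previous run. A sketch below; the known_releases.json state file is hypothetical (in DSS you would more likely persist this in a dataset or project variable), and the Teams posting itself is omitted.

```python
import json
import os

STATE_FILE = 'known_releases.json'  # hypothetical state file from the previous run

def new_releases(current):
    """Return (plugin_id, version) pairs not seen on any previous run."""
    known = set()
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            known = {tuple(pair) for pair in json.load(f)}
    new = sorted(set(current) - known)
    # Persist the union so the next run only reports genuine additions
    with open(STATE_FILE, 'w') as f:
        json.dump(sorted(set(current) | known), f)
    return new

# First run: everything is new; second run: only the added version is reported
print(new_releases([('plugin-a', '1.0.0')]))
print(new_releases([('plugin-a', '1.0.0'), ('plugin-a', '1.1.0')]))
```

Each returned pair is a candidate for the notification message.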

Hope it helps!

 

 


@Turribeach 

Thanks for sharing. I’ll try to reproduce this over the coming few days.  

--Tom

One of the things about this data is that the dates for a single plugin are sometimes exactly the same across all of the versions of the plugin.

It appears that prior to 2/21/2023 all of the version dates are the same.

 

--Tom

On Mac OS I did not seem to need to run:

Use pip to install certifi

 

--Tom

One of the things about this data is that all of the dates for a single plugin are sometimes exactly the same for all of the version of the plugin.

It appears that prior to 2/21/2023 all of the versions are the same.

>> My educated guess is that this data wasn't being collected previously and that the data structure was defined and created this year with v11. For anything historic they didn't bother to correct the dates, but going forward it looks like the releases have the correct dates.

On Mac OS I did not seem to need to run:

Use pip to install certifi

>> MacOS does not have an OS certificate store that Python can use. Mac apps use the Keychain, but I believe this is not available to Python. There are a number of Python packages that provide a root certificate store bundle for Python packages to use. I guess you already have it installed, or have one of the other packages as part of another package's requirements; check with "pip list".
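As a side note, when certifi is installed you can point requests (and anything else that honours REQUESTS_CA_BUNDLE) at its bundle explicitly, which would avoid the per-OS branching in the recipe above. A minimal sketch, assuming certifi is already present in the code environment:

```python
import os
import certifi  # ships Mozilla's root CA bundle as a data file

# requests uses certifi's bundle by default when it is installed; exporting
# the variable also covers subprocesses and libraries that read it
os.environ['REQUESTS_CA_BUNDLE'] = certifi.where()
print(certifi.where())  # e.g. .../site-packages/certifi/cacert.pem
```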


@Turribeach,

Thanks for the wonderful workaround. I've already found a plug-in that I did not know about that may be useful to me.

That all said, it would be nice if the Dataiku team would consider upgrades to the UI that make new plugins more visible to users, particularly plugins that we don't already have installed.

 

--Tom
