Extract Dataset Names under a TAG

sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron

Hi Team,

My workflow is very large and I want to extract the names of the datasets under a tag. Is there any way to get the list of datasets under a tag?

Thanks in advance.

Best Answer

  • dima_naboka
    dima_naboka Dataiker, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts Posts: 28 Dataiker
    Answer ✓

    Yes, you have several options. For example, collect the output into a pandas DataFrame and write it out with pandas.DataFrame.to_excel():

    import dataiku
    import pandas as pd

    client = dataiku.api_client()
    project = client.get_project(dataiku.get_custom_variables()["projectKey"])
    datasets = project.list_datasets()

    # Collect the name and tags of every dataset that has at least one tag
    result_dict = {'dataset': [], 'tags': []}
    for index in range(len(datasets)):
        if datasets[index]['tags']:
            result_dict['dataset'].append(datasets[index]['name'])
            result_dict['tags'].append(datasets[index]['tags'])

    df = pd.DataFrame(data=result_dict)
    df.to_excel('output1.xlsx')

    This will save the XLSX file into the DATA_DIR/jupyter-run/dku-workdirs/MY_PROJECT/recipe_name/ folder.

    P.S. If you are running an older version of DSS, or the code env used to run the notebook uses legacy pandas==0.23, you will need to install xlsxwriter into the corresponding code env and add import xlsxwriter.
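
In the export above, the tags column holds a whole list per dataset, which lands in a single spreadsheet cell. If one row per dataset/tag pair is easier to filter, the reshaping could look like the sketch below. It is only an illustration: `dataset_tag_rows`, `rows_to_csv` and the sample entries are hypothetical names, the sample list stands in for `project.list_datasets()`, and it assumes each entry can be read like a dict with 'name' and 'tags' keys, as the snippet above does. It also swaps in the standard-library csv module instead of to_excel() so that it runs without pandas or DSS.

```python
import csv
import io


def dataset_tag_rows(datasets):
    """Yield one (dataset_name, tag) pair per tag, skipping untagged datasets."""
    for ds in datasets:
        # 'tags' may be missing or empty; treat both as "no tags"
        for tag in ds.get("tags") or []:
            yield ds["name"], tag


def rows_to_csv(rows):
    """Render the pairs as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["dataset", "tag"])
    writer.writerows(rows)
    return buf.getvalue()


# Stand-in for project.list_datasets(); the names and tags are made up.
sample = [
    {"name": "orders", "tags": ["sql_dataset", "raw"]},
    {"name": "customers", "tags": []},
    {"name": "orders_prepared", "tags": ["sql_dataset"]},
]

rows = list(dataset_tag_rows(sample))
csv_text = rows_to_csv(rows)
print(csv_text)
```

The same rows could instead be passed to pd.DataFrame(rows, columns=["dataset", "tag"]) and written with to_excel(), keeping the rest of the answer unchanged.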

Answers

  • dima_naboka
    dima_naboka Dataiker, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts Posts: 28 Dataiker

    Hello,

    You can do this from the Dataset menu in the GUI, as well as from a notebook in the project:

    import dataiku

    client = dataiku.api_client()
    project = client.get_project(dataiku.get_custom_variables()["projectKey"])
    datasets = project.list_datasets()
    tag_name = 'sql_dataset'

    # Print every dataset that carries the tag
    for index in range(len(datasets)):
        if datasets[index]['tags']:
            if tag_name in datasets[index]['tags']:
                print("dataset '{}' is tagged with '{}'".format(datasets[index]['name'], tag_name))
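
If the matching names are needed as a list rather than printed, the tag check can be pulled into a small function. This is a minimal sketch, not part of the DSS API: `datasets_with_tag` and the sample entries are hypothetical, the sample list stands in for `project.list_datasets()`, and it assumes each entry can be read like a dict with 'name' and 'tags' keys, as the loop above does.

```python
def datasets_with_tag(datasets, tag_name):
    """Return the names of all datasets whose 'tags' list contains tag_name."""
    return [
        ds["name"]
        for ds in datasets
        # 'tags' may be missing or empty; treat both as "no tags"
        if tag_name in (ds.get("tags") or [])
    ]


# Stand-in for project.list_datasets(); the names and tags are made up.
sample = [
    {"name": "orders", "tags": ["sql_dataset", "raw"]},
    {"name": "customers", "tags": []},
    {"name": "orders_prepared", "tags": ["sql_dataset"]},
]

print(datasets_with_tag(sample, "sql_dataset"))
```

Keeping the filter free of any DSS calls means it can be tried on plain dictionaries first and then fed the real dataset list inside a notebook.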

  • sj0071992
    sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron

    Hi,

    Can we get the dataset names and their corresponding tags in an Excel sheet?
