Research by data

Tuong-Vi
Tuong-Vi Partner, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Neuron 2020, Dataiku DSS Adv Designer, Registered, Neuron 2021, Neuron 2022 Posts: 33 Partner

Hello community,

I know there is already a data catalog in DSS, but I think it would be very useful to have this functionality at the project level:

Capture_dss.PNG

I can find a particular recipe or dataset, but I would also like to see where a given piece of data is stored and used across the entire Flow.

Have a nice day

1 vote

In the Backlog

Comments

  • ElisaS
    ElisaS Dataiker, Registered, Product Ideas Manager Posts: 15 Dataiker

    Hi @Tuong-Vi!

    I'm not sure I fully understand the feature request here. Would you like to see all available datasets from this search bar (i.e. the same results as in the Data Catalog), or is it something else, linked to the datasets used in the project?

    Elisa

  • Tuong-Vi
    Tuong-Vi Partner, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Neuron 2020, Dataiku DSS Adv Designer, Registered, Neuron 2021, Neuron 2022 Posts: 33 Partner

    Hello @ElisaS,

    Currently, at the project level, when I want to check whether a particular piece of data exists or is used, I have to pick one dataset, click on it, and use the panel to search whether the column name (for example "product_name") is in its schema:

    Sans titre.png

    Some users have told me that it would be useful to have a search bar at the project level (or, why not, another "data" option in the data catalog) to quickly see all the datasets that have "product_name" in their schema.

    Hope this helps,

    Tuong-Vi

  • ElisaS
    ElisaS Dataiker, Registered, Product Ideas Manager Posts: 15 Dataiker

    Hi @Tuong-Vi,

    Thanks for the clarification. When you enter a column name in the Catalog's search bar, you get the datasets that have this column, but this may not be very clear in the UI since we don't display the matching column in the results.

    I have added the request to have this at the project level in our backlog. We can't provide a timeline at this point, but be sure to check back for updates!

    Elisa

  • natejgardner
    natejgardner Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 151 Neuron

    I agree, this would be a really helpful feature, especially if computed columns were also searchable here. One of my biggest challenges in large projects with hundreds of datasets is finding my custom columns so they can be copied onto other datasets. But in general, being able to quickly check which datasets in my project have a particular field would narrow down searches a lot. I've run into the need to search for datasets that contain a particular column within my project at least three times in the last month. It'd be really powerful if the search feature could be upgraded to index column-level metadata, like names, types, source, and, if already collected, column-level metrics. My current main project has over 500 datasets, spread over about 20 flow zones, so any indexing and search features to make projects of that scale more navigable are very welcome in my team!
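
Until such a search exists at the project level, the idea discussed in this thread (finding every dataset whose schema contains a given column, such as "product_name") can be approximated with a small column index. The following is only a minimal sketch, not a Dataiku feature: the helper names `build_column_index` and `find_datasets_with_column` and the example data are hypothetical, and the code assumes each schema is a dict shaped like `{"columns": [{"name": ..., "type": ...}]}`.

```python
from collections import defaultdict


def build_column_index(schemas):
    """Map each column name (lower-cased) to the datasets that contain it.

    `schemas` maps a dataset name to a schema dict shaped like
    {"columns": [{"name": "...", "type": "..."}]} (an assumption of this sketch).
    """
    index = defaultdict(list)
    for dataset_name, schema in schemas.items():
        for column in schema.get("columns", []):
            index[column["name"].lower()].append(
                {"dataset": dataset_name, "type": column.get("type")}
            )
    return dict(index)


def find_datasets_with_column(index, column_name):
    """Return the sorted names of datasets whose schema has `column_name`."""
    return sorted(hit["dataset"] for hit in index.get(column_name.lower(), []))


# Hypothetical project content, for illustration only.
example_schemas = {
    "orders": {"columns": [{"name": "order_id", "type": "bigint"},
                           {"name": "product_name", "type": "string"}]},
    "products": {"columns": [{"name": "product_name", "type": "string"},
                             {"name": "price", "type": "double"}]},
    "customers": {"columns": [{"name": "customer_id", "type": "bigint"}]},
}

index = build_column_index(example_schemas)
print(find_datasets_with_column(index, "product_name"))  # ['orders', 'products']
```

In a real project, the `schemas` mapping would have to be collected first, for example by looping over the project's datasets with the Dataiku Python API client and reading each schema. The exact calls are worth checking against the API documentation for your DSS version. The lookup is case-insensitive by design, which is a choice of this sketch and not something requested in the thread.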
