Search by data
Hello community,
I know there is already a data catalog in DSS, but I think it would be very useful to have this functionality at project level:
I can find a particular recipe or dataset, but I would also like to see where a piece of data is stored and used across the entire Flow.
Have a nice day
Comments
-
Hi @Tuong-Vi!
I'm not sure I fully understand the feature request here. Would you like to see all available datasets from this search bar (i.e. the same results as in the Data Catalog), or is it something else linked to the datasets used in the project?
Elisa
-
Tuong-Vi
Hello @ElisaS,
Actually, at project level, when I want to check whether a particular piece of data exists or is used, I have to pick a dataset, click on it, and use the search bar in the panel to check whether the column name (for example "product_name") is in the schema.
Some users have told me that it would be useful to have a search bar at project level (or, why not, another "data" option in the data catalog) to quickly see all the datasets that have "product_name" in their schema.
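To make the request concrete, here is a minimal sketch of the lookup I have in mind. It assumes the schemas of a project's datasets are already available as a plain Python dict; the `SCHEMAS` data and the `datasets_with_column` helper are hypothetical and are not part of the DSS API.

```python
from typing import Dict, List

# Hypothetical stand-in for the schemas of a project's datasets:
# dataset name -> list of column names.
SCHEMAS: Dict[str, List[str]] = {
    "orders": ["order_id", "product_name", "quantity"],
    "products": ["product_id", "product_name", "price"],
    "customers": ["customer_id", "email"],
}


def datasets_with_column(schemas: Dict[str, List[str]], column: str) -> List[str]:
    """Return the sorted names of datasets whose schema contains `column`.

    The match is case-insensitive and ignores surrounding whitespace.
    """
    wanted = column.strip().lower()
    return sorted(
        name
        for name, columns in schemas.items()
        if any(col.lower() == wanted for col in columns)
    )


print(datasets_with_column(SCHEMAS, "product_name"))  # ['orders', 'products']
```

Today this check has to be done by hand, one dataset at a time; a project-level search bar would do the equivalent of this loop in one go.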
Hope this helps,
Tuong-Vi
-
Hi @Tuong-Vi,
Thanks for the clarification. When you enter a column name in the Catalog's search bar, you get the datasets that contain this column, but this may not be obvious in the UI since we don't display the matching column.
I have added the request to have this at the project level in our backlog. We can't provide a timeline at this point, but be sure to check back for updates!
Elisa
-
I agree, this would be a really helpful feature, especially if computed columns were also searchable here. One of my biggest challenges in large projects with hundreds of datasets is finding my custom columns so they can be copied onto other datasets. But in general, being able to quickly check which datasets in my project have a particular field would narrow down searches a lot. I've run into the need to search for datasets that contain a particular column within my project at least three times in the last month. It'd be really powerful if the search feature could be upgraded to index column-level metadata, like names, types, source, and, if already collected, column-level metrics. My current main project has over 500 datasets, spread over about 20 flow zones, so any indexing and search features to make projects of that scale more navigable are very welcome in my team!
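As a rough illustration of the kind of column-level index I mean, here is a small sketch. It assumes each dataset's schema has already been exported as (column name, type) pairs; `PROJECT_SCHEMAS`, `build_column_index`, and `search_columns` are hypothetical names for illustration, not DSS features or API calls.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical export of a project's schemas:
# dataset name -> list of (column name, column type).
PROJECT_SCHEMAS: Dict[str, List[Tuple[str, str]]] = {
    "orders": [("order_id", "bigint"), ("product_name", "string")],
    "products_prepared": [("product_name", "string"), ("margin_pct", "double")],
    "margins_by_zone": [("zone", "string"), ("margin_pct", "double")],
}

ColumnIndex = Dict[str, List[Tuple[str, str]]]


def build_column_index(schemas: Dict[str, List[Tuple[str, str]]]) -> ColumnIndex:
    """Build an inverted index: lowercased column name -> [(dataset, type), ...]."""
    index = defaultdict(list)
    for dataset, columns in schemas.items():
        for name, col_type in columns:
            index[name.lower()].append((dataset, col_type))
    return dict(index)


def search_columns(index: ColumnIndex, fragment: str) -> ColumnIndex:
    """Return index entries whose column name contains `fragment` (case-insensitive)."""
    needle = fragment.lower()
    return {name: hits for name, hits in sorted(index.items()) if needle in name}


index = build_column_index(PROJECT_SCHEMAS)
for name, hits in search_columns(index, "margin").items():
    print(name, hits)
```

Built once per project, an index like this answers "which datasets have this field, and with what type" without opening each dataset, and the same structure could carry source or column-level metrics if those are already collected.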