Catalog DSS

Solved!
sridarvs
Level 1
Hi All,

We are trying to find where a given text is referenced anywhere within a Dataiku project.

Say, for example, I have a dataset "CUSTOMER" and I need to identify all the recipes and scenarios that use it. Note that in a few recipes the dataset is not marked as an input, but it is still referenced in the SQL code.

Currently, we use the Catalog as a global search to identify those cases. However, we have noticed that the global search also returns recipes that contain only part of the text, for example CUST in a few SQL recipes.

Does Dataiku offer any other utilities or approaches to identify and track these cases?

 

Regards,

Sridar Venkatesan

1 Solution
AlexT
Dataiker

Hi @sridarvs ,

Catalog search does not restrict results to exact matches, so it will match on both CUST and CUSTOMER.

One way would be to use the Python API to loop through all projects and recipes, and search the input/output datasets or the code for your criteria, depending on where you expect the string to appear.
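A minimal sketch of that loop, assuming a `dataikuapi` client (for example from `dataiku.api_client()` inside DSS). The helper names `mentions` and `find_references` are hypothetical, and the matching is case-sensitive on whole identifiers, so CUST or CUSTOMER_ID will not count as CUSTOMER. It only covers recipes, not scenarios, and it has not been tested against every recipe type.

```python
import re


def mentions(text, name):
    """Return True if `name` occurs in `text` as a whole word (not as a substring)."""
    pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
    return re.search(pattern, text or "") is not None


def find_references(client, name):
    """List (project_key, recipe_name, declared, in_payload) for recipes mentioning `name`.

    `declared` is True when the name appears among the recipe's input/output refs.
    `in_payload` is True when it appears in the recipe payload, which holds the
    code for code recipes such as SQL.
    """
    hits = []
    for project_key in client.list_project_keys():
        project = client.get_project(project_key)
        for item in project.list_recipes():
            recipe_name = item["name"]
            settings = project.get_recipe(recipe_name).get_settings()
            refs = settings.get_flat_input_refs() + settings.get_flat_output_refs()
            declared = any(mentions(ref, name) for ref in refs)
            in_payload = mentions(settings.get_payload(), name)
            if declared or in_payload:
                hits.append((project_key, recipe_name, declared, in_payload))
    return hits


# Usage inside DSS (assumption: run from a notebook or recipe on the instance):
#   import dataiku
#   for hit in find_references(dataiku.api_client(), "CUSTOMER"):
#       print(hit)
```

Rows where `declared` is False and `in_payload` is True are the cases described in the question: the dataset is used in the SQL but not marked as an input.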

 
Or simply run a recursive grep:

grep -rw "string" DATADIR/config/projects

This could be sufficient for your use case. The -w flag restricts matches to whole words, so it should return only the config files, recipes, and code that contain the exact string.
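If grep is not available on the host, the same whole-word scan can be approximated in Python. This is a rough sketch only: the function name `grep_whole_word` is hypothetical, and the root path you pass in is assumed to be the same DATADIR/config/projects directory used in the grep command.

```python
import re
from pathlib import Path


def grep_whole_word(root, word):
    """Return (relative_path, line_number, line) for each whole-word match under `root`."""
    # Lookarounds mimic grep -w: the match must not touch a letter, digit or underscore.
    pattern = re.compile(r"(?<!\w)" + re.escape(word) + r"(?!\w)")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going on files that are not valid UTF-8.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path.relative_to(root)), number, line.strip()))
    return hits


# Usage (replace the path with your own data directory):
#   for rel_path, number, line in grep_whole_word("DATADIR/config/projects", "CUSTOMER"):
#       print(f"{rel_path}:{number}: {line}")
```

Because the underscore counts as a word character, a search for CUSTOMER will skip CUSTOMER_ID as well as CUST.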
