Catalog DSS

sridarvs Registered Posts: 3 ✭✭✭

Hi All,

We are trying to find where a given piece of text is referenced anywhere within a Dataiku project.

Say, for example, I have a dataset "CUSTOMER" and I need to identify all the recipes/scenarios that use it. Note that in a few recipes the dataset is not marked as an input, yet it is still referenced in the SQL code.

Currently, we are using the Catalog as a global search to identify those cases. However, we have noticed that the global search sometimes returns recipes that contain only part of the text, for example CUST in a few SQL recipes.

Are there any other utilities/approaches in Dataiku to identify and track these cases?

Regards,

Sridar Venkatesan

Best Answer

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    Answer ✓

    Hi @sridarvs,

    Catalog search is not limited to exact matches, so it will match on both CUST and CUSTOMER.

    One way would be to use the Python API to loop through all projects and recipes and search the input/output datasets or the code for the string, depending on where you expect it to appear.
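A minimal sketch of that Python API loop, under some assumptions: `whole_word_pattern` and `find_recipe_references` are hypothetical helper names, the match is case-insensitive (an assumption that suits most SQL identifiers), and the `client` is expected to expose `list_project_keys`, `get_project`, `list_recipes`, `get_recipe`, `get_settings`, `get_flat_input_refs`, `get_flat_output_refs` and `get_payload` as the `dataikuapi` client does in recent DSS versions. It only covers recipes, not scenarios, so please check it against your own DSS version before relying on it.

```python
import re


def whole_word_pattern(term):
    """Build a regex that matches `term` only as a whole word.

    Letters, digits and underscores count as word characters, so
    CUSTOMER matches neither CUST nor CUSTOMER_ID.
    Case-insensitive matching is an assumption made for this sketch.
    """
    return re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)


def find_recipe_references(client, term):
    """Return (project_key, recipe_name, in_inputs_outputs, in_code) tuples.

    `client` is assumed to behave like a dataikuapi DSSClient.
    """
    pattern = whole_word_pattern(term)
    hits = []
    for project_key in client.list_project_keys():
        project = client.get_project(project_key)
        for item in project.list_recipes():
            name = item["name"]
            settings = project.get_recipe(name).get_settings()
            # Declared inputs and outputs of the recipe.
            refs = settings.get_flat_input_refs() + settings.get_flat_output_refs()
            in_io = any(pattern.search(ref) for ref in refs)
            # Raw payload: the code for SQL/Python recipes, JSON for visual ones.
            payload = settings.get_payload() or ""
            in_code = bool(pattern.search(payload))
            if in_io or in_code:
                hits.append((project_key, name, in_io, in_code))
    return hits
```

Inside DSS, the client could be obtained with `dataiku.api_client()` and the function called as `find_recipe_references(client, "CUSTOMER")`. Recipes reported with `in_code` true but `in_inputs_outputs` false are the ones where the dataset is used in SQL without being declared as an input.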


    Or maybe simply do a recursive grep:

    grep -rw "string" DATADIR/config/projects

    This could be sufficient for your use case. The -w flag restricts matches to whole words, so a search for CUST will not match CUSTOMER. It should return only the config files, recipes and code that match.
