The Dataiku Frontrunner Awards have launched to recognize your achievements! SUBMIT YOUR ENTRY

Smart indexing: recommend index based on downstream recipes

0 Kudos

It would be helpful if on the index selection menu for a dataset, some smart values could be displayed based on downstream recipes, and if in the recipe creation views, upstream datasets could be reindexed to optimize them as well.

For example, after I've created two join recipes downstream of a dataset, on the index screen in dataset settings, I'd like join columns from each recipe to be highlighted, suggesting index values based on which joins would be optimized. Same goes for filters, windows, groups, etc. If combined with theIndex cardinality UI , this could also help identify which areas need the performance boost the most.

This could also be used to recommend secondary indexes (including value-ordered NUSIs for inequality joins and filters).

Similarly, if I'm creating a recipe, I'd like a button to automatically optimize the indexes of some or all upstream datasets for my recipe. Or in the case of a join recipe, for a specific join.