Enum transform processor

Find and replace already fills the role of transforming enum values in tables. In certain enterprise databases, enums are a really common pattern. As a result, I often need to type the same enum mapping into a find and replace across 10-12 different datasets. While it's true I can go back and copy/paste the values, it'd be great if I could save the mapping under a name, then optionally indicate that a column in another dataset follows the same enum so it is transformed automatically. That way, if I need to edit the enum, I can edit it in one place, and transforming the enum from numbers to meaningful values becomes just a couple of keystrokes.

Additionally, in certain databases, there are built-in functions to look up enum maps. For example, in Oracle, xform_enum() is available. For databases where such features are available, it'd be great to save the effort entirely and just type the name of the enum without ever typing the mapping. Alternatively, perhaps an enum column meaning could provide similar functionality, with the added benefit of generating local context menus prompting users to quickly assign the correct enum and convert automatically. Some enums in my company are quite large (50+ mapped values), so being able to look up those values from the database would be a big time saver.




@natejgardner ,

While we wait for this feature, I have two proposals for potential workarounds.

1. Create a lookup table by hand, and do a Join to that table.

2. Create a global or project variable and reference it in every place where the same list of values needs to be transformed in the same way.

Does either of these approaches help you? Understanding why they do or don't work for you might be helpful to the Dataiku Development teams.
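For what it's worth, the first workaround boils down to an ordinary lookup join. Here is a minimal sketch using Python's built-in sqlite3 module so it runs anywhere; the table names, column names, and enum values are all made up for illustration, and in Dataiku the same thing would presumably be done with a Join recipe rather than hand-written SQL.

```python
import sqlite3

# Hypothetical tables and values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_status_enum (code INTEGER PRIMARY KEY, label TEXT);
    INSERT INTO order_status_enum VALUES (0, 'NEW'), (1, 'SHIPPED'), (2, 'CANCELLED');
    CREATE TABLE orders (order_id INTEGER, status INTEGER);
    INSERT INTO orders VALUES (100, 1), (101, 0), (102, 7);
""")

# LEFT JOIN keeps rows whose code is missing from the lookup table;
# COALESCE falls back to the raw code (as text) so unmapped values stay visible.
rows = conn.execute("""
    SELECT o.order_id, COALESCE(e.label, CAST(o.status AS TEXT)) AS status_label
    FROM orders AS o
    LEFT JOIN order_status_enum AS e ON e.code = o.status
    ORDER BY o.order_id
""").fetchall()
conn.close()

for order_id, status_label in rows:
    print(order_id, status_label)
```

The lookup table is the single place to edit the enum, and every dataset that joins to it picks up the change.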

Thanks @tgb417 ,

The first method works, but is more cumbersome than typing the mappings with find and replace. It's doable, and in some databases, enums are conveniently defined in tables, so that's fine. It provides the advantages I listed, but at a cost of convenience to workflow, which in practice makes it unlikely to be used.

I'm not quite sure how I could use a global or project variable to go about this. Would I need to create one variable for every key/value pair? Or could I create a variable for an entire mapping and reuse it?
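One possible shape for the second workaround, as a sketch rather than a confirmed Dataiku feature: store the entire mapping as a single JSON object in one variable and parse it wherever it is needed, instead of one variable per key/value pair. The variable content, the helper names load_enum and apply_enum, and the sample rows below are hypothetical. The sketch uses a plain string in place of the variable so it runs outside Dataiku; in a Python recipe the value would presumably be read with dataiku.get_custom_variables(), which I have not tested for this purpose.

```python
import json

# Hypothetical variable content: one JSON object holding the whole mapping.
# JSON object keys are always strings, so codes are compared as strings.
ORDER_STATUS_ENUM_JSON = '{"0": "NEW", "1": "SHIPPED", "2": "CANCELLED"}'


def load_enum(raw_json):
    """Parse a JSON-encoded enum mapping into a dict of str -> str."""
    mapping = json.loads(raw_json)
    if not isinstance(mapping, dict):
        raise ValueError("enum mapping must be a JSON object")
    return {str(k): str(v) for k, v in mapping.items()}


def apply_enum(rows, column, mapping):
    """Return copies of rows with `column` translated; unmapped codes are left unchanged."""
    out = []
    for row in rows:
        new_row = dict(row)
        code = new_row[column]
        new_row[column] = mapping.get(str(code), code)
        out.append(new_row)
    return out


status_enum = load_enum(ORDER_STATUS_ENUM_JSON)
orders = [{"order_id": 100, "status": 1}, {"order_id": 102, "status": 7}]
translated = apply_enum(orders, "status", status_enum)
print(translated)
```

Leaving unmapped codes untouched is a deliberate choice here, so that a value missing from the mapping shows up in the output instead of silently becoming empty.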