Rename columns with Regular Expressions in a Visual Prepare Recipe

tgb417
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron

User Story:

As a Data Analyst or Subject Matter Expert, I would like to use a visual Prepare recipe to bulk clean up column names produced by the visual Group, Window, and Pivot recipes. These recipes often add suffixes such as "_max", "_concat", and "_min" to column names. I would like a visual Prepare recipe step that finds a subset of columns with a regular expression like .*_count and replaces the _count with nothing.
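To make the request concrete, here is a minimal sketch in plain Python of the behavior I have in mind. It is not Dataiku code and does not use any DSS API; the function name build_rename_map and the sample column names are made up for illustration.

```python
import re

def build_rename_map(columns, pattern=r"_count$", replacement=""):
    """Return {old_name: new_name} for every column whose name changes.

    Hypothetical helper: `pattern` is the substring to find in the column
    name and `replacement` is the constant to put in its place.
    """
    compiled = re.compile(pattern)
    renames = {}
    for name in columns:
        new_name = compiled.sub(replacement, name)
        if new_name != name:  # only keep columns the pattern actually touched
            renames[name] = new_name
    return renames

# Sample column names, similar to what a Group recipe might produce.
columns = ["customer_id", "orders_count", "amount_max", "visits_count"]
renames = build_rename_map(columns)
```

Only the two columns ending in _count end up in the mapping; customer_id and amount_max are left alone.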

COS

* Don't change the default behavior of the Group by, Window, and Pivot visual recipes. If you want to address the issue there, dropping those column name suffixes should be an advanced, optional parameter.
* I would prefer to see this fixed by making the column rename step of the visual Prepare recipe smarter, allowing all of the usual column selection options: single, multiple, pattern, and all.
* I would prefer the renames to work like a one-time regular expression find and replace, where you specify the substring to find in the column name and the constant to replace it with.
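The two preferences above (the usual column selection modes, plus a find and replace on the name) could be sketched like this. This is plain Python under my own assumptions, not how the Prepare recipe is implemented; select_columns, rename_selected, and the mode strings are hypothetical names.

```python
import re

def select_columns(columns, mode, names=(), pattern=None):
    """Pick columns the way a processor's column selector might (hypothetical)."""
    if mode == "single":
        if len(names) != 1:
            raise ValueError("single mode needs exactly one name")
        return [c for c in columns if c == names[0]]
    if mode == "multiple":
        wanted = set(names)
        return [c for c in columns if c in wanted]
    if mode == "pattern":
        regex = re.compile(pattern)
        return [c for c in columns if regex.fullmatch(c)]
    if mode == "all":
        return list(columns)
    raise ValueError(f"unknown mode: {mode}")

def rename_selected(columns, selected, find, replace_with):
    """Apply a regex find/replace to the selected column names only."""
    chosen = set(selected)
    regex = re.compile(find)
    return [regex.sub(replace_with, c) if c in chosen else c for c in columns]

columns = ["region", "sales_min", "cost_max", "notes_concat"]
selected = select_columns(columns, "pattern", pattern=r".*_(min|max)")
result = rename_selected(columns, selected, r"_(min|max)$", "")
```

Here notes_concat is untouched because the pattern selection did not pick it. A real implementation would also need to decide what to do when two columns collapse to the same new name, which this sketch ignores.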

Notes:

I recognize that this might be a big ask because of schema management. A recipe step that renames columns dynamically using patterns, rather than simply replacing a constant old name with a constant new name as is currently offered, might be challenging.
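One way to picture the trade-off, again as a plain Python sketch with made-up names (freeze_renames, apply_static_renames) and no claim about how DSS handles schemas internally: the pattern is resolved once against the columns known at design time, and only the resulting constant old/new pairs are applied afterwards.

```python
import re

def freeze_renames(design_schema, find, replace_with):
    """Resolve a pattern against the known columns into constant (old, new) pairs."""
    regex = re.compile(find)
    pairs = []
    for name in design_schema:
        new_name = regex.sub(replace_with, name)
        if new_name != name:
            pairs.append((name, new_name))
    return pairs

def apply_static_renames(columns, pairs):
    """Apply only the frozen pairs; the pattern itself is no longer consulted."""
    mapping = dict(pairs)
    return [mapping.get(c, c) for c in columns]

design_schema = ["store", "qty_count"]
pairs = freeze_renames(design_schema, r"_count$", "")

# Later the upstream recipe adds a new column that also matches the pattern.
later_columns = ["store", "qty_count", "returns_count"]
output = apply_static_renames(later_columns, pairs)
```

With this approach the output schema is predictable, but returns_count keeps its suffix until the pairs are regenerated. A fully dynamic rename would catch it, at the cost of an output schema that depends on the data flowing in.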


Released · Last Updated

Comments

  • Katie
    Katie Dataiker, Registered, Product Ideas Manager Posts: 105 Dataiker

    Thanks for the feedback @tgb417, we've logged this internally and will let you know of any updates.

  • AshleyW
    AshleyW Dataiker, Alpha Tester, Dataiku DSS Core Designer, Registered, Product Ideas Manager Posts: 161 Dataiker

    Hi,

    Updating this thread to let you know that we've made it easier to mass rename columns in the Prepare recipe in all 12.3.2+ versions of Dataiku. We have some limitations due to schema management with respect to dynamic schemas, as @tgb417 rightly pointed out, but here's what's a lot easier to do now:

    • Open the 'rename column' processor.
    • Notice the new 'mass renamings' button, which provides many options for mass renaming the columns in your dataset: find/replace, prefixes, suffixes, etc.
    • Apply your settings as needed.
    • The rename column processor will update with all the renamings.

    Cheers,

    Ashley

  • tgb417

    @AshleyW I’ve not found this yet. I’ll keep an eye out for this.
