How to fold multiple columns with several variables

dzi_mchi

Hi all,

I have a dataset with multiple columns and multiple variables.

When I use "Fold multiple columns", it doesn't work as expected. Folding by units first gives me the correct result, but when I add a second step to fold by sales, it multiplies my rows again. The idea is to fold my 4 variables at the same time. Does anyone have an idea?

fold.png
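
To make the row multiplication concrete outside DSS, here is a minimal pandas sketch. It is only an analogy: `melt` stands in for the fold step, and the column names (`product`, `units_2020`, `sales_2020`, ...) are hypothetical, not taken from the screenshot. Folding the units columns and then the sales columns separately pairs every units row with every sales row.

```python
import pandas as pd

# Hypothetical wide data: one column per (variable, year) pair.
wide = pd.DataFrame({
    "product": ["A"],
    "units_2020": [10], "units_2021": [11],
    "sales_2020": [100], "sales_2021": [110],
})

# First fold: only the units columns. One product row becomes 2 rows.
step1 = wide.melt(
    id_vars=["product", "sales_2020", "sales_2021"],
    value_vars=["units_2020", "units_2021"],
    var_name="units_year", value_name="units",
)

# Second fold: the sales columns. Each of the 2 rows is split in 2 again,
# so units from 2020 also get paired with sales from 2021 and vice versa.
step2 = step1.melt(
    id_vars=["product", "units_year", "units"],
    value_vars=["sales_2020", "sales_2021"],
    var_name="sales_year", value_name="sales",
)

print(len(step1), len(step2))
```

Half of the rows in `step2` combine a units value from one year with a sales value from another, which is why independent folds cannot produce the desired layout.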

Thanks,

dzi

Answers

  • Keiji (Dataiker)

    Hello @dzi_mchi,

    You can achieve that by chaining several steps in a Prepare recipe: a "Fold multiple columns by pattern" step, a "Split column" step, and a "Pivot" step.

    For example, assume you have the following dataset:

    1.png

    First, fold the "xxxx_YYYY" columns with a "Fold multiple columns by pattern" step as follows:

    2.png

    Next, split the column which contains "xxxx_YYYY" values by an underscore (_) with a "Split column" step as follows:

    3.png

    Then, pivot the dataset with a "Pivot" step as follows:

    4.png

    Finally, remove the unnecessary columns and rename columns if necessary as follows:

    5.png
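
The same fold, split, pivot, and clean-up sequence can be sketched in pandas for readers who want to see the reshaping logic in code. This is not how the Prepare recipe is implemented; it is a rough equivalent under assumptions. The column names (`product`, `units_2020`, `sales_2021`, ...) are hypothetical examples of the "xxxx_YYYY" pattern, and passing a list as `index` to `pivot` assumes pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical wide dataset following the "xxxx_YYYY" column pattern.
wide = pd.DataFrame({
    "product": ["A", "B"],
    "units_2020": [10, 20],
    "units_2021": [11, 21],
    "sales_2020": [100.0, 200.0],
    "sales_2021": [110.0, 210.0],
})

# Step 1 (fold by pattern): turn every "xxxx_YYYY" column into a
# (name, value) pair, keeping "product" as the identifier.
folded = wide.melt(id_vars=["product"], var_name="name", value_name="value")

# Step 2 (split column): split "units_2020" into "units" and "2020".
folded[["variable", "year"]] = folded["name"].str.split("_", n=1, expand=True)

# Step 3 (pivot): one row per (product, year), one column per variable.
tidy = folded.pivot(
    index=["product", "year"], columns="variable", values="value"
).reset_index()

# Step 4 (clean up): drop the leftover column-axis label from the pivot.
tidy.columns.name = None

print(tidy)
```

Because all variables are folded in a single pass and then pivoted back on the variable name, each (product, year) pair ends up on exactly one row, with no row multiplication.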

    I hope this helps. Please let us know if you have any further questions.

    Sincerely,
    Keiji
