How to fold multiple columns with several variables
Hi all,
I have a dataset with multiple columns and multiple variables.
When I use "Fold multiple columns", it doesn't work: folding by units first gives me the correct result, but when I add a fold by sales as the next step, it multiplies my rows again. The idea is to fold my 4 variables at the same time. Does anyone have an idea?
Thanks,
dzi
Answers
Keiji Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 52 Dataiker
Hello @dzi_mchi,
You can achieve that by combining several steps in a Prepare recipe: a "Fold multiple columns by pattern" step, a "Split column" step, and a "Pivot" step.
For example, assume you have the following dataset:
First, fold the "xxxx_YYYY" columns with a "Fold multiple columns by pattern" step as follows:
Next, split the column which contains "xxxx_YYYY" values by an underscore (_) with a "Split column" step as follows:
Then, pivot the dataset with a "Pivot" step as follows:
Finally, remove the unnecessary columns and rename columns if necessary as follows:
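For readers who want to check the logic outside the Prepare recipe, the same fold, split, pivot sequence can be sketched in pandas. This is only an illustration, not the Dataiku processors themselves: the sample data and the column names (`product`, `units_2020`, `sales_2020`, and so on) are assumptions, since the original dataset is not shown here.

```python
import pandas as pd

# Hypothetical wide dataset; the column names are assumptions for illustration.
wide = pd.DataFrame({
    "product": ["A", "B"],
    "units_2020": [10, 20],
    "units_2021": [11, 21],
    "sales_2020": [100.0, 200.0],
    "sales_2021": [110.0, 210.0],
})

# Step 1: fold every "xxxx_YYYY" column into (name, value) rows.
folded = wide.melt(id_vars="product", var_name="name", value_name="value")

# Step 2: split the folded column name on the underscore.
folded[["variable", "year"]] = folded["name"].str.split("_", n=1, expand=True)

# Step 3: pivot so that each variable gets its own column again,
# with one row per (product, year).
long = (
    folded.pivot(index=["product", "year"], columns="variable", values="value")
    .reset_index()
)

# Step 4: tidy up the leftover column-index name and order the rows.
long.columns.name = None
long = long.sort_values(["product", "year"]).reset_index(drop=True)

print(long)
```

Folding all the columns once and pivoting afterwards is what avoids the row multiplication from the question: each (product, year) pair ends up on a single row, with one column per variable.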
I hope this helps. Please let us know if you have any further questions.
Sincerely,
Keiji