Change column names using a metadata dataset
I have a metadata dataset:
TAG1 | tagname1 |
TAG2 | tagname2 |
TAG3 | tagname3 |
TAG4 | tagname4 |
TAG5 | tagname5 |
TAG6 | tagname6 |
TAG7 | tagname7 |
TAG8 | tagname8 |
TAG9 | tagname9 |
TAG10 | tagname10 |
TAG11 | tagname11 |
I have another source dataset with these columns:
TAG1 | TAG2 | TAG3 | TAG4 | TAG5 | TAG6 | TAG7 | TAG8 | TAG9 | TAG10 | TAG11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
I want the resulting dataset to look like this:
TAGname1 | TAGname2 | TAGname3 | TAGname4 | TAGname5 | TAGname6 | TAGname7 | TAGname8 | TAGname9 | TAGname10 | TAGname11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
VAl1 | VAl2 | VAl3 | VAl4 | VAl5 | VAl6 | VAl7 | VAl8 | VAl9 | VAl10 | VAl11 |
How can I do this? What kind of join would help me do this in Dataiku?
Answers
Hi @sasidharp
The task you have is not really one of joining, but of stacking. I'm sure it can be solved with joins as well, but that would make it more complex.
First, some definitions.
- The dataset containing TAG and tagname: let's call it tags.
- The dataset containing the values: let's call it values.
Now, for the actual steps.
1) Invert (transpose) the rows and columns of the tags dataset. This can be done with a Prepare recipe. Let's call the output tags_inverted.
2) Use a Stack recipe to combine the tags_inverted and values datasets. For step 3 to work, tags_inverted must be selected first, so that its row ends up at the top.
3) Finally, use a Prepare recipe to turn the values in the first row (index 0) into the new column names. The processor is called "Use values of a row as column names".
The result will look as you've specified.
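If it helps to see the logic outside the visual recipes, here is a minimal pandas sketch of the same three steps. It is only an illustration, not what the recipes run internally: the sample data is made up (3 tags instead of 11), and the column names TAG and tagname for the tags dataset are an assumption based on the definitions above.

```python
import pandas as pd

# Hypothetical sample data mirroring the question (3 tags instead of 11).
# The column names "TAG" and "tagname" are assumed, not taken from the real dataset.
tags = pd.DataFrame({
    "TAG": ["TAG1", "TAG2", "TAG3"],
    "tagname": ["tagname1", "tagname2", "tagname3"],
})
values = pd.DataFrame({
    "TAG1": ["VAl1", "VAl1"],
    "TAG2": ["VAl2", "VAl2"],
    "TAG3": ["VAl3", "VAl3"],
})

# Step 1: transpose tags so each TAG becomes a column holding its tagname.
tags_inverted = tags.set_index("TAG").T.reset_index(drop=True)

# Step 2: stack the two datasets, tags_inverted first so the names land in row 0.
stacked = pd.concat([tags_inverted, values], ignore_index=True)

# Step 3: use the values of row 0 as column names, then drop that row.
result = stacked.iloc[1:].reset_index(drop=True)
result.columns = stacked.iloc[0].tolist()

# For comparison: in pandas the same outcome is a direct rename with a mapping.
direct = values.rename(columns=dict(zip(tags["TAG"], tags["tagname"])))

print(result)
```

In a Python code recipe the direct rename at the end would be enough on its own; the longer version is there only to mirror the Prepare, Stack, Prepare flow step by step.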