Change column names using a metadata dataset

sasidharp Registered Posts: 27 ✭✭✭✭

I have a metadata dataset:

TAG1   tagname1
TAG2   tagname2
TAG3   tagname3
TAG4   tagname4
TAG5   tagname5
TAG6   tagname6
TAG7   tagname7
TAG8   tagname8
TAG9   tagname9
TAG10  tagname10
TAG11  tagname11

I have another source dataset with these columns and values:

TAG1  TAG2  TAG3  TAG4  TAG5  TAG6  TAG7  TAG8  TAG9  TAG10  TAG11
VAl1  VAl2  VAl3  VAl4  VAl5  VAl6  VAl7  VAl8  VAl9  VAl10  VAl11
VAl1  VAl2  VAl3  VAl4  VAl5  VAl6  VAl7  VAl8  VAl9  VAl10  VAl11
VAl1  VAl2  VAl3  VAl4  VAl5  VAl6  VAl7  VAl8  VAl9  VAl10  VAl11
VAl1  VAl2  VAl3  VAl4  VAl5  VAl6  VAl7  VAl8  VAl9  VAl10  VAl11
VAl1  VAl2  VAl3  VAl4  VAl5  VAl6  VAl7  VAl8  VAl9  VAl10  VAl11
VAl1  VAl2  VAl3  VAl4  VAl5  VAl6  VAl7  VAl8  VAl9  VAl10  VAl11

I want the resulting dataset to look like this:

TAGname1  TAGname2  TAGname3  TAGname4  TAGname5  TAGname6  TAGname7  TAGname8  TAGname9  TAGname10  TAGname11
VAl1      VAl2      VAl3      VAl4      VAl5      VAl6      VAl7      VAl8      VAl9      VAl10      VAl11
VAl1      VAl2      VAl3      VAl4      VAl5      VAl6      VAl7      VAl8      VAl9      VAl10      VAl11
VAl1      VAl2      VAl3      VAl4      VAl5      VAl6      VAl7      VAl8      VAl9      VAl10      VAl11
VAl1      VAl2      VAl3      VAl4      VAl5      VAl6      VAl7      VAl8      VAl9      VAl10      VAl11
VAl1      VAl2      VAl3      VAl4      VAl5      VAl6      VAl7      VAl8      VAl9      VAl10      VAl11
VAl1      VAl2      VAl3      VAl4      VAl5      VAl6      VAl7      VAl8      VAl9      VAl10      VAl11

How can we do this? What kind of join would help me do this in Dataiku?

Answers

  • Liev
    Liev Dataiker Alumni Posts: 176 ✭✭✭✭✭✭✭✭

    Hi @sasidharp

    The task you have is not really one of joining, but of stacking. I'm sure it could be solved with joins as well, but that would make it more complex.

    First, some definitions.

    - Dataset containing TAG and tagname, let's call this tags.


    - Dataset containing values, let's call this values.


    Now, for the actual steps.

    1) Invert (transpose) the rows and columns of our tags dataset. This can be achieved with a Prepare recipe.


    2) Use a Stack recipe to combine our Values and Tags_inverted datasets. It is very important for step 3 that Tags_inverted is selected first, so that the tag names end up in the first row of the stacked output.


    3) Finally, use a Prepare recipe to turn the values in the first row (index 0) into the new column names. The name of the processor is "Use values of a row as column names".


    The result will look as you've specified.
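
For anyone who prefers code over visual recipes, the three steps above can be roughly mirrored in pandas. This is only a sketch, not part of the original answer: the column names TAG and tagname, the variable names, and the inline sample data are assumptions based on the question. In a Dataiku Python recipe the two DataFrames would be read from the datasets instead of being built inline.

```python
import pandas as pd

# Hypothetical stand-ins for the two datasets from the question.
# The column names "TAG" and "tagname" are assumed, not confirmed.
tags = pd.DataFrame({
    "TAG": [f"TAG{i}" for i in range(1, 12)],
    "tagname": [f"tagname{i}" for i in range(1, 12)],
})
values = pd.DataFrame(
    [[f"VAl{i}" for i in range(1, 12)] for _ in range(6)],
    columns=[f"TAG{i}" for i in range(1, 12)],
)

# Step 1: transpose tags so every TAG becomes a column holding its tagname.
tags_inverted = tags.set_index("TAG").T.reset_index(drop=True)
tags_inverted.columns.name = None

# Step 2: stack the two frames, tags_inverted first, so the names sit in row 0.
stacked = pd.concat([tags_inverted, values], ignore_index=True)

# Step 3: use row 0 as the header and drop it from the data.
result = stacked.iloc[1:].reset_index(drop=True)
result.columns = stacked.iloc[0].tolist()

# Alternative in plain pandas: build a TAG -> tagname mapping and rename directly.
renamed = values.rename(columns=dict(zip(tags["TAG"], tags["tagname"])))

print(result.head())
```

Both result and renamed carry the tag names as headers over the unchanged value rows. The rename variant skips the stacking entirely, which is only possible because code can apply a lookup mapping directly.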
