How to pivot columns to rows

Level 3
I have a file with a variable set of columns that is updated each month (both the columns and the values under them change). I would like to turn most of the columns into rows so I can use the data in other processing. For instance, each app has a percentage under some of the columns, and across all columns the values add up to 100% (see below).

There is a reshape step that folds multiple columns, but you have to add each column explicitly. I can't figure out how the fold with pattern would work to get what I want.

Is there anything other than writing a python script to do this?

Thank you for all the help!

 

app       app_id      000489     000492      000520       001094        C00280      C00304   

myapp       1                                1.4             98.6

thisone       2            30                                                   25                45

 

Would like to pivot to:

app      app_id       numbers       values

myapp      1            000489      

myapp      1            000492      1.4

myapp      1            000520       98.6

myapp      1            001094        

myapp      1            C00280      

myapp      1            C00304

Dataiker

Hi

DSS handles data with schemas defined at design time, so datasets whose column names and number of columns vary are not an option. You should read the files with a Python recipe and pivot them with something like:

df.set_index(["app", "app_id"]).stack().reset_index()
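For reference, here is a slightly fuller sketch of the same idea using pandas `DataFrame.melt`, which keeps the empty cells as rows (as in the desired output above) and picks up whatever value columns exist that month. The sample data is only the "myapp" row from the question, the inline CSV stands in for the real file, and the helper name `columns_to_rows` is made up for illustration. How you read and write the data inside a DSS Python recipe is not shown here.

```python
import io

import pandas as pd

# Stand-in for the monthly file: only the "myapp" row from the question.
# Assumption: the real file is a delimited text file with a header row.
SAMPLE_CSV = """app,app_id,000489,000492,000520,001094,C00280,C00304
myapp,1,,1.4,98.6,,,
"""


def columns_to_rows(df, id_cols=("app", "app_id")):
    """Turn every column not listed in id_cols into (numbers, values) rows.

    The value columns are never named explicitly, so new or removed
    columns in next month's file need no code change.
    """
    return df.melt(
        id_vars=list(id_cols),
        var_name="numbers",   # former column header
        value_name="values",  # cell value, NaN where the cell was empty
    )


wide_df = pd.read_csv(io.StringIO(SAMPLE_CSV))
long_df = columns_to_rows(wide_df)
print(long_df)
```

If you would rather drop the empty cells, `long_df.dropna(subset=["values"])` should do it.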

 

Level 3
Author

Thank you for the help!! I still need to implement it, but I will mark this as solved and post a follow-up.
