Selectively tabulating columns based on name

Solved!
cdutoit
Level 3
Selectively tabulating columns based on name

Hi -

I'm trying to add column values based on the column name containing a specific word.

For example, here is a sample dataset that shows for each company, what the lead source was and count:

 

Deal NamePaid-LinkedIn-TopTenPaid-PPC-DisplayPaid-LinkedIn-Trends
Apple301
IBM010
Google110

 

This is showing that Apple deal originated from 3 people at apple clicking the "top ten" linked in ad, and one person clicking the "Trends" linkedin ad. In the real dataset, there are about 60 different lead sources (columns).

The output I'm seeking is: "For each deal show how many contacts came from linkedin", or more specifically referencing the above data set "For each deal, add together the numbers in every column where the column name contains 'LinkedIn'"

The output from the above table should be:

 All LinkedIn Sources
Apple4
IBM0
Google1

 

I'd appreciate any pointers of how I could go about trying to derive that.

 

Thanks

Chris

 

 

 

0 Kudos
1 Solution
Manuel
Dataiker Alumni

Hi, 

In a prepare recipe, you can use the "fold multiple columns by pattern" processor to transform your data from wide to long format. Use the smart pattern builder to fold only the columns that start with "Paid-Linkedin".

Then you can use a Group recipe with your key (Deal Name), adding the values from the fold value column.

Tell me if this helps.

Best regards

View solution in original post

0 Kudos
2 Replies
Manuel
Dataiker Alumni

Hi, 

In a prepare recipe, you can use the "fold multiple columns by pattern" processor to transform your data from wide to long format. Use the smart pattern builder to fold only the columns that start with "Paid-Linkedin".

Then you can use a Group recipe with your key (Deal Name), adding the values from the fold value column.

Tell me if this helps.

Best regards

0 Kudos
cdutoit
Level 3
Author

Wow ... brilliant. That does the trick nicely. Thanks so much for the tip.

0 Kudos