Check out the first Dataiku 8 Deep Dive focusing on Productivity on October 29th Read More

Merging Rows to Column Merge

Level 2
Merging Rows to Column Merge

 I have two datasheets that I want to merge. One data sheet has a the dates spanned across the columns with corresponding values(population infected) in each column for one row (county). In another datasheet I have the dates in one column with values(temperature) displayed in that column for each row (county).  

What I'm trying to do is merge the two datasets into one dataset so I have temperature, population, county in one place so I can do a time-series analysis to produce trends with population infected as a function of temperature as an example. 

How do I merge the two datasets?

3 Replies
Dataiker
Dataiker

Hi,

Based on the datasets you describe, I would suggest using:

1. "Prepare" recipe with the "Fold multiple columns" processor to transform your "population" dataset into a format with one date column and one value column. 

(I assume your "weather" dataset is already in this format)

2. "Join" recipe to join this new dataset on your "weather" dataset with your date columns as join keys.

If that doesn't work, could you please provide a sample or screenshot of your two datasets?

Best regards,

Alex

Level 2
Author

Thanks for your solution. I'm having trouble with the fold multiple columns function as I don't know what to put in the column fields. Do I place all the dates? Is there a transpose function I can perform?

I attached the sample data sets I'm trying to merge. 

Thanks,

Vinh

 

Dataiker
Dataiker

Hi, 

You can use the Fold processor as follows:

Screenshot 2020-05-07 at 14.52.28.png

Hope it helps,

Alex

0 Kudos