core designer certification

emma123
Level 1

Hello everyone,

 

I am taking the Core Designer certification, and I have a problem with step 3: Merge the information from the three datasets into a single dataset. We recommend using the CO2_and Oil.csv dataset as the base (left dataset) for merging other datasets.

 

When I do the left join, I don't get all the data from the datasets, so I can't get the 2008 to 2012 data ...

My question is: what are the steps to finish step 3 so that step 4 comes out correctly?

7 Replies
JordanB
Dataiker

Hi @emma123,

Please make sure that you have inferred the storage types on each input dataset. Note that you may have to select "check now" or "check again" before the infer types button becomes clickable.
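
For illustration only, here is a minimal Python sketch (not Dataiku code) of why storage types matter for the later year filter. The `Entity` and `Year` column names and the sample values are assumptions made up for the example, and `infer_int` is a hypothetical helper. While a year is stored as text, comparisons are alphabetical, so a numeric-looking filter can keep the wrong rows.

```python
# Hypothetical rows as they might look before type inference:
# every value, including the year, is still a string.
rows = [
    {"Entity": "Afghanistan", "Year": "2007"},
    {"Entity": "Afghanistan", "Year": "2008"},
    {"Entity": "Albania", "Year": "900"},
]


def infer_int(value):
    """Return value as an int when it parses cleanly, else None."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


# Text comparison is character by character, so "900" >= "2008" is True.
kept_as_text = [r for r in rows if r["Year"] >= "2008"]

# After converting the column to integers, the comparison is numeric.
typed = [{**r, "Year": infer_int(r["Year"])} for r in rows]
kept_as_int = [r for r in typed if r["Year"] is not None and r["Year"] >= 2008]
```

In this sketch the text comparison keeps the row for year "900", while the integer comparison keeps only 2008.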

Screenshot 2023-12-12 at 9.47.26 AM.png

I've just gone through the steps without an issue, so please read through the directions again.

Note that you may need two post-filter conditions on the join recipe, so that it keeps only rows that satisfy both >= 2008 and <= 2012.

Hope this helps!

Thanks!

Jordan

 

emma123
Level 1
Author

Hey, thanks for your response, I already did that ...

 

It's just that when I get here, there is a problem ... it only offers me the country 'Afghanistan', and I don't know what to choose ...

Capture d'écran 2023-12-12 161158.png

 

Capture d'écran 2023-12-12 161252.png

Also, I checked that all the source datasets run from 1800 to 2012. But after the left join I only have 1800 to 1848, which corresponds to the Afghanistan dates ...

 
 

 

JordanB
Dataiker

Hi @emma123,

Can you check your dates? You mentioned that the sources run from 1800 to 2012 and that after the left join you only have 1800 to 1848, corresponding to the Afghanistan dates. Note that the dates you need are 2008 to 2012.

Do not worry that you can only see Afghanistan: it is only showing a sample of your data here, and the sample is currently sorted. You should not need to touch anything in the entities.

All you need to do is:

1. Add the 3rd dataset to join all three datasets:

Screenshot 2023-12-12 at 11.35.37 AM.png

 

2. Add the post-filter (>= 2008, <= 2012)

Please then run the join recipe and you should see the correct results in the output dataset (985 rows, 11 columns).
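
The two steps above can be pictured with a small Python sketch. This is not what the join recipe runs internally; it is a plain illustration, and the `Entity`/`Year` join keys, the `left_join` helper, the dataset variable names and all values are assumptions made up for the example.

```python
KEYS = ("Entity", "Year")  # assumed join keys, for illustration only


def left_join(left, right, keys):
    """Left-join two lists of dicts on the given key columns."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    right_cols = sorted({c for row in right for c in row} - set(keys))
    joined = []
    for row in left:
        matches = index.get(tuple(row[k] for k in keys))
        if matches:
            joined.extend({**row, **m} for m in matches)
        else:
            # No match: keep the left row and leave the right columns empty.
            joined.append({**row, **{c: None for c in right_cols}})
    return joined


# Tiny made-up stand-ins for the three datasets.
co2_oil = [
    {"Entity": "Afghanistan", "Year": 2007, "oil": 1.0},
    {"Entity": "Afghanistan", "Year": 2008, "oil": 1.1},
    {"Entity": "Albania", "Year": 2012, "oil": 2.0},
    {"Entity": "Albania", "Year": 2013, "oil": 2.1},
]
meat = [{"Entity": "Afghanistan", "Year": 2008, "meat": 300}]
eggs = [{"Entity": "Albania", "Year": 2012, "eggs": 50}]

# Step 1: join all three datasets, with co2_oil as the left (base) dataset.
merged = left_join(left_join(co2_oil, meat, KEYS), eggs, KEYS)

# Step 2: post-filter with two conditions (>= 2008 and <= 2012).
result = [r for r in merged if 2008 <= r["Year"] <= 2012]
```

The left dataset keeps all of its rows through both joins, and only the post-filter drops the years outside 2008 to 2012.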

Thanks,
Jordan

emma123
Level 1
Author

Thank you so much! It works now! Just one question about step 5: when I build the new columns from Oil production (Etemad & Luciana) (terawatt-hours), meat_prod_tonnes, and Food Balance Sheets: Eggs - Production (FAO (2017)) (tonnes), I don't have any data in the column ... For example, this is the code I used for one of the created columns:
[["Oil production (Etemad & Luciana) (terawatt-hours)"]] / [[Population]]

But I don't have any numbers in the new column, and it's the same for the two other new columns.

Do you know why?

JordanB
Dataiker

@emma123 You need to look at "Hint 2" and select the link provided to find the syntax for using formulas on columns with spaces. Use the syntax shown in the link to create your formula.

emma123
Level 1
Author

I entered this, as you told me: numval([["Oil production (Etemad & Luciana) (terawatt-hours)"]]) / numval([["Population"]]). But I still don't get any results in my columns ...

 

It's still empty...

JordanB
Dataiker

Hi @emma123,

You only need to use numval() on the column that has spaces (i.e. numval("Oil production (Etemad & Luciana) (terawatt-hours)")/Population)
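
As a rough analogy in Python (not Dataiku formula syntax): a column whose name contains spaces has to be looked up by its exact name as a string, and the result is then divided by the population. The `numval` and `per_capita` functions below are hypothetical Python stand-ins written for this sketch, and the values are made up.

```python
OIL = "Oil production (Etemad & Luciana) (terawatt-hours)"


def numval(row, column):
    """Hypothetical stand-in: fetch a column by its exact name as a float.

    Returns None when the column is missing, blank, or not numeric.
    """
    try:
        return float(row.get(column))
    except (TypeError, ValueError):
        return None


def per_capita(row, column):
    """Divide the named column by Population, or return None if not possible."""
    value = numval(row, column)
    population = numval(row, "Population")
    if value is None or not population:
        return None
    return value / population


row = {OIL: "120.5", "Population": "2410000"}  # made-up values

oil_per_capita = per_capita(row, OIL)
# A name that does not match any column exactly gives an empty result here.
wrong_name = per_capita(row, '[["' + OIL + '"]]')
```

In this sketch, a lookup with a name that does not match the column exactly returns nothing, which is one way an output column can end up empty.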

Thanks,

Jordan
