I'm new to Dataiku and have been asked to join two datasets of flight data to capture the destination latitude and longitude for each flight. The first dataset has the airports (airport_icao) in a column, and the other has the destinations (destinations_icao) under a column in a row. How can I inner join the two datasets on those two variables when they don't have the same name? (I'm using Visual Recipes and am not familiar with SQL.) Also, how would I rename the columns? I've provided the two datasets below. Any input is appreciated, thank you!
From the dataset's Explore view, you can use the "Export" action to export a sample in CSV or Excel format. But you need to be sure that you can share the data in a public forum. If you can't share it, a screenshot with 3 or 4 rows of data (not just the column or row names) could be enough.
Thanks, that screenshot shows one of the datasets perfectly. Now I would need a similar screenshot of the other dataset. What you provided before (the table with 3 rows and 5 columns, and airport_icao plus coordinates) looks like a description of a table, not a screenshot of the data.
Sorry, I don't think I understand the exact problem. I was expecting that you had 2 datasets:
1) Dataset 1: is the one shown in the screenshot named "Screen Shot 2020-08-27 at 5.38.18 PM.png"
2) Dataset 2: another dataset that holds, for each "airport_icao", the corresponding longitude and latitude
By joining Dataset 1 with Dataset 2, you enrich Dataset 1 by adding two more columns with the latitude and longitude of the destination airport.
Is this correct? If so, what does Dataset 2 actually look like?
If not, I'm sorry, but I can't help yet, because I don't understand what you are trying to achieve.
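In case it helps to see the idea outside the visual recipe, here is a minimal pandas sketch of the join described above, under stated assumptions. The column names `airport_icao` and `destination_icao` come from this thread; `flight_id`, `latitude`, `longitude`, and all sample values are hypothetical and only for illustration. The point is that a join does not require the key columns to share a name: you just say which column to use on each side.

```python
import pandas as pd

# Hypothetical stand-in for Dataset 1 (flights); values are illustrative only.
flights = pd.DataFrame({
    "flight_id": [1, 2, 3],
    "destination_icao": ["KJFK", "EGLL", "ZZZZ"],
})

# Hypothetical stand-in for Dataset 2 (airport coordinates); values are approximate.
airports = pd.DataFrame({
    "airport_icao": ["KJFK", "EGLL", "LFPG"],
    "latitude": [40.64, 51.47, 49.01],
    "longitude": [-73.78, -0.45, 2.55],
})

# Inner join on key columns that have different names on each side.
joined = flights.merge(
    airports,
    how="inner",
    left_on="destination_icao",
    right_on="airport_icao",
)

# Drop the now-redundant key and rename the coordinate columns.
joined = joined.drop(columns="airport_icao").rename(
    columns={
        "latitude": "destination_latitude",
        "longitude": "destination_longitude",
    }
)

print(joined)
```

Because it is an inner join, the flight whose destination has no match in the airports table is dropped. The same logic should carry over to a visual Join recipe: choose the key column for each dataset in the join condition, even if the names differ.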
My apologies for the confusion, I am still learning this myself. I believe I am being asked to inner join the table in the third screenshot with the one in the second screenshot. "Join on destination_icao and airport_icao" are the exact instructions, which I assume refer to the column and the row respectively. However, I do not know how to inner join them, since one lies in a row. I hope that helps. If not, I appreciate your efforts!
Don't worry, no apologies needed!
For what they are asking you to do (a join), the dataset providing the coordinates for each airport_icao should have a schema similar to the following table:
or in another case, like this table:
But what you shared as your "second snapshot" or dataset doesn't contain any data to actually join on. Is that the only other dataset you have for the work they asked you to do?