Find Distance b/t Locations on Different Rows
Hello,
Is there a way to find the geographical distance in miles between addresses which are on different rows of data?
For example, my dataset would look similar to the one below, and I would want to know the difference in miles between the loading and shipping locations for each order:
Order#  Status  Address
123     Load    45 Main St. Akron, OH
123     Ship    38 Red St. Trip, UT
7789    Load    29 Bird St. Boise, ID
7789    Ship    51 Main St. Tahoe, CA
Operating system used: Windows
Answers
-
Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭
Hi,
Here is a suggestion:
- Your dataset needs a column to sort by. Perhaps you already have a timestamp. Otherwise, in a Prepare recipe, prefix the Status values with 1- and 2- (1-Load, 2-Ship) so that Load sorts before Ship.
- Use the Window recipe, partitioned by order number and sorted by your order column, to add the geopoint from the previous row to the current row.
- In a Prepare recipe, use the "compute distance between geopoints" processor
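For anyone who wants to sanity-check the result outside DSS, here is a minimal Python sketch of a distance calculation between two points. It assumes the addresses have already been geocoded to latitude/longitude, and it uses the haversine (great-circle) formula with a mean Earth radius, which is an approximation and not necessarily the method the processor uses. The function name `haversine_miles` is made up for this example.

```python
import math

# Mean Earth radius in miles; an approximation chosen for this sketch.
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))
```

Calling `haversine_miles(load_lat, load_lon, ship_lat, ship_lon)` with the coordinates of the Load and Ship rows of one order gives the straight-line distance, not the driving distance.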
I hope this helps
-
Thank you for the suggestions Manuel!
On the "Window" recipe, in the "Aggregations" section, how do I add the geopoint from the previous row to the current row?
-
Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭
Hi,
In the Window definition, define the frame to pick one preceding row.
In the Aggregations, use Last to pick the geopoint from the preceding row.
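To illustrate what this step is meant to produce, here is a rough Python sketch of the same idea outside DSS: group rows by order, sort them, and copy the geopoint of the preceding row onto the current row. The column names (`order`, `step`, `geopoint`, `prev_geopoint`) and the coordinates are placeholders invented for the example, not real geocoded addresses.

```python
from itertools import groupby
from operator import itemgetter


def add_previous_geopoint(rows):
    """Return copies of rows with 'prev_geopoint' taken from the
    preceding row of the same order (None for the first row)."""
    # Sort by partition key, then by the ordering column.
    ordered = sorted(rows, key=itemgetter("order", "step"))
    out = []
    for _, group in groupby(ordered, key=itemgetter("order")):
        prev = None
        for row in group:
            out.append({**row, "prev_geopoint": prev})
            prev = row["geopoint"]
    return out


# Placeholder data: coordinates are illustrative only.
rows = [
    {"order": 123, "step": "2-Ship", "geopoint": (40.0, -111.0)},
    {"order": 123, "step": "1-Load", "geopoint": (41.0, -81.5)},
    {"order": 7789, "step": "1-Load", "geopoint": (43.6, -116.2)},
    {"order": 7789, "step": "2-Ship", "geopoint": (39.1, -120.0)},
]
result = add_previous_geopoint(rows)
```

After this, each Ship row carries both its own geopoint and the Load geopoint, which is the shape the distance step needs.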
I hope this helps.