Hello, first-time question poster here and new to Dataiku. I've replicated a flow that I created in another tool, which uses a geo-join in a prepare recipe to find the nearest location between 2 lists of locations with longitude and latitude data points. When comparing the output of Dataiku with the output of the other tool, I am noticing some different nearest-location matches, as well as some variations in the distance between the same 2 locations. To account for the differences, I'd like to understand the methodology of the Dataiku geo-join for nearest location, but I haven't been able to find any documentation or references to it. Does Dataiku use a linear/straight-line distance between 2 points to geo-join, the shortest driving route, or some other approach?
Any insights into this would be much appreciated. Thanks!
Operating system used: Windows
I believe they use a JTS wrapper and this GeoTools class: https://docs.geotools.org/stable/javadocs/org/geotools/referencing/GeodeticCalculator.html
The write-up is here: https://blog.dataiku.com/building-the-geospatial-join-recipe-in-dataiku
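If it helps with comparing the two tools, here is a minimal Python sketch of why the choice of distance method can change both the distance and the nearest match. It is not Dataiku's implementation: it uses the haversine formula on a sphere, whereas GeodeticCalculator works on an ellipsoid, so the numbers will differ slightly from either tool. The function names and sample points are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def planar_degrees(lat1, lon1, lat2, lon2):
    """Naive straight-line distance computed directly on the degree values."""
    return math.hypot(lat2 - lat1, lon2 - lon1)

def nearest(origin, candidates, dist):
    """Return the name of the candidate closest to origin under the given distance.

    origin is (lat, lon); candidates is a list of (name, lat, lon).
    """
    return min(candidates, key=lambda c: dist(origin[0], origin[1], c[1], c[2]))[0]

# Hypothetical points at 60 degrees north, where a degree of longitude
# covers about half the ground distance of a degree of latitude.
origin = (60.0, 0.0)
candidates = [("A", 60.0, 1.5), ("B", 61.0, 0.0)]

print(nearest(origin, candidates, haversine_km))
print(nearest(origin, candidates, planar_degrees))
```

With these sample points the great-circle distance picks A, while the planar distance on raw degrees picks B. If the other tool measures distance on unprojected coordinates or on a different projection, that kind of mismatch could explain both the different matches and the different distances.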