Geospatial Search Made Easy

Step-by-step guide for implementing geospatial search

Photo by GeoJango Maps on Unsplash

Finding nearby restaurants, finding cabs, delivering stuff online, finding nearby pharmacies, finding your next right swipe, and many more. What’s the one thing which is common in the above applications? Answer: It’s the use of geospatial search. In simple words, geospatial search is finding something nearby within the given radius using geospatial data.

Location Intelligence

Location intelligence plays an important role in gaining various insights for various use cases. Even before the computers and advanced machine, location data helped businesses in gaining insights. With the advancement of technologies these applications for location data have only increased.

Photo by henry perks on Unsplash

Location data is everywhere. Our devices like mobiles, laptops, or even modern cars are equipped to share location data. This has resulted in many interesting applications for location data. One such simple application is geospatial search.

Geospatial Search

As discussed earlier, geospatial search is the process of searching given locations within the given radius based on the geopoints (latitude/ longitude) for that given location. In this article, we take a deep dive into geospatial searches and their implementation.

We will take an example of retail stores for our implementation of geospatial search. Suppose, we want to build an app for your customers in which you want to recommend the nearest stores to them from the given large database of stores. This can be achieved using the simple implementation of the geospatial search and will give us a good baseline for our implementation and understanding of geospatial search. Hopefully, this will also enable us to explore many more interesting implementations of geospatial search.

Photo by Jon Tyson on Unsplash

Problem Statement: Find all the retail stores within the given radius of the user.

Before jumping on to the solution, we first need to look into the given inputs. The main pre-requisite for geospatial searches is having the geocoded data ie. data with the latitude and longitude values, for both the source and target points.

There can be cases where we may not get the geocoded data but we are just provided with the physical address. In this, we can preprocess data to convert the physical address to the proper latitude and longitude values. There are many solutions available in the market for geocoding data.

Now we have the required data, let us quickly jump on to the solution….

Implementation

What is the first solution that comes to your mind? I’m sure the first and simplest solution that will come to your mind will be to find the distance. The main intuition behind this is we can use the lat/ long of the user and each retail store and find the distance between them. While this solution works perfectly fine but there are a few complexities to this that we need to take care of.

  • Since we are dealing with the points of the earth’s surface and the earth is not flat, simple distance formula will not suffice our needs to find the accurate distance between two points of the earth. For finding the distance between two points on Earth we did not just need to take care of the given points but also take into account the radius and curvature of Earth. Luckily, this is a solved problem and we can use Haversine Formula to easily calculate the distance between two geopoints.

Distance, d(kms) = 3963.0 * arccos[(sin(lat1) * sin(lat2)) + cos(lat1) * cos(lat2) * cos(long2 — long1)]

Following is the JAVA & SQL implementation for the above formula:

public double distance(double lat1, double lat2, double lon1, double lon2) {
    lon1 = Math.toRadians(lon1);
lon2 = Math.toRadians(lon2);
lat1 = Math.toRadians(lat1);
lat2 = Math.toRadians(lat2);
    // Haversine formula
double dlon = lon2 - lon1;
double dlat = lat2 - lat1;
double a =
Math.pow(Math.sin(dlat / 2), 2)
+ Math.cos(lat1) * Math.cos(lat2) * Math.pow(Math.sin(dlon / 2), 2);
    double c = 2 * Math.asin(Math.sqrt(a));

// calculate the result
return (c * 6371);
}
Select 
Id,
Name,
Lat,
Lon,
acos(sin(:lat)*sin(radians(Lat)) +
cos(:lat)*cos(radians(Lat))*cos(radians(Lon)-:lon)) * 6371 As Distance
From Stores
Where
acos(sin(:lat)*sin(radians(Lat)) +
cos(:lat)*cos(radians(Lat))*cos(radians(Lon)-:lon)) * 6371 < search_radius

So job done? Not yet !!! While above solution will work fine when we have a small search set. But as you can see in the above code/ query, the formula calculating the distance involves performing complex mathematical operations which results in performance degradation especially when we are dealing with a very large data set.

So what’s the solution?? How can we optimize this further?

Photo by jaikishan patel on Unsplash

Bounding Box

The main problem with the above solution was not the logic but the number of times we run this formula. To optimise the solution further we need to find a way we reduce the number of times we apply this formula or in other words, we want to reduce our search set.

How can we reduce our search set? A simple answer to this is using the bounding box. By bounds, we mean using the finding the min and max values for latitude and longitude within our stores will reside. Since we are provided with the search radius and geopoints for the user we can calculate min and max bounds within which our stores will reside. For more on the bounding box algorithm check.

Bounds Min-Max Latitude/ Longitude

Following the JAVA code for finding the bounds:

@Builder
public class Bounds {
private Double minLat;
private Double maxLat;
private Double minLon;
private Double maxLon;
}
public Bounds getLatLongBounds(double radius, double lat, double lon) {

double bound1 = radius / 6371 * 180 / PI;
double bound2 = bound1 / Math.cos(lat * PI / 180);
var bounds =
Bounds.builder()
.minLat(lat - bound1)
.maxLat(lat + bound1)
.minLon(lon - bound2)
.maxLon(lon + bound2)
.build();

return bounds;
}

We can use the bounds from the above code to filter our search set for the required result. This reduces our problem to just finding the range of values within the given bounds by simple comparison.

Select 
Id,
Name,
Lat,
Lon
From Stores
Where Lat Between :minLat And :maxLat
And Lon Between :minLon And :maxLon

Is the problem solved? Not yet! The main use for the bounding box algorithm is to reduce our search set to the given box. Still not enough to get an accurate result for our problem

Optimal Solution

Now as the final optimal solution, we will combine both the distance formula and the Bounding box algorithm to find the retail stores in the given radius.

  1. In the first step, we will get the bounds using the bounding box for a given user location and our search radius. This will drastically reduce our search set and remove most of the unwanted values.
  2. Now since our search set is reduced we can safely apply the distance formula to the filtered results from step 1 to get the more accurate search results with the optimal performance.
Select 
Id,
Name,
Lat,
Lon,
acos(sin(:lat)*sin(radians(Lat)) + cos(:lat)*cos(radians(Lat))*cos(radians(Lon)-:lon)) * 6371 As D
From Stores
Where
Lat Between :minLat And :maxLat
And Lon Between :minLon And :maxLon
And acos(sin(:lat)*sin(radians(Lat)) + cos(:lat)*cos(radians(Lat))*cos(radians(Lon)-:lon)) * 6371< :search_radius
Order By D

Pro Tips

  • Pre-processing the data for geocoding can reduce both time and cost of performing the search operation.
  • If you are implementing the above geospatial search programmatically, do consider using multithreading (parallel streams/Java threads) for filtering. Since each operation is independent of other values, operations need not be sequential.
  • If you are implementing the search as a database query then indexing the columns can significantly increase the speed of geospatial search.

Summary

In this article, we looked into what is Location intelligence and Geospatial Search. Then took a deep dive into implementing Geospatial searches both programmatically as well as through SQL Queries. We looked into a rather implementation to a more complex and optimized solution. And in the end, we wrapped our knowledge with some pro tips.

If you loved my work please like and share this article( it’s free :)). Also, do follow me for more articles like these.

Also, check out my other articles:

Geospatial Search Made Easy was originally published in Walmart Global Tech Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Article Link: Geospatial Search Made Easy. Step-by-step guide for implementing… | by Jaideep Pahwa | Walmart Global Tech Blog | Sep, 2023 | Medium