Finding Distance with Geocoded Data in TSQL
by Joe Finsterwald
This code sample applies to SQL Server 2000 and 2005. Examples are written in t-SQL. 
Contents
- The Problem
- The Solution
- Step 1: Constructing and Filling your Tables
- Step 2: Convert from Degrees to Radians
- Step 3: Define an Area and Retrieve a Subset Table
- Step 4: Use the Law of Cosines to find points within the Subset
- Step 5: Build a Stored Procedure and Put it all Together
- View the Demo
- Conclusion
The Problem:
You need to search a large number of geo-coded records by distance. Unfortunately, the point of origin for your calculation is not static. As a result, it is not possible to pre-calculate the information and write it to a table.
The solution:
The most elegant solution to this problem is to leverage the power of the database to do the heavy lifting for you. This problem
has been solved in a myriad of ways on different sites, but I wasn't able to find a pure SQL solution so I decided to implement one.
That said I'd like to give credit where credit is due. This article borrows heavily from an article by Mike Shaffer at the following URL: http://www.4guysfromrolla.com/webtech/010500-1.shtml.
This example uses a geo-coded zip code as the point of origin to derive a distance comparison between a large number of points in a
database. The beauty of this method is that it leverages the set based logic of SQL to derive a subset of data before it begins
calculating distance. I’ve also added in a unit conversion capability for purposes of internationalization.
Step 1: Constructing and Filling your Tables
I've include scripts to create and fill the two tables that I'm using for this example:
Step 2: Convert from Degrees To Radians
We need to build a utility function that will convert from degrees to radians because our algorithm is going to use trigonmeteric functions.
Step 3: Define an Area and Retrieve a Subset Table
Degrees latitude are uniformly the same distance apart from pole to pole. Each minute of latitude corresponds to one nautical mile. From
this we can create a trapazoid whose area contains a subset of the geocoded points we are interested in. This will significantly speed our query.
Step 4: Use the Law of Cosines to find points within the Subset.
This function incorporate the Law of Cosines which allows us to find two points on the surface of a sphere.
Step 5: Build a Stored Procedure and Put it all Together
I've written a stored procedure which calls the functions I've written in the preceding steps.
Conclusion:
Computing distances for large sets of data in a speedy way is easiest in the datalayer. Hopefully this example has demonstrated the
power of TSQL when it comes to doing heavy computation on a large set of data.