MarkMaunder dot com

How to create a ZIP code distance lookup table with 1 line of SQL

A while back, Jobster CTO Phil Bogle blogged about some of the tricks I’ve used to do fast location queries in SQL. The link to my SQL query to generate the zip lookup table for radius searches is now dead (a cybersquatter stole my domain name and I don’t want to discuss it!). So here’s the original blog post:

If you need to build a radius search for something on your website, then creating a zip code distance lookup table performs much better than calculating distance for every search. This monster chunk of SQL will create that table for you. It takes about 12 hours to run on a fast machine.

insert into zip_dist2 (fromzip, tozip, dist) select z1.zip, z2.zip, ROUND((3956 * 2 * atan2(sqrt((POW(sin(((z1.lat * (atan2(1,1) * 4) / 180) – (z2.lat * (atan2(1,1) * 4) / 180))/2.0),2) + cos((z2.lat * (atan2(1,1) * 4) / 180)) * cos((z1.lat * (atan2(1,1) * 4) / 180)) * POW(sin(((z1.lon * (atan2(1,1) * 4) / 180) – (z2.lon * (atan2(1,1) * 4) / 180))/2.0),2))),sqrt(1-(POW(sin(((z1.lat * (atan2(1,1) * 4) / 180) – (z2.lat * (atan2(1,1) * 4) / 180))/2.0),2) + cos((z2.lat * (atan2(1,1) * 4) / 180)) * cos((z1.lat * (atan2(1,1) * 4) / 180)) * POW(sin(((z1.lon * (atan2(1,1) * 4) / 180) – (z2.lon * (atan2(1,1) * 4) / 180))/2.0),2))))),2) as distance from zip_data as z1, zip_data as z2 where (3956 * 2 * atan2(sqrt((POW(sin(((z1.lat * (atan2(1,1) * 4) / 180) – (z2.lat * (atan2(1,1) * 4) / 180))/2.0),2) + cos((z2.lat * (atan2(1,1) * 4) / 180)) * cos((z1.lat * (atan2(1,1) * 4) / 180)) * POW(sin(((z1.lon * (atan2(1,1) * 4) / 180) – (z2.lon * (atan2(1,1) * 4) / 180))/2.0),2))),sqrt(1-(POW(sin(((z1.lat * (atan2(1,1) * 4) / 180) – (z2.lat * (atan2(1,1) * 4) / 180))/2.0),2) + cos((z2.lat * (atan2(1,1) * 4) / 180)) * cos((z1.lat * (atan2(1,1) * 4) / 180)) * POW(sin(((z1.lon * (atan2(1,1) * 4) / 180) – (z2.lon * (atan2(1,1) * 4) / 180))/2.0),2))))) < 100 and z2.zip != z1.zip

This generates a lookup table that contains two zip codes and their distance from each other for every zip in the USA within 100 miles of each other. It uses the Haversine formula to calculate the distance between two points on the surface of a sphere.

I used this when designing our vertical search engine to create the lookup tables we use for our radius search. I use MySQL for this. You’re also going to need a zip_data table that contains zip codes and their respective latitudes and longitudes. You can buy this data for about $50 from the many retailers online.

Leave a Comment

Your email address will not be published. Required fields are marked *

My name is Mark Maunder. I've been blogging since around 2003 when I started on Movable Type and ended up on WordPress which is what I use to publish today. With my wife Kerry, I'm the co-founder of Wordfence which protects over 5 million WordPress sites from hackers and is run by a talented team of 36 people. I'm an instrument rated pilot and I fly a Cessna 206 along with a 1964 Cessna 172 in the Pacific Northwest and Colorado. I'm originally from Cape Town, South Africa but live in the US these days. I code in a bunch of languages and am quite excited about our emerging AI overlords and how they're going to be putting us to work for them.