Finding things within some distance in SQL
Posted in Databases on Jan 2, 2011
One of the query optimization scenarios I’ve seen a lot over the years is finding something within some distance from a point. For example, finding people within some distance of yourself, apartments in a radius from a postal code, and so on.
These queries usually use the great-circle formula. That might be because Google finds lots of pages claiming that this is the right way to do a radius search. “The earth is not flat!”, they all say. That’s true, but it doesn’t mean that the great-circle formula is a good approach. It’s usually a really bad approach, in fact. It’s needlessly precise for most things, not precise enough for others, and it’s an expensive query to execute; all the trig functions tend to eat a bunch of CPU, and make it impossible to use ordinary indexes. This is true for all of the databases I’ve used – MySQL, Postgres, and SQL Server.
The great-circle formula is needlessly precise for a few reasons:
In cases where you really do need precision, there’s a reasonable chance that the great-circle formula still isn’t right for you, because not only is the earth not flat, the earth isn’t a sphere either.
What’s the optimization I usually suggest? It’s usually perfectly acceptable to just return results within a square centered on the point of interest. In most cases, the results will be just as satisfactory to the users. The remainder are usually very special cases.