16 Apr 2008

Filed under: django, geography, location


Django geography hacks

If you need to handle some basic geography in your Django application, but can't or don't want to use GeoDjango, here are some quick hacks to accomplish simple geocoding and distance calculations.

When I set out to add these to my app, I looked around for something already written, and of course found GeoDjango, which incorporates tons of GIS functions. But it's its own Django branch, with a hefty list of prerequisite software. It looks great for serious GIS projects, but it was more than I wanted to take on for this. Fortunately, I found a lighter alternative, geopy, which was just what I was after.

First, I had to geocode my locations, finding their latitude and longitude using addresses or postal codes. Google and Yahoo! have mapping APIs with excellent geocoding, but you're only allowed to use their geocoding data to display their maps. Fortunately, geopy supports a number of other geocoding services with better terms. I'll use geocoder.us, which is free for non-commercial use and very reasonable otherwise.

Once you have coordinates, geopy gives you a number of accurate, easy methods to calculate distance. So, on to integrating it into your Django model. Here's an example:

from django.db import models
from django.contrib.localflavor.us.us_states import STATE_CHOICES

import geopy.distance
from geopy import geocoders

class LocationManager(models.Manager):
  def near(self, latitude=None, longitude=None, distance=None):
    """Return the Locations within `distance` miles of the given point."""
    # compare against None so that 0.0 (equator or prime meridian) is still a valid coordinate
    if latitude is None or longitude is None or not distance:
      return []

    queryset = super(LocationManager, self).get_query_set()

    # prune down the set of all locations to something we can quickly check precisely:
    # miles -> nautical miles -> arcminutes -> degrees, doubled for a safety margin
    rough_distance = geopy.distance.arc_degrees(arcminutes=geopy.distance.nm(miles=distance)) * 2
    queryset = queryset.filter(
          latitude__range=(latitude - rough_distance, latitude + rough_distance),
          longitude__range=(longitude - rough_distance, longitude + rough_distance)
          )

    # check each remaining candidate precisely
    locations = []
    for location in queryset:
      if location.latitude is not None and location.longitude is not None:
        exact_distance = geopy.distance.distance(
                  (latitude, longitude),
                  (location.latitude, location.longitude)
                  )
        exact_distance.calculate()
        if exact_distance.miles <= distance:
          locations.append(location)

    # return a queryset rather than a list so callers can keep filtering
    queryset = queryset.filter(id__in=[l.id for l in locations])
    return queryset

class Location(models.Model):
  objects = LocationManager()
  name = models.CharField(max_length=100)
  address = models.TextField()
  city = models.CharField(max_length=100)
  state = models.USStateField(choices=STATE_CHOICES)
  zipcode = models.CharField(max_length=10)
  latitude = models.FloatField(blank=True, null=True)
  longitude = models.FloatField(blank=True, null=True)

  def save(self, *args, **kwargs):
    # geocode only when coordinates are missing; compare against None so 0.0 counts as set
    if self.latitude is None or self.longitude is None:
      geocoder = geocoders.GeocoderDotUS()

      geocoding_results = None

      if self.address:
        # try the full address
        query = '%(address)s, %(city)s, %(state)s %(zipcode)s' % self.__dict__
        geocoding_results = list(geocoder.geocode(query, exactly_one=False))

      # then just city/state/zip
      if not geocoding_results and self.city:
        query = '%(city)s, %(state)s %(zipcode)s' % self.__dict__
        geocoding_results = list(geocoder.geocode(query, exactly_one=False))

      # and finally just zip
      if not geocoding_results and self.zipcode:
        query = '%(zipcode)s' % self.__dict__
        geocoding_results = list(geocoder.geocode(query, exactly_one=False))

      if geocoding_results:
        # use the first (best) match
        place, (latitude, longitude) = geocoding_results[0]
        self.latitude = latitude
        self.longitude = longitude
    # pass through any arguments so callers can use save() as usual
    super(Location, self).save(*args, **kwargs)

Let's work backwards. The fields of the Location class are straightforward (if you're in the U.S., of course; otherwise replace state and zipcode). The geocoding is also pretty obvious: use the most precise address information available to obtain the Location's coordinates. It's worth experimenting with the different geocoders geopy supports to see which works best for you.
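To make the fallback order easier to see (and to test without hitting a geocoding service), here is a rough sketch of the same idea pulled out of the model. The names candidate_queries, first_match and fake_geocode are made up for this illustration and are not part of geopy or Django; fake_geocode is only a stand-in with invented coordinates, and a real geocoder callable would take its place.

```python
def candidate_queries(address, city, state, zipcode):
  """Build geocoder queries from most to least precise, skipping empty parts."""
  queries = []
  if address:
    queries.append('%s, %s, %s %s' % (address, city, state, zipcode))
  if city:
    queries.append('%s, %s %s' % (city, state, zipcode))
  if zipcode:
    queries.append('%s' % zipcode)
  return queries


def first_match(queries, geocode):
  """Return (query, results) for the first query that geocodes, else (None, [])."""
  for query in queries:
    results = list(geocode(query) or [])
    if results:
      return query, results
  return None, []


def fake_geocode(query):
  """Stand-in geocoder (hypothetical): it only 'knows' one ZIP code."""
  known = {'97201': [('Portland, OR 97201', (45.5, -122.69))]}
  return known.get(query, [])


queries = candidate_queries('123 Main St', 'Portland', 'OR', '97201')
query, results = first_match(queries, fake_geocode)
```

Here the full address and the city/state/zip queries both come back empty, so the lookup falls through to the bare ZIP code, just as the save method would.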

On to distance calculation. The LocationManager class has a utility method, near, which, given a point's latitude and longitude and a maximum distance, returns all the locations within that distance of the point. I'm using geopy's default method, the Vincenty distance. You could use one of geopy's other methods, like great circle distance. Other things you might want to change:

  • On the assumption that there will be a lot of Locations in the database, I'm doing some crude winnowing of Locations before performing the precise calculations; in my testing so far this provides acceptable performance with over a million locations. If that's not necessary for your dataset, it can be easily removed.
  • You can easily work with kilometers instead of miles, but again, this app is only going to be used in the U.S.
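If you want to see what the winnowing and the great circle alternative amount to without geopy, here is a minimal standard-library sketch. It is not geopy's implementation: the function names, the Earth radius constant and the sample points are assumptions made for illustration, and it uses the haversine formula rather than Vincenty.

```python
import math

EARTH_RADIUS_MILES = 3958.8        # mean Earth radius (assumed constant)
MILES_PER_NAUTICAL_MILE = 1.15078  # statute miles in one nautical mile


def rough_degrees(miles):
  """Miles -> degrees of latitude, using 1 nautical mile ~ 1 arcminute."""
  return miles / MILES_PER_NAUTICAL_MILE / 60.0


def great_circle_miles(lat1, lon1, lat2, lon2):
  """Haversine great circle distance between two points, in miles."""
  phi1, phi2 = math.radians(lat1), math.radians(lat2)
  dphi = phi2 - phi1
  dlambda = math.radians(lon2 - lon1)
  a = (math.sin(dphi / 2) ** 2 +
       math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
  return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def near(points, latitude, longitude, distance):
  """Return (name, miles) pairs within `distance` miles, closest first."""
  margin = rough_degrees(distance) * 2
  matches = []
  for name, lat, lon in points:
    # cheap bounding-box test first, like the latitude/longitude range filter
    if abs(lat - latitude) > margin or abs(lon - longitude) > margin:
      continue
    miles = great_circle_miles(latitude, longitude, lat, lon)
    if miles <= distance:
      matches.append((name, miles))
  return sorted(matches, key=lambda match: match[1])


# sample points with approximate coordinates, for illustration only
points = [
  ('Seattle', 47.61, -122.33),
  ('Vancouver', 45.64, -122.67),
  ('Portland', 45.52, -122.68),
]
nearby = near(points, 45.52, -122.68, 25)
```

One way to read the doubling of the rough distance: a degree of longitude covers fewer miles the further you get from the equator (it scales with the cosine of the latitude), so a box sized for latitude alone would be too narrow east to west. A factor of 2 keeps the box wide enough up to about 60 degrees of latitude.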

Now that we have geocoded locations and a way to determine which are within a given distance of a certain point, here's how you might use them. Say you have another Model called Activity, which has a ForeignKey link to Location. The user is looking for activities near his ZIP code, which you also have modeled with latitude and longitude. You could do something like this in your search view:

if zipcode:
  zc = ZipCode.objects.get(zipcode=zipcode)
  if distance:
    distance=int(distance)
    nearby_locations = Location.objects.near(zc.latitude, zc.longitude, distance)
    activities = Activity.objects.filter(location__in=nearby_locations)

It doesn't get much simpler than that, and it's perfectly adequate for rough distance estimates used to narrow down datasets by location.
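The view snippet assumes zipcode and distance are already well-formed: int(distance) raises ValueError on non-numeric input, and ZipCode.objects.get raises DoesNotExist for a ZIP code that isn't in the table. A small cleaning step in front of it might look something like this sketch. The function name clean_search_params and the 500-mile cap are assumptions for illustration, not part of the original app.

```python
import re

# 5-digit ZIP, optionally followed by a ZIP+4 suffix
ZIP_PATTERN = re.compile(r'^(\d{5})(?:-\d{4})?$')


def clean_search_params(zipcode, distance, max_miles=500):
  """Return (five_digit_zip, miles), or None if either value is unusable."""
  match = ZIP_PATTERN.match((zipcode or '').strip())
  if not match:
    return None
  try:
    miles = int(distance)
  except (TypeError, ValueError):
    return None
  # reject zero, negative and implausibly large search radii
  if not 0 < miles <= max_miles:
    return None
  return match.group(1), miles


params = clean_search_params(' 97201-1234 ', '25')
```

If this returns None, the view can skip the location filter entirely; otherwise the cleaned values can be passed on to the ZipCode lookup and Location.objects.near.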

One last thing. You can get free databases of ZIP codes with latitude and longitude at:

Comments (6)

Jj — 17 June 2008 0:51
I am looking for a locations solution as well for several Django projects we're working on.
We've had the issue with the locations hierarchy since we're not in the US, and this app has to run for different countries, so State -> Zip doesn't cut it.
We found Django Geonames which does a decent job modeling this situation.
But that doesn't fulfill 100% my requirement.
I am currently working on a pluggable app, like Django-tagging, that provides a Widget, a formfield, a ModelField, and location models to specify exact addresses or city references.

I suppose this is a quite used solution but I am surprised that I(google) can't find something that fits that requirement. If you heard of something, I'd love to hear.
john — 17 June 2008 16:19
I've not run across a comprehensive solution. What I found was that there are databases that are close (Geonames, MaxMind, the GEOnet Names Server database (http://earth-info.nga.mil/gns/html/namefiles.htm)), but no one source for all the place info you'd want -- particularly outside of the US.

And it's the same story on the application front. Django has the localflavor contrib helpers. Satchmo (http://satchmoproject.com) has its own model for place info down to first-level admin areas. There's Babel (http://babel.edgewall.com/) for translated place names, timezones, currencies and other localization. So far, django-geonames looks like the best general solution.

Sorry I can't be more help. If you do find something, or end up publishing what you're working on, please let me know.
Erik Ankrom — 31 August 2008 16:46
Thanks for the code, works like a charm for me with a couple of modifications!

I commented out

if not (self.latitude and self.longitude):

since this would only invoke the geocoding if there weren't existing values. If the user changes the address, city, state, or zip after first creation, the coordinates aren't updated. This didn't work for my use case. My model isn't updated that frequently, so geocoding on every save doesn't add a whole lot of overhead.

Also, I used the Google Maps API rather than geocoder.us because I was having problems with lookups that only contained city and state. Going through Google solved my issue, and I'm using Google Maps for output anyway.

I added a couple of pairs for trying as well, here are all I try in order:

address, city, state zip
address, city, state
city, state zip
city, state
zip

Thanks for sharing!
john — 31 August 2008 17:24
Thanks, Erik. Glad it was useful. For usage that doesn't meet Google's terms, geocoder.us has a beta API to geocode city and state:

http://geocoder.us/help/city_state_zip.shtml

It's not yet supported by geopy, but their CSV interface is super easy. Works OK for me so far, which is good, because I have run into a few addresses that Google can do and geocoder.us can't, and this at least gets me close.

John Boxall — 18 October 2008 22:34
Hey John, just wanted to say thanks for the snippet!

If anyone else is looking to have the retrieved locations sorted it's an easy addition:

close_locations = sorted(locations, lambda x,y: cmp(x.miles, y.miles))[:10]


queryset = queryset.filter(id__in=[l.id for l in close_locations])
return queryset
Nick — 26 August 2009 1:13
I have run into a few small problems.

I received a KeyError u'city'. I have city mapped to a ForeignKey in another table.

Also, it will only map the points if I take out city and state from the full address search. State is the USStateField and city, as stated above, is a ForeignKey.
