Work the Shell - Calculating the Distance between Two Latitude/Longitude Points
Last month, I closed this column with a script that can return latitude/longitude values for two addresses, with the intent ultimately being to have the script calculate the distance between those two points. As an example:
$ farapart.sh "union station, denver co" \ "union station, chicago il" Calculating lat/long for union station, denver co = 39.75288, -105.000473 And calculating lat/long for union station, chicago il = 41.878658, -87.640404
The formula to calculate distance actually is pretty complicated. Here's a JavaScript implementation of the math I showed last month:
var R = 6371; // kilometers var dLat = (lat2-lat1); var dLon = (lon2-lon1); var a = Math.sin(dLat/2) * Math.sin(dLat/2) + Math.cos(lat1.toRad()) * Math.cos(lat2.toRad()) * Math.sin(dLon/2) * Math.sin(dLon/2); var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a)); var d = R * c;
This is going to be a wee bit tricky to convert into a shell script, needless to say, but because we have to use a more sophisticated math tool than the built-in capabilities of Bash anyway, this also means we have a number of options to work with, including Perl, awk and bc. For that matter, we also can just write a quick C program that solves this equation given four variables, but really, why make it easy when I can make it complex? If I wanted easy, I would whip out some Perl, right? Last month, I promised some bc, so let's see if we can make that rusty old app do the heavy lifting.
The first step mathematically is to convert the lat/lon values we get from the mapping system from degrees to radians. This turns out to be straightforward:
Radians = degrees * ( pi / 180 )
Pi, of course, is 3.1415926535897932384.
Given values like:
41.878658, -87.640404
The radians equivalent of those is then:
0.7309204767, -1.529613605
To warm up with bc, here's a simple command-line way to calculate one of these values:
echo "scale=8; -87.640404 * ( 3.14159265 / 180)" | bc
That's all well and good, but it turns out that the different equations I explored for calculating the distance between two points requires the atan2() function, which isn't part of bc.
Rather than beat my head against the old-school wall until the bits are bloodied, I'm going to throw in the towel and admit that this might just be a bit too complex a mathematical problem for a shell script and bc.
Having spent way more hours than I want to admit trying to get this to work properly in bc, I'm going to “cry uncle” and switch temporarily into a different programming language. I'm going to jump into C for a few lines and whip out a simple program that, given two lat/lon pairs in degrees, calculates the distance between them in miles (Listing 1).
Listing 1. C Distance Program
#include <stdio.h> #include <math.h> #include <stdlib.h> #define EARTH_RADIUS (6371.0072 * 0.6214) #define TORADS(degrees) (degrees * (M_PI / 180)) main(int argc, char **argv) { double lat1, long1, lat2, long2; double dLat, dLong, a, c, d; lat1 = TORADS(atof(argv[1])); long1 = TORADS(atof(argv[2])); lat2 = TORADS(atof(argv[3])); long2 = TORADS(atof(argv[4])); dLat = lat2 - lat1; dLong = long2 - long1; a = sin(dLat/2) * sin(dLat/2) + cos(lat1) * cos(lat2) * sin(dLong/2) * sin(dLong/2); c = 2 * atan2(sqrt(a), sqrt(1-a)); printf("%g\n", EARTH_RADIUS * c); }
Does it work? Let's find out:
$ distance 39.75288 -105.000473 41.878658 -87.640404 917.984
That seems reasonable. The great circle distance between those two points is 917 miles. Google Maps, of course, shows about 10% greater distance, but perhaps that's because there is no direct-as-the-crow-flies route via roads?
Of course, there also are errors with this formula too, because Earth isn't a perfect sphere but rather an oblate spheroid that has a different diameter depending on where you're measuring. But for our purposes, let's just gloss over that problem. You can Google it to learn about things like the Vincenty formula, but that's beyond the scope of this ridiculous sidetrack.
Now we have all the pieces we need: location to lat/lon and distance between two lat/lon points. Let's put it all together.
To get everything to work well, I actually hacked and slashed at the original script to make it a bit more succinct and, of course, invoke the C “distance” program as shown in Listing 1. [Listing 1 also is available on our FTP site at ftp.linuxjournal.com/pub/lj/listings/issue188/10606.tgz.] Ready? It's surprisingly short:
#!/bin/sh converter="https://api.maps.yahoo.com/ajax/ ↪geocode?appid=onestep&qt=1&id=m&qs=" tmpfile="/tmp/bc.script.$$" # Get lat/long for point 1 addr="$(echo $1 | sed 's/ /+/g')" values="$(curl -s $converter$addr | \ cut -d\" -f13,15 | \ sed 's/[^0-9\.\,\-]//g;s/,$//')" lat1=$(echo $values | cut -d, -f1) long1=$(echo $values | cut -d, -f2) # Get lat/long for point 2 addr="$(echo $2 | sed 's/ /+/g')" values="$(curl -s $converter$addr | \ cut -d\" -f13,15 | \ sed 's/[^0-9\.\,\-]//g;s/,$//')" lat2=$(echo $values | cut -d, -f1) long2=$(echo $values | cut -d, -f2) # Now we have the lat/long for both points, let's # figure out the distance between them... dist=$(./distance $lat1 $long1 $lat2 $long2) echo "$1 to $2 is $dist miles" exit 0
The script would be even shorter if we tweaked the C program to accept x,y location pairs, but I'll leave that one to you. Instead, let's do a few tests:
$ farapart.sh \ "union station, denver, co" \ "union station, chicago, il" union station, denver, co to union station, chicago, il is 917.984 miles
Now, how about something a bit more ambiguous:
$ farapart.sh "long beach, ca" "boston, ma" long beach, ca to boston, ma is 2597.53 miles
Well, darn it, that seems way too short. Let's see what Yahoo Maps reports as the distance between those two cities. Sure enough, it reports that the trip should be 3,015 miles, not 2,597 miles.
Somewhere there's an error that's giving us poor results. My guess is there's some sort of significant rounding error going on in the C program (because we can verify experimentally that the lat/lon information we're getting is valid, simply by plugging it in to a mapping app and seeing where it places us).
I'm all tapped out on this example, however. It turned out to be far more tricky than I anticipated, and I leave it as an exercise to you, dear reader, to see if you can figure out what's broken in the C program and report your fix to us. We'll publish the best of them next month! Meanwhile, next column, I'll get back to something that's more about the shell and less about mathematics. I mean, heck, I didn't like math when I was working on my computer science degree, so why am I playing with it now?
Dave Taylor has been involved with UNIX since he first logged in to the on-line network in 1980. That means that, yes, he's coming up to the 30-year mark now. You can find him just about everywhere on-line, but start here: www.DaveTaylorOnline.com. In addition to all his other projects, Dave is now a film critic. You can read his reviews at www.DaveOnFilm.com.