This post was made possible by Kevin L., who reminded me of the awesomeness of Radio Lab during last Tuesday’s midday bicycle spin. For those not familiar, Radio Lab is an hour-long science show. ‘Science show’ doesn’t really do it justice, though; the shows are highly accessible, interesting, and always well done.

While on the Radio Lab site listening to the show on limits of the human body (appropriate ride topic), another show caught my eye, entitled *Numbers*. A Radio Lab on numbers? I was sure that there’d be some gems; it did not disappoint. There was one topic in particular that captured my attention, *Benford’s Law*.

I had never heard of Benford’s Law prior to the show. In short, it states that in many data sets the first (non-zero) digits of the numbers follow a logarithmic distribution: a leading digit d turns up with probability log10(1 + 1/d). 1 occurs the most frequently, roughly 30% of the time, 2 less frequently, approximately 17.6% of the time, down to 9, with a frequency of around 4.6%. The claim in the show was that this applies to all kinds of financial transactions and is used as a tool by forensic accountants to find suspicious dealings. (Applications of Benford’s Law are admissible as evidence in court.) Additionally, because of the logarithmic nature of the distribution, it has the property of *scale invariance*; that is, if the data set is multiplied by a constant, the distribution doesn’t change. Multiplying by a constant is often the exact method of conversion between units of measure. Therefore, the law will apply regardless of the unit of measure.
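For the curious, the predicted percentages can be reproduced in a few lines of Python. This is only a sketch; the function name `benford_probabilities` is made up for illustration.

```python
import math


def benford_probabilities():
    """Expected share of each leading digit 1..9 under Benford's Law."""
    # P(d) = log10(1 + 1/d); the nine values sum to log10(10) = 1.
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


# Print the predicted share for each leading digit as a percentage.
for digit, share in benford_probabilities().items():
    print(f"{digit}: {share:.1%}")
```

Run as is, it prints 30.1% for a leading 1, 17.6% for a leading 2, and so on down to 4.6% for a leading 9.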

Skeptical that this actually works, I did a bit of probing. The first data set was all of my American Express credit card charges; I have a handy record of the roughly 1700 charges made since November 28, 2006. Kendall suggested turning this into a test of the scale invariance as well. Here are the results compared to the prediction made by Benford’s Law, in U.S. Dollars and also converted to Chinese Yuan:

Once my mind had regrouped after exploding, I looked at a few other data sets. The next data set was the populations of incorporated areas in the United States, available from the Census Bureau (here):

A slightly more outlandish data set consists of the frequencies of words in Charles Dickens’s *A Tale of Two Cities* (available here):

Not exactly spot on, but the general tendency seems to be there.
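The comparisons in this post all come down to the same tally: take the first non-zero digit of every number, count how often each digit shows up, and set the shares next to Benford’s prediction. Below is a rough Python sketch of that tally, not the code behind the plots. The data are synthetic (random values spread evenly on a log scale over six orders of magnitude, seeded for repeatability), the factor 6.8 is an arbitrary stand-in for a unit conversion rather than a real exchange rate, and all helper names are hypothetical.

```python
import math
import random
from collections import Counter


def benford_probabilities():
    """Expected share of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Return the first non-zero digit of a non-zero number."""
    # repr() gives a short exact decimal form, e.g. '0.0042' or '1e-05',
    # so the first character in 1..9 is the leading digit.
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("no leading non-zero digit")


def digit_shares(values):
    """Observed share of each leading digit 1..9, ignoring zeros."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


# Synthetic stand-in data: values spread evenly on a log scale (assumption).
random.seed(0)
amounts = [10 ** random.uniform(0, 6) for _ in range(20000)]

# Arbitrary illustrative conversion factor, not a real exchange rate.
FACTOR = 6.8
converted = [a * FACTOR for a in amounts]

expected = benford_probabilities()
original_shares = digit_shares(amounts)
scaled_shares = digit_shares(converted)

print("digit  benford  original  scaled")
for d in range(1, 10):
    print(f"{d:>5}  {expected[d]:7.1%}  {original_shares[d]:8.1%}  {scaled_shares[d]:6.1%}")
```

Swapping `amounts` for a real column of numbers (charges, populations, word counts) is the only change needed to try it on actual data. If the law holds, the original and scaled columns should both track the Benford column.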

The next data set is the land areas of all of the countries in the world, in square kilometers and square miles (data courtesy of Wikipedia):

Even the numbers in the church bulletin seemed like they wanted to follow Benford’s Law:

I’m sure the 909 area code is to blame!

I’m going to wrap this up before the anthropomorphizing of numbers gets out of control (they don’t like it after all). Benford’s Law: fact is stranger than fiction?

P.S. The first segment in the *Numbers* Radio Lab show reported evidence that we aren’t born thinking about numbers in a successive fashion (i.e. 1, 2, 3, 4, …); rather, our innate understanding is more or less logarithmic (i.e. 1, 2, 4, 8, 16, …). I find it interesting that real-world data is also often logarithmic.

## 4 Comments

Nice work on the plots and variety of data sets. USA Today will run these results shortly, I’m sure of it!

I am excited that you posted a post I can read and not wonder what the heck you’re talking about. I thought the use of the number sorting as a forensic tool was pretty awesome. I look forward to many more street level instead of cloud level posts!

My gut reaction was that both math and computer science live on the same cloud. But then thinking about the number of people that have taken high school math or higher v. the number that are C.S. nerds revealed substantial cognitive bias in my gut. 🙂

On a tangent, the mathematician in the last segment of the Numbers show also happens to have authored a recent series of online articles for the New York Times. (My dad has been forwarding me links.) They’re a good read. The series starts with From Fish to Infinity: http://opinionator.blogs.nytimes.com/author/steven-strogatz/page/2/

Great post Jason. I’m a big Radio Lab fan. The Numbers show was good but my favorite was their August 2008 show The (Multi) Universe(s). If you are a recent fan it’s definitely worth checking out.

http://blogs.wnyc.org/radiolab/2008/08/12/the-multi-universes/

I can’t remember ever listening to a bad Radio Lab.