Got an email from Mebane Faber, who noted the roughly inverse correlation you currently see in salaryper and asked whether I’d done a proper year-end analysis. The response follows:
I threw the project together as sort of a fun thing out of curiosity, and haven’t taken the time to do a proper analysis. However, you can see in the previous years that the inverse relationship happens each year at the beginning of the season, and then as it progresses, the big market teams tend to mow down the small guys. Or at least the successful ones do; the correlation between salary and performance at the end of a season is generally pretty haphazard. In fact, it’s possible that the inverse correlation at the beginning of the season is actually stronger than the positive correlation at the end.
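For anyone who wants to check the early-versus-late claim on real numbers, a minimal sketch of the comparison might look like the following. It uses a plain Pearson correlation, and the payroll and win-percentage figures are made up purely for illustration; they are not taken from salaryper or from any actual season.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: five teams, payroll in millions of dollars.
payroll = [200, 140, 110, 80, 50]
# Hypothetical win percentages early and late in a season.
early_win_pct = [0.40, 0.45, 0.55, 0.50, 0.60]
late_win_pct = [0.58, 0.52, 0.49, 0.51, 0.45]

early_r = pearson(payroll, early_win_pct)  # negative: big payrolls start slow
late_r = pearson(payroll, late_win_pct)    # positive: big payrolls catch up

# Compare magnitudes to see which relationship is stronger.
inverse_is_stronger = abs(early_r) > abs(late_r)
```

Running this once per week of a season would give a curve of the correlation over time, which is closer to the proper analysis the email asked about.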
I think the last point is kinda funny, though I’d imagine there’s a less funny statistics term for that phenomenon. Such a fine line between funny and sounding important.
Some thoughtful comments passed along by Alex Hutton regarding the last post:
Part of the problem with point technology solutions is in the policies of implementation. IMHO, we undervalue the subject matter expert, or operate as a denigrated bureaucracy which does not allow the subject matter expert the flexibility to make decisions. When that happens, the decision is left to technology (and as you point out, no technology is a perfect decision maker).
I thought it was apropos that you brought in the Schneier example. I’ve been very much involved in a parallel thought process in the same industry as he, and we (my partner and I) are coming to a solution that attempts to balance technology, point human decision, and the bureaucracy within which they operate.
If you believe the Bayesians, then the right Bayesian network mimics the way the brain processes qualitative information to create a belief (or in the terms of Bayesians, a probability statement used to make a decision). As such, the current way we use the technology (that policy of implementation, above) is faulty because it minimizes that “Human Computational Engine” for a relatively unsophisticated, unthinking technology. That’s not to say that technologies like facial recognition are worthless - computational engines, even less magic ones that aren’t 99.99% accurate, are valid pieces of prior information (data).
Now in the same way, Human Computational Engines are also less than perfectly accurate. In fact, they are not at all guaranteed to work the same way twice - even for the same person, unless that person is using a framework to provide rigor, rationality, and consistency in analysis.
So ideally, in physical security (or information security where Schneier and I come from) the imperfect computer detection engine is combined with a good Bayesian network and well trained/educated/experienced subject matter experts to create a more accurate probability statement around terrorist/non-terrorist - one that at least is better at identifying cases where more information is needed before a person is prevented from flying, searched and detained. While this method, too, would not be 100% infallible (no solution will ever be), it would create a more accurate means of detection by utilizing the best of the human computational engine.
I believe the Bayesians, just 99.99% of the time.
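To put rough numbers on the idea of treating an imperfect detector as one piece of prior information, here is a small sketch of a Bayesian update. Every figure in it is an assumption chosen for illustration: the one-in-a-million base rate, the 99.99% detector, and the expert's hit and false-alarm rates are all hypothetical, and the two sources of evidence are assumed to be independent.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: probability the hypothesis holds after seeing the evidence."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator

# Hypothetical base rate: one traveler in a million is a real threat.
prior = 1e-6

# Hypothetical detector: flags 99.99% of threats, and 0.01% of everyone else.
after_machine = posterior(prior, 0.9999, 0.0001)

# Hypothetical expert review of a flagged traveler, assumed independent of
# the detector: raises concern for 90% of threats and 5% of non-threats.
after_expert = posterior(after_machine, 0.90, 0.05)
```

With these assumed inputs, a flag from the machine alone still leaves the probability at about one percent, which is why the flag works better as an input to further judgment than as a decision. The expert's review moves the estimate further, but not to certainty.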
A great Unicode in 5 Minutes presentation from Mark Lentczner at Linden Lab. He passed it along after reading this dense post, clearly concerned about the welfare of my readers.
(Searching out the image for the title of this post also led me to a collection of Favourite Unicode Codepoints. This seems ripe for someone to waste more time really tracking down such things and documenting them.)
Mark’s also behind Context Free, one of the “related initiatives” that we have listed on Processing.org.
Context Free is a program that generates images from written instructions called a grammar. The program follows the instructions in a few seconds to create images that can contain millions of shapes.
Grammars are covered briefly in the Parse chapter of vida, with the name of the language coming from a specific variety called Context Free Grammars. The magical (and manic) part of grammars is that their rules tend to be recursive and layered, which leads to a certain kind of insanity as you try to tease out how the rules work. With Context Free, Mark has instead turned this dizziness into the basis for creating visual form.
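A tiny sketch of what recursive rules look like in practice, written in Python rather than in Context Free's own syntax. The rule names and shapes here are invented for illustration, and the depth limit is an assumption added so the recursion is guaranteed to stop.

```python
import random

# Hypothetical grammar: each symbol maps to a list of possible productions.
# Symbols that do not appear as keys ("line", "leaf") are terminal shapes.
RULES = {
    "TREE": [["BRANCH"], ["BRANCH", "TREE", "TREE"]],
    "BRANCH": [["line"], ["line", "leaf"]],
}

def expand(symbol, rng, depth=0, max_depth=6):
    """Recursively expand a symbol into a flat list of terminal shapes."""
    if symbol not in RULES:
        return [symbol]
    options = RULES[symbol]
    # Past the depth limit, always take the first option, which never
    # refers back to the symbol being expanded, so expansion terminates.
    production = options[0] if depth >= max_depth else rng.choice(options)
    shapes = []
    for part in production:
        shapes.extend(expand(part, rng, depth + 1, max_depth))
    return shapes

shapes = expand("TREE", random.Random(1))
```

The second TREE rule refers to TREE twice, so each level of recursion can double the number of branches. That doubling is how a few lines of rules can turn into a very large number of shapes.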
Updated 14 May 08 to fix the glyph. Thanks to Paul Oppenheim, Spidery Ha Devotee, for the correction.
Got an email over the weekend from Tom Vanderbilt, who had seen the All Streets piece, and was kind enough to point me to this map (PDF) from the USGS that depicts the average distance to the nearest road across the continental 48 states. (He’s currently working on a book titled Traffic: Why We Drive the Way We Do (and What It Says About Us) to be released this fall.)
Too bad I only just learned the word conterminous. Had I used it in the original project description, we would have missed (or been spared) the Metafilter discussion of whether “lower 48” was accurate terminology.

A really interesting map, which of course also shows the difference between something thrown together in a few hours and actual research. In digging around for the map’s source, I found that exactly a year ago, they also published a paper in Science describing their broader work:
Roads encroaching into undeveloped areas generally degrade ecological and watershed conditions and simultaneously provide access to natural resources, land parcels for development, and recreation. A metric of roadless space is needed for monitoring the balance between these ecological costs and societal benefits. We introduce a metric, roadless volume (RV), which is derived from the calculated distance to the nearest road. RV is useful and integrable over scales ranging from local to national. The 2.1 million cubic kilometers of RV in the conterminous United States are distributed with extreme inhomogeneity among its counties.
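As a rough sketch of how a volume can be derived from distance to the nearest road: treat the distance as a height above each grid cell and add up height times cell area. That reading of the metric is an assumption on my part, not the paper's published method, and the grid, cell size, and function names below are invented for illustration.

```python
from math import hypot

def distance_to_nearest_road(grid, cell_size_km):
    """Distance in km from each cell to the closest road cell (marked 1).

    Brute force over all road cells; assumes the grid has at least one road.
    """
    roads = [(r, c) for r, row in enumerate(grid)
             for c, value in enumerate(row) if value == 1]
    return [[min(hypot(r - rr, c - rc) for rr, rc in roads) * cell_size_km
             for c in range(len(row))]
            for r, row in enumerate(grid)]

def roadless_volume(grid, cell_size_km):
    """Sum of (distance to nearest road) x (cell area), in cubic km."""
    distances = distance_to_nearest_road(grid, cell_size_km)
    cell_area = cell_size_km ** 2
    return sum(d * cell_area for row in distances for d in row)

# Hypothetical 3 x 3 km area with a single road running down the west edge.
grid = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
]
volume = roadless_volume(grid, 1.0)
```

Because the result is a sum over cells, it can be totaled per county or for the whole country from the same grid, which fits the abstract's point about the metric being integrable across scales.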
The publication even includes a response and a response to the response—high scientific drama! Apparently some lads feel that “roadless volume does not explicitly address ecological processes.” So let that be a warning to all you non-explicit addressers.
For those lucky to have access to the journal online, the supplementary information includes a time lapse video of a section of Colorado, and its roadless volume since 1937. As with all things, it’s much more interesting to see how this changes over time. A map of all streets in the lower 48 isn’t nearly as engaging as a sequence of the same area over several years. The latter story is simply far more compelling.