writing | ben fry

Writing

Our Gattaca future begins with our sports heroes

The New York Times this morning documents Major League Baseball’s use of DNA tests to verify the age of baseball prospects:

Dozens of Latin American prospects in recent years have been caught purporting to be younger than they actually were as a way to make themselves more enticing to major league teams. Last week the Yankees voided the signing of an amateur from the Dominican Republic after a DNA test conducted by Major League Baseball’s department of investigations showed that the player had misrepresented his identity.

Some players have also had bone scans to be used in determining age range.

(Why does a “bone scan” sound so painful? “You won’t provide a DNA sample? Well, maybe you’ll change your mind after the bone scan!”)

Kathy Hudson of Johns Hopkins notes the problem with testing:

“The point of [the Genetic Information Nondiscrimination Act, passed last year] was to remove the temptation and prohibit employers from asking or receiving genetic information.”

The article goes on to note that such tests are also used to determine whether a player’s parents are his real parents, which can have an upsetting outcome.

But perhaps the broader concern (outside broken homes) and the scarier motivation for expansion of such testing is noted by a scouting director (not named), who comments:

“Can they test susceptibility to cancer? I don’t know if they’re doing any of that. But I know they’re looking into trying to figure out susceptibility to injuries, things like that. If they come up with a test that shows someone’s connective tissue is at a high risk of not holding up, can that be used? I don’t know. I do think that’s where this is headed.”

Injury is perhaps the most significant, yet most random, factor in scouting. If we’re talking about paying someone $27 million, will the threat of a federal discrimination law (wielded by a young player and agent) really be enough to keep teams away from this?

Wednesday, July 22, 2009 | genetics, sports

Book

Visualizing Data is my 2007 book about computational information design. It covers the path from raw data to how we understand it, detailing how to begin with a set of numbers and produce images or software that lets you view and interact with information. When first published, it was one of the only books for people who wanted to learn how to actually build a data visualization in code.

The text was published by O’Reilly in December 2007 and can be found at Amazon and elsewhere. Amazon also has an edition for the Kindle, for people who aren’t into the dead tree thing. (Proceeds from Amazon links found on this page are used to pay my web hosting bill.)

Examples for the book can be found here.

The book covers ideas found in my Ph.D. dissertation, which is the basis for Chapter 1. The next chapter is an extremely brief introduction to Processing, which is used for the examples. Next (Chapter 3) is a simple mapping project to place data points on a map of the United States. Of course, the idea is not that lots of people want to visualize data for each of 50 states. Instead, it’s a jumping-off point for learning how to lay out data spatially.

The chapters that follow cover six more projects, including salary vs. performance (Chapter 5) and zipdecode (Chapter 6), followed by more advanced topics dealing with trees, treemaps, hierarchies, and recursion (Chapter 7), plus graphs and networks (Chapter 8).

This site is used for follow-up code and writing about related topics.
