Writing

Watching the evolution of the “Origin of Species”

I’ve just posted a new piece that depicts changes between the multiple editions of Darwin’s “On the Origin of Species.”

To quote myself, because it looks important:

We often think of scientific ideas, such as Darwin’s theory of evolution, as fixed notions that are accepted as finished. In fact, Darwin’s On the Origin of Species evolved over the course of several editions he wrote, edited, and updated during his lifetime. The first English edition was approximately 150,000 words and the sixth is a much larger 190,000 words. In the changes are refinements and shifts in ideas — whether increasing the weight of a statement, adding details, or even a change in the idea itself.

The idea that we can actually see change over time in a person’s thinking is fascinating. Darwin scholars are of course familiar with this story, but here we can view it directly, both on a macro-level as it animates, or word-by-word as we examine pieces of the text more closely.
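For readers curious what a word-by-word comparison between two editions might look like in code, here is a minimal sketch. It is not the method used in the piece itself: it assumes a plain whitespace split and Python's standard `difflib`, the function name `word_changes` is hypothetical, and the two sample sentences are invented for illustration rather than taken from Darwin's text.

```python
import difflib

def word_changes(old_text, new_text):
    """Return (tag, old_words, new_words) for each changed span between two texts."""
    old_words = old_text.split()
    new_words = new_text.split()
    # autojunk=False keeps frequent words from being ignored on long texts.
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only insertions, deletions, and replacements
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes

# Invented sample sentences, not quotations from any edition.
first = "the theory is probably true in most cases"
later = "the theory is true in nearly all cases"
for change in word_changes(first, later):
    print(change)
```

Each tuple names the kind of change and the words involved, which is roughly the information a word-level view needs before any drawing happens.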

This is hopefully the first of multiple pieces working with this data. Having worked with it since last December, I’ve been developing a larger application that deals with the information in a more sophisticated way, but that’s continually set aside because of other obligations. This simpler piece was developed for Emily King’s “Quick Quick Slow” exhibition opening next week at Experimenta Design in Portugal. As is often the case, many months are spent trying to create something monolithic, then in a very short time, an offshoot of all that work is developed that makes use of that infrastructure.

Oddly enough, I first became interested in this because of a discussion with a friend a few years ago, who had begun to wonder whether Darwin had stolen most of his better ideas from Alfred Russel Wallace, but gained the notoriety and credit because of his social status. (This appealed to the paranoid creator in me.) She described the first edition of Darwin’s text as incoherent, and said that it gradually improved over time. Interestingly (and happily, I suppose), the process of working on this piece has instead shown the opposite, and I have far greater appreciation for Darwin’s ideas than I had in the past.

Friday, September 4, 2009 | science, text, time  

Thesaurus Plus Context

BBC News brings word (via) that after a 44-year effort, the Historical Thesaurus of the Oxford English Dictionary will see the light of day. Rather than simple links between words, the beastly volume covers the history of the words within. For instance, the etymological timeline of the word “trousers” follows:

trousers: breeks 1552- · strosser 1598-1637 · strouse 1600-1620 · brogues 1615- a 1845 · trouses 1679-1820 · trousers 1681- · trouser 1702- ( rare ) · inexpressibles 1790- ( colloq. ) · indescribables 1794-1837 ( humorous slang ) · etceteras 1794-1843 ( euphem. ) · kickseys/kicksies 1812-1851 ( slang ) · pair of trousers 1814- · ineffables 1823-1867 ( colloq. ) · unmentionables 1823- · pantaloons 1825- · indispensables a 1828- ( colloq. euphem. ) · unimaginables 1833 · innominables 1834/43 ( humorous euphem. ) · inexplicables 1836/7 · unwhisperables 1837-1863 ( slang ) · result 1839 · sit-down-upons 1840-1844 ( colloq. ) · pants 1840- · sit-upons 1841-1857 ( colloq. ) · unutterables 1843; 1860 ( slang Dict. ) · trews 1847- · sine qua nons 1850 · never-mention-ems 1856 · round-me-houses 1857 ( slang ) · round-the-houses 1858- ( slang ) · unprintables 1860 · stove-pipes 1863 · terminations 1863 · reach-me-downs 1877- · sit-in-’ems/sitinems 1886- ( slang ) · trousies 1886- · strides 1889- ( slang ) · rounds 1893 ( slang ) · rammies 1919- ( Austral. & S. Afr. slang ) · longs 1928- ( colloq. )

Followed by a proper explanation:

breeks: The earliest reference from 1552 marks the change in fashion from breeches, a garment tied below the knee and worn with tights. Still used in Scotland, it derives from the Old English “breeches”.

trouser: The singular form of “trousers” comes from the Gallic word “trews”, a close-fitting tartan garment formerly worn by Scottish and Irish highlanders and to this day by a Scottish regiment. The word “trouses” probably has the same derivation.

unimaginables: This 19th Century word, and others like “unwhisperables” and “never-mention-ems”, reflect Victorian prudery. Back then, even trousers were considered risque, which is why there were so many synonyms. People didn’t want to confront the brutal idea, so found jocular alternatives. In the same way the word death is avoided with phrases like “pass away” and “pushing up daisies”.

stove-pipes: A 19th Century reference hijacked in the 1950s by the Teddy Boys along with drainpipes. The tight trousers became synonymous with youthful rebellion, a statement of difference from the standard post-war suits.

rammies: This abbreviation of Victorian cockney rhyming slang “round-me-houses” travelled with British settlers to Australia and South Africa.

Are you seeing pictures and timelines yet? Then this continues for 600,000 more words. Mmmm!

And Ms. Christian Kay, one of the four editors, is my new hero:

An English language professor, Ms Kay, one of four co-editors of the publication, began work on it in the late 1960s – while she was in her 20s.

It’s hard to fathom being in your 60s, and completing the book that you started in your 20s, though it’s difficult to argue with the academic and societal contribution of the work. Her web page also lists “the use of computers in teaching and research” as one of her interest areas, which sounds like a bit of an understatement. I’d be interested in computers too if my research interest was the history of 600,000 words and their 800,000 meanings across 236,000 categories.

Sadly, this book of life is not cheap, currently listed at Amazon for $316 (but that’s $79 off the cover price!). Though with a wife who covets the full 20-volume Oxford English Dictionary (she already owns the smaller, 35 lb. version), I may someday get my wish.

Wednesday, August 5, 2009 | text  

History of Predictive Text Swearing

Wonderful commentary on being nannied by your mobile, and head-in-the-sand text prediction algorithms.

There’s lots more to be said about predictive text, but in the meantime, this also brings to mind Jonathan Harris’ QueryCount, which I found to be a more interesting followup to his WordCount project. (WordCount tells us something we already know, but QueryCount lets us see something we suspect.)

Monday, August 18, 2008 | text  
Book

Visualizing Data is my 2007 book about computational information design. It covers the path from raw data to how we understand it, detailing how to begin with a set of numbers and produce images or software that lets you view and interact with information. When first published, it was the only book for people who wanted to learn how to actually build a data visualization in code.

The text was published by O’Reilly in December 2007 and can be found at Amazon and elsewhere. Amazon also has an edition for the Kindle, for people who aren’t into the dead tree thing. (Proceeds from Amazon links found on this page are used to pay my web hosting bill.)

Examples for the book can be found here.

The book covers ideas found in my Ph.D. dissertation, which is the basis for Chapter 1. The next chapter is an extremely brief introduction to Processing, which is used for the examples. Next (Chapter 3) is a simple mapping project to place data points on a map of the United States. Of course, the idea is not that lots of people want to visualize data for each of 50 states. Instead, it’s a jumping-off point for learning how to lay out data spatially.
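As a rough illustration of what laying out data spatially can involve, here is a minimal sketch in Python rather than Processing. It is not taken from the book: the function names `remap` and `to_screen`, the bounding box, and the canvas size are all assumptions made for the example, and the simple linear layout ignores map projection entirely.

```python
def remap(value, in_min, in_max, out_min, out_max):
    """Linearly rescale value from [in_min, in_max] to [out_min, out_max]."""
    return out_min + (out_max - out_min) * (value - in_min) / (in_max - in_min)

def to_screen(lon, lat, bounds, width, height):
    """Convert a longitude/latitude pair to pixel coordinates."""
    min_lon, max_lon, min_lat, max_lat = bounds
    x = remap(lon, min_lon, max_lon, 0, width)
    # Screen y grows downward, so the latitude range is flipped.
    y = remap(lat, min_lat, max_lat, height, 0)
    return x, y

# Hypothetical bounding box (min_lon, max_lon, min_lat, max_lat) and canvas size.
bounds = (-125.0, -65.0, 25.0, 50.0)
print(to_screen(-95.0, 37.5, bounds, 600, 250))
```

The same rescaling idea applies to any pair of data ranges, which is why a map of 50 states works as a starting point for other layouts.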

The chapters that follow cover six more projects, including salary vs. performance (Chapter 5) and zipdecode (Chapter 6), followed by more advanced topics dealing with trees, treemaps, hierarchies, and recursion (Chapter 7), plus graphs and networks (Chapter 8).

This site is used for follow-up code and writing about related topics.