Writing

Obama Limited to 16 Bits

I guess I never thought I’d read about the 16-bit limitations of Microsoft Excel in mainstream press (or at least outside the geek press), but here it is:

Obama’s January fundraising report, detailing the $23 million he raised and $41 million he spent in the last three months of 2007, far exceeded 65,536 rows listing contributions, refunds, expenditures, debts, reimbursements and other details.

Excel has since its inception been limited to 65,536 rows, the maximum number you get when you represent the row number using two bytes. Mr. Millionsfromsmallcontributions has apparently flown past this limit in his FEC reports, forcing poor reporters to either use Microsoft Access (a database program) or pray for the just-released Excel 2007, where in fact the row restriction has been lifted.

In the past the argument against fixing the restriction had always been a mixture of “it’s too messy to upgrade something like that” and “you shouldn’t have that many rows of data in a spreadsheet anyway, you should use a database.” Personally I disagree with the latter; and as silly as the former sounds, it’s been the case for a good 20 years (or was the row limit even lower back then?)

The OpenOffice project, for instance, has an entire page dedicated to fixing the issue in OpenOffice Calc, where they’re limited to 30,000 rows—the limit being tied to 32,768, or the number you get with 15 bits instead of 16 (use the sixteenth bit as the sign bit indicating positive or negative, and you can represent numbers from -32768 to 32767 instead of unsigned 16 bit values that range from 0 to 65535).

Bottoms up for the first post tagged both “parse” and “politics”.

Thursday, June 5, 2008 | parse, politics  
Book

Visualizing Data Book CoverVisualizing Data is my 2007 book about computational information design. It covers the path from raw data to how we understand it, detailing how to begin with a set of numbers and produce images or software that lets you view and interact with information. When first published, it was the only book(s) for people who wanted to learn how to actually build a data visualization in code.

The text was published by O’Reilly in December 2007 and can be found at Amazon and elsewhere. Amazon also has an edition for the Kindle, for people who aren’t into the dead tree thing. (Proceeds from Amazon links found on this page are used to pay my web hosting bill.)

Examples for the book can be found here.

The book covers ideas found in my Ph.D. dissertation, which is the basis for Chapter 1. The next chapter is an extremely brief introduction to Processing, which is used for the examples. Next is (chapter 3) is a simple mapping project to place data points on a map of the United States. Of course, the idea is not that lots of people want to visualize data for each of 50 states. Instead, it’s a jumping off point for learning how to lay out data spatially.

The chapters that follow cover six more projects, such as salary vs. performance (Chapter 5), zipdecode (Chapter 6), followed by more advanced topics dealing with trees, treemaps, hierarchies, and recursion (Chapter 7), plus graphs and networks (Chapter 8).

This site is used for follow-up code and writing about related topics.