ben fry
genome valence
The most recent version of Valence (shown above) visualizes biological data,
and was created for the
I also developed the
Jennifer Connelly makes use of Genome Valence in the Hulk movie.
With several 'genome' projects nearing states of completion, a primary
use of the data for biologists is to search for a sequence of letters
and see if it's found in the genome of another organism. If the
sequence is found, it is then possible, based on what's known about
the sequence as it's found in the other organism, to guess the
function of that sequence of letters.
This piece is a visual representation of the algorithm (called BLAST) most commonly used for genome searches. The genome of an organism is made up of thousands of genes (34,000 for the human, 20,000 for the mouse, and 14,000 for the fruitfly). A gene is made up of a sequence of As, Cs, Gs, and Ts, averaging 1,000 to 2,000 letters per gene. In order to handle this amount of information, the BLAST algorithm breaks each sequence of letters into nine-letter parts. Every unique nine-letter set is represented as a point on screen. The points are arranged outward from the center, with the most common sets on the outside and the less common towards the middle.
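The idea of breaking a sequence into nine-letter sets and placing the common ones farther out can be sketched in a few lines of Python. This is only an illustration, not the code behind the piece and not BLAST itself: the overlapping windows, the even angular spacing, the linear radius rule, and the names `kmers` and `radial_layout` are all assumptions made for the example.

```python
from collections import Counter
import math

K = 9  # word length described in the text


def kmers(sequence, k=K):
    """Return every overlapping k-letter window of the sequence, in order.

    Overlapping windows are an assumption; the text only says the
    sequence is broken into nine-letter parts.
    """
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def radial_layout(counts, max_radius=1.0):
    """Give each unique word a 2D point; more frequent words sit farther out.

    Both the linear radius rule and the even angular spacing are
    hypothetical choices for this sketch.
    """
    if not counts:
        return {}
    top = max(counts.values())
    ordered = sorted(counts)  # fixed order so the layout is repeatable
    layout = {}
    for index, word in enumerate(ordered):
        angle = 2 * math.pi * index / len(ordered)
        radius = max_radius * counts[word] / top
        layout[word] = (radius * math.cos(angle), radius * math.sin(angle))
    return layout


gene = "ACGTACGTACGTA"          # toy sequence, not real genomic data
counts = Counter(kmers(gene))   # how often each nine-letter set occurs
points = radial_layout(counts)  # one point per unique nine-letter set
```

With the toy sequence above, the set that occurs twice lands on the outer edge and the sets that occur once land halfway in.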
Across the top, a sequence to be searched for is read from an organism. For each set of nine letters found in the sequence, an arc is drawn between its point in the space and the point representing the next set of nine letters.
Meanwhile, the same sequence as above can be seen moving through the
space as a ribbon of text, wrapping itself between the points that it
connects.
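The chain of arcs that the ribbon follows can be thought of as a list of pairs, each joining one nine-letter set to the set that comes next. A minimal sketch, assuming that "next" means the window shifted by one letter (the text does not say so explicitly); `arcs_for` is a hypothetical helper name:

```python
def arcs_for(query, k=9):
    """Pair each k-letter window of the query with the window after it.

    Assumes consecutive windows overlap by k - 1 letters.
    """
    words = [query[i:i + k] for i in range(len(query) - k + 1)]
    return list(zip(words, words[1:]))


query = "ACGTACGTACGT"  # toy search sequence, not real data
arcs = arcs_for(query)  # each pair is one arc: (from_word, to_word)
```

Each arc ends where the following one begins, which is why the sequence reads as a single continuous path through the space.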
For most nine-letter sets, there are three points, corresponding to the three organisms and how frequently that set is found in each. The three points are connected by the three lines on each arc, one for each of the organisms being represented; the outer ring is usually the human, the inner the fruitfly. Using the trackball, it is possible to input a sequence for a search by clicking and dragging across the bottom of the screen. This will draw another ribbon that weaves through the space to highlight the sequence of selected letters.
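The three points per set correspond to three separate frequency counts, one per organism. The sketch below shows that bookkeeping on made-up strings. The organism labels follow the gene counts quoted earlier (human, mouse, fruitfly), but treating the mouse as the third organism on screen is an assumption, as are the helper name `word_counts` and the toy gene strings.

```python
from collections import Counter


def word_counts(genes, k=9):
    """Count every overlapping k-letter set across a list of gene strings."""
    counts = Counter()
    for gene in genes:
        for i in range(len(gene) - k + 1):
            counts[gene[i:i + k]] += 1
    return counts


# Toy stand-ins for three genomes; these are not real sequences.
genomes = {
    "human": ["ACGTACGTACGT", "TTTACGTACGTA"],
    "mouse": ["ACGTACGTAAAA"],
    "fruitfly": ["GGGGACGTACGTA"],
}

per_organism = {name: word_counts(genes) for name, genes in genomes.items()}

# One frequency per organism for a single nine-letter set,
# i.e. the three values behind that set's three points.
word = "ACGTACGTA"
frequencies = {name: per_organism[name][word] for name in genomes}
```

A set missing from one organism simply gets a count of zero there, which is consistent with the text's remark that most, but not all, sets have three points.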
This piece is part of a larger body of research into how to build
visual constructions of very large amounts of data, in particular
genomic information. The works range from practical tools to
conceptual works for alternative methods for viewing data.
Valence originated as a project for my master's thesis, which focused on using properties of organic systems as a method for dealing with large amounts of data from dynamic sources.