Friday, March 19, 2010

The Six Degrees of Functional Genomics

I enjoyed a whirlwind functional genomics talk by Steve Kay today about the circadian rhythm in a number of model organisms. Perhaps most interesting was his outline of the functional genomic approach. Its steps were, paraphrased, as follows.

1) Identify the elements of the circadian rhythm machinery
2) Model and generate quantitative hypotheses about this machinery
3) Synthetically reproduce the machinery

He went on to describe how we are working on wrapping up step 1 and just beginning to dip our toes into step 2.

I should start by saying that I think the research described was truly impressive. It brought to bear a large number of high-throughput techniques to answer a question in a way that went beyond the usual model of figuring out what "your favorite gene" has to do with process X. He's turned things on their head, asked what gene Y has to do with "my favorite process", and answered the question en masse.

However, after he delved into the previous work on circadian rhythm, I was left a little worried. In a process for which, as for the cell cycle machinery, transcriptional and translational control is so key, how can you avoid the fact that a vast swath of the cell's general control mechanisms for these fundamental processes will in some way also affect the circadian rhythm? At some point, any element of the cell that isn't completely inert is going to affect any process you choose, albeit perhaps in a small way.

Which brings me to the Six Degrees of Functional Genomics. No part of the cell exists in isolation. No process in the cell can, really, be excised from the context of the larger cell. What we're talking about is, after all, a tiny little bag of water packed chock full of different proteins. True, there are compartments, but these compartments have a nasty way of communicating with each other. In the end, every protein in the cell is functionally related to every other protein of the cell, given enough degrees of separation. Quantitative models are likely to look more like weather models than a nice damped oscillator. In the end, it's all very dependent on initial conditions and most predictions will be probabilistic in nature.

To try to develop quantitative models of cellular behavior given current knowledge may be like trying to model all of human social interaction from the Facebook "friends" network. It's heavily biased toward certain kinds of interactions, like those between college dorm mates. It's true, these relationships are very important, and they tell you a lot about how the cell behaves day to day. But there's almost certainly a set of interactions that we don't know we don't know, like the interactions with parents and family (not everyone wants their parents seeing all those Facebook photos).

To this extent, it probably is important to try to develop the networks we currently have. We do still want to find key elements of our processes of interest. Hopefully we don't get so carried away looking for the next gene that we forget the complexity of cellular function and try to build ever bigger piles of genes in our category of interest.

I'm aware that I am setting up a bit of a straw man here, and I don't pretend that Dr. Kay is unaware of these kinds of concerns. But I think when the broader scientific community looks to functional genomics and computational biology for answers, they need to be aware of the fundamental limitations. Frankly, I am not sure if the broader scientific community looks to functional genomics and computational genomics for much of anything, but that has its own problems.