First It Was the Genome, Now the Proteome: Another Information-Rich Library
"More complex than previously thought" -- how often have we heard that phrase in news stories about the human genome? We heard it again when the ENCODE project found unexpected complexity in the human transcriptome (the full set of RNA transcripts produced from DNA). Now the phrase is being applied to the human proteome, the set of all human proteins encoded by transcribed DNA. From Johns Hopkins University:
In a summary of the effort, to be published May 29 in the journal Nature, the team also reports the identification of 193 novel proteins that came from regions of the genome not predicted to code for proteins, suggesting that the human genome is more complex than previously thought. (Emphasis added.)
Finding new proteins hints that there is far more functional information in the genome than is currently known. The researchers liken themselves to librarians creating a card catalog for uncatalogued books:
"You can think of the human body as a huge library where each protein is a book," says Akhilesh Pandey, M.D., Ph.D., a professor at the McKusick-Nathans Institute of Genetic Medicine and of biological chemistry, pathology and oncology at The Johns Hopkins University and the founder and director of the Institute of Bioinformatics. "The difficulty is that we don't have a comprehensive catalog that gives us the titles of the available books and where to find them. We think we now have a good first draft of that comprehensive catalog."
Maybe it would be better to refer to the genes as instruction manuals for building robots, and the proteins as the working robots. Either way, the information content in the library is richer than expected:
While genes determine many of the characteristics of an organism, they do so by providing instructions for making proteins, the building blocks and workhorses of cells, and therefore of tissues and organs. For this reason, many investigators consider a catalog of human proteins -- and their location within the body -- to be even more instructive and useful than the catalog of genes in the human genome.
With proteins from more than 84% of human protein-coding genes now catalogued across 30 human tissues, the researchers found that the "structures and functions of proteins are complex and diverse." Most interesting is the discovery that "supposedly noncoding regions of DNA" actually generate proteins:
Among the proteins whose data patterns have been characterized for the first time are many that were never predicted to exist. (Within the genome, in addition to the DNA sequences that encode proteins, there are stretches of DNA whose sequences do not follow a conventional protein-coding gene pattern and have therefore been labeled "noncoding.") The team's most unexpected finding was that 193 of the proteins they identified could be traced back to these supposedly noncoding regions of DNA.
Why would this be "unexpected"? Only a researcher who expects much of DNA to be evolutionary junk would be surprised. Advocates of the theory of intelligent design would expect function to be the norm. Predictably, the paper in Nature says nothing about evolution.
Another benefit of design-based research is the inspiration it generates. Look how thrilled the researchers were to find all this new information:
"This was the most exciting part of this study, finding further complexities in the genome," says Pandey. "The fact that 193 of the proteins came from DNA sequences predicted to be noncoding means that we don't fully understand how cells read DNA, because clearly those sequences do code for proteins."
Would it have been exciting to find that those regions were junk? Of course not. When you uncover more complexity, you find more things for scientists to study. The real "science stopper" has been the "junk DNA" myth. The proteome project shows how a search for design, with the expectation of finding design, is a motivator of good science. It helps bring scientists closer to "fully understanding" the information in life. After all, one seeks to understand a library more than a junkyard.
Pandey believes that the human proteome is so extensive and complex that researchers' catalog of it will never be fully complete, but this work provides a solid foundation that others can reliably build upon.
So many books, so little time! Now we have three catalogs in a row -- the genome, the transcriptome, and the proteome -- all revealing a vast library of complex, specified information. Intelligent design theory is the "solid foundation" on which future researchers can build a fuller understanding of the language of life.