Codes Within Codes: How Dual-Use Codons Challenge Statistical Methods for Inferring Natural Selection

David Klinghoffer recently noted the discovery of dual-use codons, dubbed “duons,” where a triplet of nucleotides can have multiple functions. Of course one of those is the standard function of encoding an amino acid. But now it turns out a codon can have another function as well: it can bind transcription factors which regulate the transcription of the gene. As many are now observing, this means that a single nucleotide sequence can have multiple levels of meaning. That is to say, there are multiple codes within the genetic code. In fact, one commentator observed that on the same analysis, codons may have more than two uses:

By this logic one could coin the term “trion” by pointing out that histone binding is also independently affected by A-C-T-G letter frequencies within protein-coding stretches of DNA.

But this isn’t the first time that scientists have discovered multiple codes in biology. Earlier this year I discussed research that found an analog code in the DNA that helps regulate gene expression, in addition to the digital code that encodes primary protein sequence. In other cases, multiple proteins are encoded by the same gene! And then of course there’s the splicing code, which helps control how RNAs transcribed from genes are spliced together in different ways to construct different proteins (see here and here).

It boggles the mind to think about how such “codes within codes” could evolve by random mutation and natural selection. But now it turns out that evidence of different functions for synonymous codons could threaten many standard methods used to infer selection in the first place.

Because of redundancy in the genetic code, most of the amino acids that life uses are encoded by anywhere between two and six codons (only methionine and tryptophan have a single codon each). Codons that specify the same amino acid are called “synonymous codons” because they all have the same standard function in the genetic code: encoding that amino acid. But this new study shows that codons can have other functions as well, like binding transcription factors. The paper concludes:
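The redundancy described above can be checked directly against the standard genetic code. The following is a minimal sketch in Python, assuming the standard code (NCBI translation table 1) written in its usual compact form; the variable names are illustrative and do not come from the paper.

```python
from collections import defaultdict

# Standard genetic code (NCBI translation table 1) in compact form.
# Codons are enumerated with bases in the order T, C, A, G; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Group codons by the amino acid they encode, skipping stop codons.
synonyms = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    if amino_acid != "*":
        synonyms[amino_acid].append(codon)

# Number of synonymous codons available for each amino acid.
degeneracy = {aa: len(codons) for aa, codons in synonyms.items()}

for aa in sorted(degeneracy, key=degeneracy.get, reverse=True):
    print(aa, degeneracy[aa], " ".join(synonyms[aa]))
```

In this table leucine, serine, and arginine each have six synonymous codons, while methionine and tryptophan have one each; the 20 amino acids together account for 61 of the 64 codons, with the remaining three serving as stop signals.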

Our results indicate that simultaneous encoding of amino acid and regulatory information within exons is a major functional feature of complex genomes. The information architecture of the received genetic code is optimized for superimposition of additional information and this intrinsic flexibility has been extensively exploited by natural selection. Although TF [transcription factor] binding within exons may serve multiple functional roles, our analyses above is agnostic to these roles, which may be complex.

(Stergachis et al., “Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution,” Science, Vol. 342: 1367-1372 (December 13, 2013).)

Of course “exploited by natural selection” is another way of saying “these codons are rife with different types of potentially useful functions.” A news article in Science elaborates on the findings:

Despite redundancy in the genetic code, the choice of codons used is highly biased in some proteins, suggesting that additional constraints operate in certain protein-coding regions of the genome. … The authors determined that ~14% of the codons within 86.9% of human genes are occupied by transcription factors. Such regions, called “duons,” therefore encode two types of information: one that is interpreted by the genetic code to make proteins and the other, by the transcription factor-binding regulatory code to influence gene expression. This requirement for transcription factors to bind within protein-coding regions of the genome has led to a considerable bias in codon usage and choice of amino acids, in a manner that is constrained by the binding motif of each transcription factor.

The paper argues that because there’s often a bias towards certain synonymous codons over other synonymous codons, this shows a “code” where transcription factors can “prefer” to bind to certain codons during processes that regulate gene expression. Because they haven’t elucidated the exact workings of this “code,” it’s hard to say for sure whether codon biases result from an actual “code,” or just preferences of the binding motifs of transcription factors. More research will be needed to understand exactly how these mechanisms operate.
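To make concrete how a binding motif could constrain codon choice, here is a toy sketch in Python. It enumerates every synonymous DNA encoding of a short peptide and checks which encodings contain a given motif. The peptide `MDS` and the motif `GACTCA` are hypothetical values chosen for illustration, not sequences taken from the Stergachis et al. paper, and a simple substring match is only a stand-in for real transcription factor binding.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def synonymous_encodings(peptide):
    """Return every DNA sequence that encodes the given peptide."""
    choices = [
        [codon for codon, aa in CODON_TABLE.items() if aa == residue]
        for residue in peptide
    ]
    return ["".join(combo) for combo in product(*choices)]


PEPTIDE = "MDS"    # hypothetical peptide: Met-Asp-Ser
MOTIF = "GACTCA"   # hypothetical binding motif, for illustration only

encodings = synonymous_encodings(PEPTIDE)
with_motif = [seq for seq in encodings if MOTIF in seq]

print(len(encodings), "synonymous encodings of", PEPTIDE)
print("encodings containing", MOTIF + ":", with_motif)
```

All of these encodings produce the same peptide, yet only a subset carries the motif. If binding at that position mattered to the cell, the choice among "synonymous" codons would no longer be functionally equivalent.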

Nonetheless, one thing is clear: there ARE biases towards certain synonymous codons, meaning synonymous codons have some function, meaning synonymous codons are NOT functionally neutral. Given that we know that synonymous codons can preferentially bind transcription factors (or other molecules, like histones), we have a good idea of what kinds of functional mechanisms are causing certain synonymous codons to be preferred.

All of this poses a major conundrum for statistical methods that evolutionists use to infer natural selection in studies purporting to explain the evolution of genes. Last summer here on ENV, I discussed and critiqued such statistical methods, which compare the rates at which synonymous and nonsynonymous mutations become fixed in a gene. According to this way of thinking, an excess of nonsynonymous mutations implies “positive selection” is preserving mutations that change amino acid sequence. An excess of synonymous mutations implies selection is at work to “weed out” mutations that change amino acid sequence; that is, there is no “positive selection.” If synonymous and nonsynonymous mutations are fixed at comparable rates, relative to the number of sites where each kind of mutation can occur, this indicates no selection pressure, and the gene is undergoing “neutral” evolution.
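The logic of these tests is commonly summarized as a ratio of the nonsynonymous substitution rate (dN) to the synonymous substitution rate (dS). Below is a minimal sketch in Python of that three-way interpretation. The function name, the `tolerance` parameter, and the example numbers are assumptions made for illustration; real analyses estimate dN and dS from sequence alignments and apply statistical tests rather than a fixed cutoff.

```python
def classify_selection(dn, ds, tolerance=0.1):
    """Interpret a dN/dS ratio under the standard assumptions.

    dn: nonsynonymous substitutions per nonsynonymous site
    ds: synonymous substitutions per synonymous site
    tolerance: hypothetical band around 1.0 treated as "neutral"
    """
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    omega = dn / ds
    if omega > 1 + tolerance:
        return omega, "positive selection"
    if omega < 1 - tolerance:
        return omega, "purifying selection"
    return omega, "neutral"


# Made-up example values, not data from any study.
for dn, ds in [(0.02, 0.01), (0.005, 0.05), (0.01, 0.01)]:
    omega, verdict = classify_selection(dn, ds)
    print(f"dN={dn} dS={ds} omega={omega:.2f} -> {verdict}")
```

Note that the verdict depends entirely on treating dS as a neutral baseline. If selection acts on synonymous sites and lowers dS, the ratio rises even when dN is unchanged, which is why the neutrality of synonymous mutations matters so much to these methods.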

To use these statistical techniques, evolutionary biologists rely on the crucial assumptions that (1) synonymous mutations are selectively neutral because they don’t modify amino acid sequence in a protein or perform any other selectable functions, and (2) nonsynonymous mutations which change the amino acid sequence are preserved because they cause some selectable change in protein function. In my article last summer I cited multiple studies that challenge both assumptions. For example, the first assumption is challenged by a paper in Science which said the “discovery that synonymous codon changes can so profoundly change the role of a protein adds a new level of complexity to how we interpret the genetic code.”[***] In other words, synonymous codons can have functions in addition to encoding an amino acid. This recent study in Science now provides additional strong evidence against the first assumption, that synonymous mutations are selectively neutral.

(Indeed, the news article in Science also challenged the second assumption, stating, “Intriguingly, a large fraction of the variants that result in a nonsynonymous change are predicted not to alter protein function.”)

But the main point of this new study is as follows: particular synonymous codons can be preferred for functional reasons. This means synonymous codons can have important functions in addition to encoding an amino acid, suggesting that the numerous studies which have purported to detect natural selection in genes, operating under the assumption that synonymous codons are selectively neutral, should be viewed with extreme skepticism.

*** Reference Cited: Ivana Weygand-Durasevic and Michael Ibba, “New Roles for Codon Usage,” Science, Vol. 329:1473-1474 (September 17, 2010) reporting on Fangliang Zhang, Sougata Saha, Svetlana A. Shabalina, Anna Kashina, “Differential Arginylation of Actin Isoforms Is Regulated by Coding Sequence-Dependent Degradation,” Science, Vol. 329:1534-1537 (September 17, 2010). See also Gene-Wei Li, Eugene Oh, and Jonathan S. Weissman, “The anti-Shine-Dalgarno sequence drives translational pausing and codon choice in bacteria,” Nature (2012) doi:10.1038/nature10965 (published online March 28, 2012); Gina Cannarozzi, Nicol N. Schraudolph, Mahamadou Faty, Peter von Rohr, Markus T. Friberg, Alexander C. Roth, Pedro Gonnet, Gaston Gonnet, and Yves Barral, “A Role for Codon Order in Translation Dynamics,” Cell, Vol. 141: 355-367 (April 16, 2010); Tamir Tuller, Asaf Carmi, Kalin Vetsigian, Sivan Navon, Yuval Dorfan, John Zaborske, Tao Pan, Orna Dahan, Itay Furman, and Yitzhak Pilpel, “An Evolutionarily Conserved Mechanism for Controlling the Efficiency of Protein Translation,” Cell, Vol. 141: 344-354 (April 16, 2010).

Image: Close-up of Watson and Crick’s model of DNA, Science Museum, London; Stew Dean/Flickr.

Evolution News & Science Today
