Here Comes McBride (Again): The Debate on Junk DNA Continues
Over at his Still Monkeys blog, Paul McBride has offered a rebuttal to my defense of Casey Luskin's chapter on junk DNA in the new book Science and Human Origins. Like Larry Moran (whom I previously rebuffed here), McBride writes,
McLatchie claims that "ID proponents are well aware of [junk DNA] literature and do not, as [McBride] claims, conflate 'junk DNA' and 'non-coding-DNA.'" If only this were true. Let's look once again at what Luskin says in Chapter 4. Luskin (wrongly) attributes to Francis Collins that "non-coding DNA shared by humans and other mammals is supposedly functionless junk." Luskin doesn't say 'most' or 'some' non-coding DNA, he conflates the two. Luskin goes on to say that, "[n]umerous studies have found extensive evidence of function for non-coding DNA, showing that it is not genetic "junk" after all." More conflation, plain and simple. Luskin also heads up an entire section in his chapter as Non-coding DNA: Not really junk after all - the ultimate conflation. Luskin is guilty as charged. I have made no strawman, these are Luskin's words and they are wrong. I'm glad McLatchie agrees that such a conflation is erroneous, but he needs to give Luskin, not me, the heads-up.
Again, is Paul McBride seriously suggesting that Casey Luskin doesn't think that Francis Collins -- formerly head of the Human Genome Project -- is familiar with origins of replication, transcription factors, promoters, operators and enhancers? All of those fall into the category of non-coding DNA, and have been known about for a very long time. Indeed, even in The Language of God (as I noted in my response to Larry Moran), Francis Collins concedes that function has been established for some non-coding DNA, although he maintains that most of it is functionless. This fact is highlighted in Jonathan Wells's The Myth of Junk DNA. In Science and Human Origins, Casey Luskin writes,
Collins offers two main DNA-based arguments for his claim that humans share a common ancestor with apes and other animals. First, non-coding DNA shared by humans and other mammals is supposedly functionless junk, which according to Collins means "the conclusion of a common ancestor for humans and mice is virtually inescapable."
As anyone with even a cursory familiarity with The Language of God knows, this is indeed one of the major pieces of evidence used by Collins to support common ancestry. Although Collins concedes that unqualified appeals to similarity do not, on their own, necessarily provide evidence for common descent (they can be explained just as readily by common design), shared non-functional sequences, argues Collins, do strongly support common ancestry. Luskin correctly represents this argument -- he does not, as McBride alleges, assert or imply that Collins thinks all non-coding DNA is junk. He merely highlights Collins's claim that shared non-coding DNA that is supposedly functionless provides compelling evidence for shared ancestry. McBride is reading into the text what Luskin doesn't say.
Francis Collins does, however, make the mistake that David Klinghoffer and I lamented in previous posts. On page 136 of The Language of God, Collins writes that "[m]ammalian genomes are littered with such AREs, with roughly 45 percent of the human genome made up of such genetic flotsam and jetsam." See this paper for a listing of identified functions for such "genetic flotsam and jetsam."
McBride subsequently takes issue with my use of the word "Darwinist" and points out that, historically, support for the junk DNA paradigm came principally from the neutralist, rather than the selectionist, camp. On this, McBride is correct. I was using the word "Darwinist" more broadly than this, however, to refer to those who subscribe to the paradigm of evolutionary naturalism.
Moving on to transposable elements, McBride writes,
That there are functional Alus is not surprising. Duplications of TEs are a class of mutation and like any mutation they can be functional. This is not in dispute. The dispute is around how many of those copies that end up spreading through populations are functional, rather than nearly neutral. McLatchie is still not giving us the quantitative part -- how much of the "dark genome" can we explain in this fashion? The answer remains: very little.
It isn't easy to quantify the exact percentage of transposable elements that have been established to have some function because that number is steadily increasing -- it is thus a moving target. McBride is correct, however, that functions have not yet been identified for the majority of those transposable elements. But do we have sound reason to think that functions will not be ascribed to these elements in the future? Wouldn't it be better to take a "wait-and-see" approach to the non-coding genomic repertoire? Remember the title of the paper I cited in my previous response to McBride: "From 'JUNK' to Just Unexplored Noncoding Knowledge: the case of transcribed Alus." To borrow a Darwinist cliché, give science time!
On the pervasiveness of transcription, McBride again notes that there is controversy in the literature with respect to the quantity of our DNA that is transcribed into mRNA. As I noted in my previous post, however, even if transcription is much less pervasive than is commonly thought, transcription isn't a prerequisite for a stretch of DNA being functional.
Last year, an interesting paper was published in PLoS Biology entitled "The Reality of Pervasive Transcription" (Clark et al., 2011). The paper presents "an evaluation of the analysis and conclusions of van Bakel et al. compared to those of others." The authors conclude that "the totality of the evidence strongly supports pervasive transcription of mammalian genomes" and offer three main criticisms of van Bakel et al.'s work:
(1) the existence of pervasive transcription is supported by multiple independent techniques; (2) re-analysis of the van Bakel et al. tiling arrays shows that their results are atypical compared to those of ENCODE and lack independent validation; and (3) the RNA sequencing dataset used by van Bakel et al. suffered from insufficient sequencing depth and poor transcript assembly, compromising their ability to detect the less abundant transcripts outside of protein-coding genes.
They note that the work of van Bakel et al. suffers from five major problems (fleshed out in detail in the paper). They write,
We suggest that the overarching conclusions drawn by van Bakel et al.--that there is only spasmodic (not pervasive) low-level transcription of much of the genome, and that much of this transcription has "random character" --are the result of a number of debatable aspects of their logic and analysis. These may be summarized as (1) insufficient sequencing depth and breadth and poor transcript assembly, together with the sampling problems that arise as a consequence of the domination of sequence data by highly expressed transcripts; compounded by (2) the dismissal of transcripts derived from introns; (3) a lack of consideration of non-polyadenylated transcripts; (4) an inability to discriminate antisense transcripts; and (5) the questionable assertion that rarer RNAs are not genuine and/or functional transcripts.
In conclusion, they note,
There is already a wide and rapidly expanding body of literature demonstrating intricate and dynamic transcript expression patterns, evolutionary conservation of promoters, transcript sequences and splice sites, and functional roles of "dark matter" transcripts. In any case, the fact that their expression can be detected by independent techniques demonstrates their existence and the reality of the pervasive transcription of the genome.
Moving on to the topic of pseudogenes, McBride writes,
We then venture into pseudogenes. Apparently, I have understated the number of functional cases in pseudogenes. McLatchie exclaims: "Only a couple of known examples? Actually, the number is far larger than that." But exactly how much larger? He doesn't tell us. He mentions some conjecture for a potential function for GULOP, the pseudogene in our broken vitamin C synthesis pathway, however provides no research linking said function to GULOP.
I cited a review of the functional pseudogene literature in my previous post (here it is again). McBride may also want to check out this review paper by Balakirev and Ayala (2003). A further review paper was published in January of this year. The paper bore the provocative title "Pseudogenes are not pseudo any more" (Wen et al., 2012). The paper stated in conclusion,
The study of functional pseudogenes is just at the beginning. There remain many questions to be addressed, such as the regulatory elements controlling the cell or tissue specific expression of pseudogenes. But, definitely, the so-called pseudogenes are really functional, not to be considered any more as just "junk" or "fossil" DNA. Surely, many functional pseudogenes and novel regulatory mechanisms remain to be discovered and explored in diverse organisms.
Regarding the hypothesized function of the GULO gene in humans, I linked to a previous blog post of mine in which I cited evidence that vitamin C is synthesized prenatally in humans, which opens up the possibility that the GULO gene is involved in this process before being subsequently suppressed for good reasons later on (as I discuss). The link between the GULO gene and said function is yet to be experimentally confirmed, but it is something that may be worth investigating.
In short, while it is certainly true that function has not yet been identified for the majority of the genome, there is no reason to think that the trend -- of finding new, previously unsuspected functionality -- will not continue, or that the "unexplored non-coding knowledge" will not continue to shrink.