Fact-Checking Wikipedia on Common Descent: The Evidence from Comparative Physiology and Biochemistry

I recently read the Wikipedia page on the “Evidence of Common Descent.” The page comprises a succinct, yet comprehensive, description of the most frequently cited arguments for the proposition of universal descent with modification. Since this is a subject that interests me, I decided to write a review of the arguments and, in so doing, to evaluate their merits. Wikipedia lists eight categories of evidence for common descent, which I hope to address over the course of this and future articles. These are:

1. Evidence from comparative physiology and biochemistry.
2. Evidence from comparative anatomy.
3. Evidence from paleontology.
4. Evidence from biogeographical distribution.
5. Evidence from observed natural selection.
6. Evidence from observed speciation.
7. Evidence from observed artificial selection.
8. Evidence from computation and mathematical iteration.

For now, I am going to address the first of Wikipedia’s eight lines of evidence, namely, the evidence from comparative physiology and biochemistry.

Universal Biochemical Organization and Molecular Variance Patterns

The first argument given hinges on the “Universal biochemical organization and molecular variance patterns.” We are told,

All known extant organisms are based on the same fundamental biochemical organization: genetic information encoded as nucleic acid (DNA, or RNA for viruses), transcribed into RNA, then translated into proteins (that is, polymers of amino acids) by highly conserved ribosomes. Perhaps most tellingly, the Genetic Code (the “translation table” between DNA and amino acids) is the same for almost every organism, meaning that a piece of DNA in a bacterium codes for the same amino acid as in a human cell. ATP is used as energy currency by all extant life. A deeper understanding of developmental biology shows that common morphology is, in fact, the product of shared genetic elements.² For example, although camera-like eyes are believed to have evolved independently on many separate occasions,³ they share a common set of light-sensing proteins (opsins), suggesting a common point of origin for all sighted creatures.⁴⁵⁶ Another noteworthy example is the familiar vertebrate body plan, whose structure is controlled by the homeobox (Hox) family of genes.

This seems to me to be among the weaker arguments for universal common ancestry. Unqualified appeals to similarity do not demonstrate common descent any more than they demonstrate common design. If we have compelling independent evidence (as I think we do) to think that an intelligent agent played a role in the origins and evolution of life on earth, a common design blueprint makes at least as much sense as an explanation for such patterns of similarity as does common descent.

At any rate, the argument from similarity appears to me to be largely a circular one. Upon the discovery of similar molecular, biochemical and developmental features, it is reasoned that independent evolution of those features is so improbable that, under the principle of parsimony, common descent constitutes the most likely explanation. But when one runs across similar features and patterns which cannot be explained by common descent, it is taken as evidence for convergent evolution, and thus the efficacy of the Darwinian mechanism.

The Wikipedia article correctly mentions that camera-like eyes have evolved independently in multiple lineages. As mentioned, the key molecule involved in absorbing light across all phyla is rhodopsin. This is true not only of animals, but also of photosensitive microorganisms such as some bacteria and algae. It is not the case, however, that the molecule in all these groups is derived from a common photosensitive ancestor. In their classic work [Salvini-Plawen LV, Mayr E. On the evolution of photoreceptors and eyes. Evol. Biol. 1977;10:207-263], Salvini-Plawen and Mayr note,

All the evidence however indicates that the earliest invertebrates, or at least those that gave rise to the more advanced phyletic lines, had no photoreceptors.

This conclusion was evident from a study of the molluscs alone!

The molluscs display greatest diversity in the differentiation of eyes among all groups of animals and 7-11 different lines can be distinguished; the ancestral stock was obviously devoid of photoreceptors and neither [various mollusc groups] nor most original larvae […] possess photoreceptors.

Wikipedia also mentions “the familiar vertebrate body plan, whose structure is controlled by the homeobox (Hox) family of genes.” Two points are worthy of mention here. For one thing, equivalent homeobox genes are often found to play analogous roles in development across taxa whose common ancestor is thought to have lacked the structures in question. One of the best-known examples of this is the Distal-less gene, which is involved in the development of projections or appendages in various animal phyla. This includes vertebrate fins and limbs, arthropod legs, echinoderm tube feet, ascidian siphons and ampullae, and annelid parapodia. But here’s the remarkable thing: on the evolutionary account, the divergence of the various animal phyla is believed to have occurred so long ago that the common progenitor lacked those appendages. It thus appears that essentially the same gene has been convergently deployed in the development of non-homologous (i.e., analogous) structures.

The common evolutionary rationalization of this phenomenon is to posit that the gene in question had some kind of propensity for promoting the development of the respective structure. But this solution appears dubious, particularly in the case of the even more spectacular example of eye development. In species as different as insects and mammals (which possess compound and camera eyes respectively, and whose common ancestor, according to evolutionary reckoning, lived so long ago that it did not have eyes), the embryological formation of the eyes uses remarkably similar genes (e.g., eyeless and Pax6 respectively). Nor is Pax6 the only gene common to the development of vertebrate and invertebrate eyes: quite a number of other transcription factors are shared as well. Among these are the mammalian “Six” genes, and their analogue in Drosophila called sine oculis. These genes are deployed somewhat later in development than is Pax6. Even the convergently deployed genes Dach and dac (vertebrates and Drosophila respectively) are seemingly homologous in structure, but each possesses quite a restricted role and is utilized late in development. Moreover, the eyes of vertebrates develop from two embryonic tissues, namely the epithelium and optic vesicle, whereas the eyes of Drosophila develop from a single embryonic tissue: the imaginal disc.

DNA Sequencing

Wikipedia continues,

Comparison of the DNA sequences allows organisms to be grouped by sequence similarity, and the resulting phylogenetic trees are typically congruent with traditional taxonomy, and are often used to strengthen or correct taxonomic classifications. Sequence comparison is considered a measure robust enough to be used to correct erroneous assumptions in the phylogenetic tree in instances where other evidence is scarce. For example, neutral human DNA sequences are approximately 1.2% divergent (based on substitutions) from those of their nearest genetic relative, the chimpanzee, 1.6% from gorillas, and 6.6% from baboons.⁷ Genetic sequence evidence thus allows inference and quantification of genetic relatedness between humans and other apes.⁸⁹ The sequence of the 16S ribosomal RNA gene, a vital gene encoding a part of the ribosome, was used to find the broad phylogenetic relationships between all extant life. The analysis, originally done by Carl Woese, resulted in the three-domain system, arguing for two major splits in the early evolution of life. The first split led to modern Bacteria and the subsequent split led to modern Archaea and Eukaryotes.

This argument has been addressed many times here at Evolution News & Views (e.g., here).

The basic point I would make is this: The percentage identity of human and chimpanzee genomes has been substantially overstated. Hahn et al. (2006), for example, report,

Gene families are groups of homologous genes that are likely to have highly similar functions. Differences in family size due to lineage-specific gene duplication and gene loss may provide clues to the evolutionary forces that have shaped mammalian genomes. Here we analyze the gene families contained within the whole genomes of human, chimpanzee, mouse, rat, and dog. In total we find that more than half of the 9,990 families present in the mammalian common ancestor have either expanded or contracted along at least one lineage. Additionally, we find that a large number of families are completely lost from one or more mammalian genomes, and a similar number of gene families have arisen subsequent to the mammalian common ancestor. Along the lineage leading to modern humans we infer the gain of 689 genes and the loss of 86 genes since the split from chimpanzees, including changes likely driven by adaptive natural selection. Our results imply that humans and chimpanzees differ by at least 6% (1,418 of 22,000 genes) in their complement of genes, which stands in stark contrast to the oft-cited 1.5% difference between orthologous nucleotide sequences. This genomic “revolving door” of gene gain and loss represents a large number of genetic differences separating humans from our closest relatives. [emphasis added]

So, when one takes into account the multitude of species-specific DNA insertions/deletions (indels) that are present along any segment compared between humans and chimps, the sequence identity drops.
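To make the arithmetic behind these figures concrete, here is a minimal Python sketch. The first part simply recomputes the gene-complement figure quoted from Hahn et al. (1,418 of 22,000 genes). The second part uses a made-up toy alignment (the two sequences are hypothetical and chosen purely for illustration; they are not real human or chimp data) to show how a percent-identity figure shifts depending on whether gapped columns, i.e. indels, are counted. The function names are my own and do not come from any of the cited papers.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


def identity(aligned_a, aligned_b, count_gaps):
    """Percent identity of two pre-aligned, equal-length sequences.

    '-' marks a gap (an indel). If count_gaps is False, gapped columns
    are skipped entirely (substitution-only identity). If True, every
    gapped column is counted as a difference.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        is_gap = x == "-" or y == "-"
        if is_gap and not count_gaps:
            continue  # ignore indel columns altogether
        compared += 1
        if not is_gap and x == y:
            matches += 1
    return percent(matches, compared)


# Figure quoted from Hahn et al. (2006): 1,418 of 22,000 genes.
gene_complement_diff = percent(1418, 22000)
print(f"{gene_complement_diff:.2f}%")  # 6.45%

# Hypothetical toy alignment: one substitution and two 2-base indels.
toy_a = "ACGTACGTAC--GTACGTAC"
toy_b = "ACGTACGAACGGGTAC--AC"
print(identity(toy_a, toy_b, count_gaps=False))  # 93.75
print(identity(toy_a, toy_b, count_gaps=True))   # 75.0
```

The point of the toy case is only that the two conventions can give noticeably different numbers for the same pair of sequences. How large the gap is for real genomes depends on the data and on the alignment method used.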

I would also point out in this regard that a detailed comparison of certain heterochromatic chromosome regions between humans and chimpanzees has yet to be made, and that the 99% identity figure is largely derived from protein-coding regions (which are, of course, usually highly conserved) which comprise about 1.5% of the two genomes.

Furthermore, if we are going to let all the evidence speak, then why not take a look at this study by Hughes et al., published in Nature just last year? That research yielded evidence that the male-specific portions of the human and chimp Y chromosomes “differ radically in sequence structure and gene content,” suggesting “wholesale renovation.” The Nature News report noted that,

The common chimp (Pan troglodytes) and human Y chromosomes are “horrendously different from each other,” says David Page of the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts, who led the work. “It looks like there’s been a dramatic renovation or reinvention of the Y chromosome in the chimpanzee and human lineages.”

Furthermore, as this excellent article by ENV’s Casey Luskin shows, there is a plethora of inconsistencies and incongruences associated with modern molecular phylogenetics. For one thing, molecular data often yield phylogenies that contradict those based on morphology. As Patterson et al. (1993) report,

As morphologists with high hopes of molecular systematics, we end this survey with our hopes dampened. Congruence between molecular phylogenies is as elusive as it is in morphology and as it is between molecules and morphology.

Additionally, Hasegawa et al. (1997) note,

That molecular evidence typically squares with morphological patterns is a view held by many biologists, but interestingly, by relatively few systematists. Most of the latter know that the two lines of evidence may often be incongruent.

Endogenous Retroviruses

Wikipedia proceeds into a discussion of shared endogenous retroviruses (ERVs) as an argument for common descent. We are told,

Endogenous retroviruses (or ERVs) are remnant sequences in the genome left from ancient viral infections in an organism. The retroviruses (or virogenes) are always passed on to the next generation of that organism which received the infection. This leaves the virogene left in the genome. Because this event is rare and random, finding identical chromosomal positions of a virogene in two different species suggests common ancestry.¹⁰

I have previously addressed this subject at length, and thus there is no need to discuss it in detail now. Suffice it to say that retroelement integrations are characteristically non-random, many have been documented to be functional in the context of their host’s genome, and some evidence suggests that these elements may, in fact, be intrinsic to the genome (please see my thorough treatment of the topic for references/citations and scientific justification for this claim).

Proteins

On proteins, Wikipedia states,

The proteomic evidence also supports the universal ancestry of life. Vital proteins, such as the ribosome, DNA polymerase, and RNA polymerase, are found in everything from the most primitive bacteria to the most complex mammals. The core part of the protein is conserved across all lineages of life, serving similar functions. Higher organisms have evolved additional protein subunits, largely affecting the regulation and protein-protein interaction of the core. Other overarching similarities between all lineages of extant organisms, such as DNA, RNA, amino acids, and the lipid bilayer, give support to the theory of common descent. Phylogenetic analyses of protein sequences from various organisms produce similar trees of relationship between all organisms.¹¹ The chirality of DNA, RNA, and amino acids is conserved across all known life. As there is no functional advantage to right- or left-handed molecular chirality, the simplest hypothesis is that the choice was made randomly by early organisms and passed on to all extant life through common descent. Further evidence for reconstructing ancestral lineages comes from junk DNA such as pseudogenes, “dead” genes which steadily accumulate mutations.¹²

This argument does nothing to demonstrate universal common ancestry. It merely assumes it. An unqualified appeal to similarity does nothing to discriminate between common descent and common design as possible candidate explanations. It also ignores instances where shared or similar features cannot be explained by common descent (convergence). Examples include the remarkable independent evolution of bat and whale echolocation systems and the uncanny, apparently independent evolution of similar DNA biosynthesis mechanisms in eubacteria and archaea. For a catalogue of such occurrences, see this article, or biochemist Fazale Rana’s contribution (chapter 21) to The Nature of Nature.

Pseudogenes

On pseudogenes, Wikipedia tells us,

Pseudogenes, also known as noncoding DNA, are extra DNA in a genome that do not get transcribed into RNA to synthesize proteins. Some of this noncoding DNA has known functions, but much of it has no known function and is called “Junk DNA.” This is an example of a vestige since replicating these genes uses energy, making it a waste in many cases. Pseudogenes make up 99% of the human genome (1% working DNA).¹³ A pseudogene can be produced when a coding gene accumulates mutations that prevent it from being transcribed, making it non-functional. But since it is not transcribed, it may disappear without affecting fitness, unless it has provided some new beneficial function as non-coding DNA. Non-functional pseudogenes may be passed on to later species, thereby labeling the later species as descended from the earlier species.

Actually, a wide range of functions have been identified for pseudogene sequences. See, for example, this paper, published in Nature in June of last year (I blogged about it at the time here). Poliseno et al. reveal that, “PTENP1 — along with KRAS1P, the pseudogene of the gene KRAS, and potentially other pseudogenes — is not a non-functional relic, but a modulator of gene expression.” The authors of the paper discuss the observed interaction between the RNA encoding for the PTEN tumor suppressor gene and its corresponding pseudogene, PTENP1, demonstrating that this pseudogene acts as a tumor suppressor. A function was also assigned for the KRAS oncogene and the corresponding pseudogene, KRAS1P. As the authors explain,

We also demonstrate that pseudogenes such as PTENP1 can derepress their cognate genes, even when expressed at lower levels (Supplementary Fig. 3a and Fig. 2f-h). We propose that pseudogenes are “perfect decoys” for their ancestral genes, because they retain many of the miRNA binding sites and can compete for the binding of many miRNAs at once.

Typically when I start to talk about functionality for so-called “junk DNA” such as pseudogenes, Darwinists tell me that such instances are the exception rather than the rule, and that I am guilty of cherry-picking data. The trouble with this argument is that in any one conversation I can typically discuss no more than a handful of documented cases, while the critic is free to simply shift his or her ground with every cited instance, telling me that they still lay claim to the majority of these elements. Alas, I do not have the time or the will to go through the wealth of literature that has been accumulating in recent years documenting pseudogene functions. One way of overcoming this difficulty (i.e., the impracticality of discussing more than a few documented examples, which leaves us open to the charge of cherry-picking) is to cite review literature. Review articles are great for providing overviews of the current literature and listing many pertinent references and citations in the process.

One such review paper appeared in 2003 in the Annual Review of Genetics, authored by Balakirev and Ayala. The paper noted that,

Pseudogenes have been defined as nonfunctional sequences of genomic DNA originally derived from functional genes. It is therefore assumed that all pseudogene mutations are selectively neutral and have equal probability to become fixed in the population. Rather, pseudogenes that have been suitably investigated often exhibit functional roles, such as gene expression, gene regulation, generation of genetic (antibody, antigenic, and other) diversity. Pseudogenes are involved in gene conversion or recombination with functional genes. Pseudogenes exhibit evolutionary conservation of gene sequence, reduced nucleotide variability, excess synonymous over nonsynonymous nucleotide polymorphism, and other features that are expected in genes or DNA sequences that have functional roles. We first review the Drosophila literature and then extend the discussion to the various functional features identified in the pseudogenes of other organisms. A pseudogene that has arisen by duplication or retroposition may, at first, not be subject to natural selection if the source gene remains functional. Mutant alleles that incorporate new functions may, nevertheless, be favored by natural selection and will have enhanced probability of becoming fixed in the population. We agree with the proposal that pseudogenes be considered as potogenes, i.e., DNA sequences with a potentiality for becoming new genes.

For further discussion of pseudogene function, see Casey Luskin’s discussion of the pseudogene misnomer here, where he notes that even pseudogenes that aren’t transcribed can serve important roles. See also my discussion of “junk RNA” here, in which I draw attention to a Nature news article which reports that “the polished rice peptides could also have implications for how we view pseudogenes, which have long been thought to be defunct relics of protein-coding genes. Pseudogenes often contain many signals that would stop protein synthesis and, as a result, could only encode short amino-acid chains. Maybe this would provide a new way for pseudogenes to have some sort of function.”

Wikipedia subsequently lists several “specific examples” of the evidence for evolution from comparative physiology and biochemistry. Among these are the fusion origin of the human chromosome 2; the variance of the ubiquitous protein Cytochrome c; and the recent African origin of modern humans. Let’s take a look at each in turn.

The Origin of Chromosome 2

Briefly, the chromosomal fusion argument for human-chimp common descent begins with the observation that humans possess 23 pairs of chromosomes, whereas the other great apes possess 24 pairs, thus allowing one to predict that, evolution being true, a chromosomal fusion must have taken place at some point in our lineage. And, indeed, this is what we observe. Chromosome 2 shows signs of two centromeres (one active, one vestigial). It also possesses a section where there are two telomeres in the middle of the chromosome, which are oriented in such a way as to suggest that the ends of the two chromosomes were fused together. Every telomere in human and great-ape chromosomes has the six-base-pair sequence TTAGGG repeated over and over approximately fifty to one hundred times in tandem. Such telomeric repetitive units, when they are found not in the telomeres at the end of the chromosome, but rather in the middle of the chromosome (perhaps near the centromere), are referred to in the literature as “interstitial telomeric sequences” (or ITSs). At the supposed fusion site in chromosome 2, the sequence in the upper strand abruptly changes from TTAGGG repeats to CCCTAA repeats (the reverse complement of TTAGGG, as expected if one telomere is inverted relative to the other). This is taken to indicate that the DNA in a telomere of one chromosome and the DNA in a telomere of the other chromosome broke and subsequently the two chromosomes fused at the broken ends. This site is referred to in the literature as 2q13 (“2” referring to the chromosome number, “q” referring to the long arm, and “13” referring to the position on the arm).
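The relationship between the two repeat motifs can be checked mechanically. The following is a minimal Python sketch, under the assumption of an idealized toy sequence (three perfect TTAGGG units followed by three perfect CCCTAA units, with arbitrary flanking bases). The helper names are hypothetical and the toy string is not real chromosome 2 sequence, in which the repeats are reported to be degenerate rather than perfect.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TELOMERE_UNIT = "TTAGGG"


def reverse_complement(seq):
    """Return the opposite strand of seq, read in the 5'-to-3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def find_head_to_head_junction(seq, unit=TELOMERE_UNIT):
    """Locate the first point where `unit` is immediately followed by its
    reverse complement. Returns the index of the first base of the
    inverted repeat, or -1 if no such junction exists."""
    junction = unit + reverse_complement(unit)
    hit = seq.find(junction)
    return -1 if hit < 0 else hit + len(unit)


print(reverse_complement(TELOMERE_UNIT))  # CCCTAA

# Idealized toy sequence (an assumption for illustration only).
toy = "ACGT" + TELOMERE_UNIT * 3 + reverse_complement(TELOMERE_UNIT) * 3 + "TGCA"
pos = find_head_to_head_junction(toy)
print(pos)               # 22
print(toy[pos:pos + 6])  # CCCTAA
```

An exact-match search like this would be too strict for real genomic data, where some tolerance for mismatches within the repeats would be needed.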

Furthermore, chromosomal centromeres possess a characteristic DNA called alpha satellite sequences. Secondary alpha satellite DNA (over and above that which is associated with the active centromere), which has been found in the case of chromosome 2 (see Avarello et al. 1992), is taken as further evidence for this fusion event.

But just how sound is this argument?

For one thing, there are, in fact, plausible alternative explanations for this observation. For example, envision a scenario where our genus Homo, originally possessing 48 chromosomes, underwent a chromosomal fusion event within its own independent lineage. Sure, the banding patterns of chromosome 2 are similar to two of the autosomes in the chimpanzee lineage. But then we are only coming back to the argument from similarity which, as I have already argued, supports common descent no more than it suggests common design.

Secondly, some of the arguments for supposing that chromosome 2 did indeed arise from a fusion event have been significantly weakened in recent years. One very interesting peer-reviewed paper, appearing in the journal Cytogenetic and Genome Research in 2009, by Farre, Ponsa and Bosch, reported:

Although their function has not yet been clearly elucidated, interstitial telomeric sequences (ITSs) have been cytogenetically associated with chromosomal reorganizations, fragile sites, and recombination hotspots. In this paper, we show that ITSs are not located at the exact evolutionary breakpoints of the inversions between human and chimpanzee and between human and rhesus macaque chromosomes. We proved that ITSs are not signs of repair in the breakpoints of the chromosome reorganizations analyzed. We found ITSs in the region (0.7-2.7 Mb) flanking one of the two breakpoints in all the inversions assessed. The presence of ITSs in those locations is not by chance. They are short (up to 7.83 repeats) and almost perfect (82.5-97.1% matches). The ITSs are conserved in the species compared, showing that they were present before the reorganizations occurred.

So, what is the significance of the cited paper? Though there are many documented instances of these interstitial telomeric sequences in the genomes of humans and chimps, the 2q13 interstitial telomeric sequence is the only one that can be associated with an evolutionary breakage point or fusion. The other ones fail to line up with primate chromosomal breakpoints.

As the authors of the paper note,

The availability of complete genome sequences (Hubbard et al., 2007) offers the opportunity to characterize the regions flanking the breakpoints of chromosomal reorganizations at the molecular level. However, to our knowledge, only the head-to-head ITS located in the human 2q13 region, which is a relic of an ancient telomere-telomere fusion, is precisely associated with an evolutionary breakpoint (Ijdo et al., 1991). Here, we used bioinformatic tools to analyze, in the current genome releases, the presence of short ITSs in the chromosomal inversions that do not involve terminal regions and that occurred between human and chimpanzee and between human and rhesus macaque during evolution.

The pro-ID evolutionary biologist Richard Sternberg has also briefly weighed in on the paper here. Sternberg notes,

How, precisely, are miles and miles of TTAGGG of significance? From the standpoint of chromosome architecture, the repetitive elements en masse have the propensity to form complicated topologies such as quadruplex DNA. These sequences or, rather, topographies are also bound by a host of chromatin proteins and particular RNAs to generate a unique “suborganelle” — for the lack of better term — at each end. As a matter of fact, the chromatin organization of telomeres can silence genes and has been linked to epigenetic modes of inheritance in yeast and fruit flies. Furthermore, different classes of transcripts emanate from telomeres and their flanking repetitive DNA regions, which are involved in various and sundry cellular and developmental operations.
[…]
ITSs reflect sites where TTAGGG repeats have been added to chromosomes by telomerases, that these repeats are moreover engineered — literally synthesized by the telomerase machinery, that ITSs have a telomere-like chromatin organization and are associated with distinct sets of proteins, and that many have been linked to roles such as recombination hotspots.

Thus, the take-home message is this: To make much of the 2q13 interstitial telomeric sequence and portray it as typical of what is observed in chimp and human genomes may be considered careful cherry-picking of data.

And what about the secondary alpha satellite sequences found in chromosome 2? Is that not best understood as a genetic residue from a previously functioning centromere on a separate chromosome? Perhaps. But the situation is not quite as clear as is often made out. Neo-centromere formation, for example, is a rare event that gives rise to a new centromere (see, for example, Warburton 2004). One indication, however, that the additional centromere in chromosome 2 did not arise by this process is the fact that neo-centromeres are usually not associated with the characteristic centromeric repetitive alpha-satellite DNA. But these neo-centromeres are poorly understood, and it may come to pass that a mechanism is discovered that can make these neo-centromeres full of alpha-satellite DNA.

One particularly interesting study, from Baldini et al. (1993), reported the presence of secondary alpha satellite DNA on human chromosome 9! To further complicate matters, Luke and Verma (1995) subsequently reported on the occurrence of secondary alpha satellite DNA in all primates. In 1997, a research group published another interesting study (Samonte et al., 1997). These researchers hybridized twenty-one different chromosome-specific human alpha satellite DNA probes to the full complement of chromosomes from the chimpanzee, gorilla and orangutan. They reported that most of the human probes failed to hybridize to the equivalent ape chromosome. Instead, they gave positive signals on non-corresponding chromosomes.

Thus, they concluded, alpha satellite DNA sequences show little conservation in primate lineages.

Cytochrome c

Wikipedia also cites Cytochrome c as a confirmation of the standard phylogeny of taxonomic groups. It’s somewhat curious that Wikipedia fails to mention the phylogenetic tree of Cytochrome b, which yields strikingly different results. As Lee (1999) reports,

[T]he mitochondrial cytochrome b gene implied…an absurd phylogeny of mammals, regardless of the method of tree construction. Cats and whales fell within primates, grouping with simians (monkeys and apes) and strepsirhines (lemurs, bush-babies and lorises) to the exclusion of tarsiers. Cytochrome b is probably the most commonly sequenced gene in vertebrates, making this surprising result even more disconcerting.

Recent African Origin of Modern Humans

Wikipedia concludes this section with a word about the African origin of modern humans. It notes,

Mathematical models of evolution, pioneered by the likes of Sewall Wright, Ronald Fisher and J. B. S. Haldane and extended via diffusion theory by Motoo Kimura, allow predictions about the genetic structure of evolving populations. Direct examination of the genetic structure of modern populations via DNA sequencing has allowed verification of many of these predictions. For example, the Out of Africa theory of human origins, which states that modern humans developed in Africa and a small sub-population migrated out (undergoing a population bottleneck), implies that modern populations should show the signatures of this migration pattern. Specifically, post-bottleneck populations (Europeans and Asians) should show lower overall genetic diversity and a more uniform distribution of allele frequencies compared to the African population. Both of these predictions are borne out by actual data from a number of studies.²⁸

I agree that the genetic and anthropological evidence strongly suggests that modern humans originated in Africa. But this seems to me to be suggestive of, at best, limited levels of common descent. And it is only consistent with that conclusion; it does not compel it. It is important, in evaluating these arguments, that one consider all the evidence, not just the evidence that is consistent with one’s preferred conclusion. It seems to me that when this is done, the arguments for common descent, certainly in its universal sense, are at best inconclusive.

In the next part of this series, I shall discuss the arguments for common descent from comparative anatomy.