Cryptic diversity in vertebrates: molecular data double estimates of species diversity in a radiation of Australian lizards (Diplodactylus, Gekkota)

Paul M. Oliver, Mark Adams, Michael S.Y. Lee, Mark N. Hutchinson, Paul Doughty

Abstract

A major problem for biodiversity conservation and management is that a significant portion of species diversity remains undocumented (the ‘taxonomic impediment’). This problem is widely acknowledged to be dire among invertebrates and in developing countries; here, we demonstrate that it can be acute even in conspicuous animals (reptiles) and in developed nations (Australia). A survey of mtDNA, allozyme and chromosomal variation in the Australian gecko, genus Diplodactylus, increases overall species diversity estimates from 13 to 29. Four nominal species each actually represent multi-species complexes; three of these species complexes are not even monophyletic. The high proportion of cryptic species discovered emphasizes the importance of continuing detailed assessments of species diversity, even in apparently well-known taxa from industrialized countries.

1. Introduction

An accurate inventory of species diversity is a fundamental baseline for most fields of biological research. However, there remains a major ‘taxonomic impediment’—an unknown but very large proportion of the world's species diversity remains scientifically undocumented. This gap in our understanding is a serious problem faced by all researchers trying to understand, manage and conserve biotic resources (UNEP 2008). The problem of unrecognized diversity is most acute among invertebrates and in developing countries. However, while the rate of species description for birds is relatively low, numerous new species from other vertebrate groups continue to be described. For example, 2007 was one of only four years in history in which more than 100 reptile species were described (Uetz et al. 2008). Available evidence suggests that hundreds of amphibian, reptile, mammal and fish species remain unrecognized (e.g. Lundberg et al. 2000; MacKinnon 2000; Meegaskumbura et al. 2002; Fouquet et al. 2007).

Likewise, while many newly discovered species hail from developing countries, a large number (including vertebrates) are also being found in industrialized nations. Australia is one of only two megadiverse developed countries (Mittermeier et al. 1997). All Australian states have had their own scientific institutions since the late nineteenth century, and a long history of taxonomic research: nonetheless, the problem of cryptic vertebrate diversity was flagged by Donnellan et al. (1993), and over a hundred terrestrial vertebrate species were added to the list of recognized taxa between 1990 and 2000 (Cogger 2000; Van Dyck & Strahan 2008). In the light of the profound local and global environmental changes affecting the Australian biota, this high number of recent descriptions raises the question of how many of its species remain undescribed. This issue is particularly pertinent given the decreasing global support for, and interest in, careers in taxonomy (Cotterill 1995; Lee 2000).

In conjunction with other researchers, we are undertaking a comprehensive review of species diversity in a moderately diverse (54 recognized species) Australian gecko family, the Diplodactylidae. This family is part of an ancient adaptive radiation of diplodactyloid geckos with a long history in the Australasian region (Oliver & Sanders in press). Here, we show that species diversity in the moderately diverse terrestrial genus Diplodactylus has been underestimated by at least a factor of two. These results powerfully assert the need for comprehensive and continuing commitment to integrated taxonomic research by highlighting that major gaps exist in our understanding of species diversity—even in well-known groups and in developed countries.

2. Material and methods

(a) Species recognition

While there remains considerable disagreement about the exact nature of species, it has been strongly argued that most species concepts interpret species as evolutionary lineages (De Queiroz 2005), and that species delineation should involve integration of multiple independent datasets to identify these lineages (De Queiroz 2007). Herein, we present data from three independent sources: mtDNA, allozymes, and karyotype.

The dataset obtained for all terminal taxa, based on mtDNA sequence data, is—as any single locus—inadequate by itself for designating species (Funk & Omland 2003; Moritz & Cicero 2004). Deeply separated mitochondrial lineages suggest lack of recent maternal gene flow, but in order to be rigorously confirmed as separate species, such lineages need to be corroborated by other markers. Allozyme data provide an independent and multi-locus assessment, based on co-dominant nuclear genetic markers, of levels of nuclear genetic differentiation between lineages. Allozyme analysis is likely to be a more accurate indicator of levels of current gene flow (genetic differentiation) than are maternally inherited single-copy mitochondrial loci (e.g. Fitzpatrick 2002), and has proved useful in discovering and diagnosing species (e.g. Adams et al. 1987). Karyotypic differences can also provide strong evidence for distinct species, although chromosome differentiation does not guarantee speciation (Reed et al. 1995; Sites et al. 1995), and additional evidence is needed to demonstrate a lack of gene flow.

Where mitochondrial clades corresponded with evidence from at least one (and often both) of the other datasets, we interpret this as strong evidence for the existence of independent lineages (species). In a number of additional instances, the mitochondrial data revealed evidence of deep phylogenetic structure among samples for which we do not have allozyme or karyotypic data. In at least five instances, these lineages probably also represent distinct species; these are discussed in more detail below.
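The recognition criteria described above can be summarized as a simple decision rule. The following is a minimal sketch only, not the authors' procedure: the class name, field names and category labels are hypothetical, and the real assessments also drew on morphology and geography.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LineageEvidence:
    """Evidence available for one mitochondrial lineage (hypothetical structure)."""
    name: str
    deep_mtdna_divergence: bool         # deeply separated mtDNA clade?
    allozyme_support: Optional[bool]    # None means no allozyme data
    karyotype_support: Optional[bool]   # None means no karyotypic data


def classify(ev: LineageEvidence) -> str:
    """Apply the rule: mtDNA clade plus at least one corroborating dataset."""
    if not ev.deep_mtdna_divergence:
        return "not distinguished"
    if ev.allozyme_support is True or ev.karyotype_support is True:
        return "species"
    if ev.allozyme_support is None and ev.karyotype_support is None:
        # Divergent mtDNA lineage with no corroborating data yet (simplified).
        return "candidate"
    return "unresolved"


if __name__ == "__main__":
    print(classify(LineageEvidence("lineage A", True, True, None)))
    print(classify(LineageEvidence("lineage B", True, None, None)))
```

Treating missing data (`None`) differently from contradicting data (`False`) keeps the distinction between a lineage that awaits testing and one that failed corroboration.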

(b) mtDNA

Screening for cryptic unrecognized diversity was undertaken using the ND2 gene, which has been widely used in gecko systematics. Methodologies for DNA extraction and amplification are described in Pepper et al. (2006), Oliver et al. (2007a,b) and Doughty et al. (2008). We screened multiple exemplars of 12 of the 13 recognized species of Australian Diplodactylus; tissues of the rare Diplodactylus kenneallyi have never been collected. Our final alignment consisted of 797 bp of ND2 for 167 Diplodactylus specimens comprising 145 unique haplotypes. GenBank accession numbers for all sequences are given in appendix 1 of the electronic supplementary material.

Uncorrected and corrected genetic distances between terminal taxa were calculated using PAUP* (Swofford 2000). Corrected distances were estimated using the GTR+I+G model, selected by the Akaike information criterion in Modeltest (Posada & Crandall 1998). To assess evolutionary relationships between all terminals, a phylogeny was inferred using maximum likelihood implemented in RAxML v. 7.0.4 (Stamatakis 2006a) and Bayesian inference (using MrBayes v. 3.1.2; Altekar et al. 2004). Maximum-likelihood topology and bootstrap support values were estimated using the ‘-f a’ function. Optimal topology was estimated using the GTR+I+G model and bootstrap supports were calculated from 1000 fast replicates with the GTRCAT model (see Stamatakis 2006b); these results are presented below. The Bayesian analysis recovered the same topology and emphasized the same deep mitochondrial nodes as the likelihood analysis (for details of the Bayesian analysis and tree, see appendix 2 of the electronic supplementary material).
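For readers unfamiliar with the term, an uncorrected distance is simply the proportion of aligned sites at which two sequences differ. The sketch below illustrates that definition; it is not PAUP*'s implementation, and the decision to skip any site where either sequence has a gap or ambiguity code is an assumption made here for simplicity.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected (p) distance between two aligned DNA sequences.

    Assumption: sites where either sequence is not A, C, G or T
    (gaps, N, other ambiguity codes) are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue
        compared += 1
        if a != b:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # Toy 8-site alignment with one difference.
    print(p_distance("ACGTACGT", "ACGTACGA"))
```

Corrected distances such as those under GTR+I+G additionally model multiple substitutions at the same site and are therefore never smaller than the uncorrected value.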

(c) Allozymes

Allozyme scoring methodologies and data for Diplodactylus pulcher, Diplodactylus klugei and the Diplodactylus vittatus group have been published in Aplin & Adams (1998) and Oliver et al. (2007b). Comparative allozyme data for the Diplodactylus tessellatus and Diplodactylus conspicillatus species complexes are presented for the first time in this study. Fifty-one putative loci were successfully scored in one or more of the three independent allozyme studies: Acon-1; Acon-2; Acp-1; Acp-2; Acyc; Adh-1; Adh-2; Adh-3; Ak; Alb; Dia; Enol; Est; Fdp; Fum; Gapd; G6pd; Gda; Glo; Got-1; Got-2; Gpi; Gpt; Gsr; Guk; Hbdh; Idh-1; Idh-2; Lap; Ldh-1; Ldh-2; Mdh-1; Mdh-2; Me-1; Me-2; Mpi; Ndpk; Np; PepA-1; PepA-2; PepB; PepD; Pgam; 6Pgd; Pgk; Pgm-1; Pgm-2; Pk; Sod; Sordh; and Tpi. Details of enzyme and locus abbreviations, electrophoretic conditions and stain recipes are presented in Richardson et al. (1986) or Bostock et al. (2006), while locus nomenclature follows Adams et al. (1987).

For each dataset, principal coordinates analysis (PCO) was undertaken on a pairwise genetic distance matrix among all individuals to determine whether the different mtDNA lineages represented therein were also independently diagnosable by their allozyme profiles. Discrete PCO clusters were only considered to be supported where they differed in at least one fixed difference from all others (allowing a cumulative tolerance of 10% for any shared alleles; see Horner & Adams (2007) for further details of principles and methodology). A table summarizing the number of fixed allozyme differences plus the Nei's unbiased genetic distance (Nei 1978) between taxa identified using PCO is presented in appendix 3 of the electronic supplementary material. In total, we were able to assess taxon diagnosability in 136 pairwise comparisons, based on the comparative allozyme profiles of 137 individuals at 39–42 loci.
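The fixed-difference criterion can be illustrated with a short sketch. The reading of the 10% tolerance used here (the summed overlap in frequency of shared alleles must not exceed 0.10) is an assumption, one possible interpretation rather than the procedure of Horner & Adams (2007), and the allele frequencies in the usage example are hypothetical.

```python
from typing import Dict

# Allele frequencies for one group: locus name -> {allele: frequency}
Profile = Dict[str, Dict[str, float]]


def is_fixed_difference(freq_a: Dict[str, float],
                        freq_b: Dict[str, float],
                        tolerance: float = 0.10) -> bool:
    """True if two groups share (almost) no alleles at a locus.

    Assumed rule: sum, over shared alleles, the smaller of the two
    frequencies; the locus counts as fixed if that sum <= tolerance.
    """
    shared = set(freq_a) & set(freq_b)
    overlap = sum(min(freq_a[allele], freq_b[allele]) for allele in shared)
    return overlap <= tolerance


def count_fixed_differences(group_a: Profile, group_b: Profile,
                            tolerance: float = 0.10) -> int:
    """Count fixed differences over loci scored in both groups."""
    loci = set(group_a) & set(group_b)
    return sum(
        is_fixed_difference(group_a[locus], group_b[locus], tolerance)
        for locus in loci
    )


if __name__ == "__main__":
    # Hypothetical frequencies at three loci, for illustration only.
    group_a = {"Gpi": {"a": 1.0}, "Mdh-1": {"a": 1.0},
               "Pgm-1": {"a": 0.95, "b": 0.05}}
    group_b = {"Gpi": {"b": 1.0}, "Mdh-1": {"a": 1.0},
               "Pgm-1": {"b": 1.0}}
    print(count_fixed_differences(group_a, group_b))
```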

(d) Karyotypic data

For the majority of mtDNA lineages, data are also available on karyotypic morphology (King 1987). The ancestral Diplodactylus karyotype is thought to be 2n=38, all acrocentric (King 1987); however, this genus is characterized by taxonomically significant chromosomal fusion events (King 1987; Oliver et al. 2007b). Metaphase chromosome spreads were obtained from epithelial tissue cultures derived from genital duct, lung and pericardial samples. Standard tissue culture methods were used to establish cultures and to harvest metaphase spreads for karyotypic analysis (Freshney 2000).

3. Results and discussion

(a) Estimates of diversity

Our data suggest that actual species diversity within Diplodactylus is at least double the current total. Eleven unnamed lineages are characterized by high mtDNA divergence (usually more than 10% uncorrected sequence divergence) and at least one and usually multiple fixed allozyme differences. A number of these lineages in both the D. tessellatus and the D. vittatus complexes (see below) are also characterized by unique derivations from the ancestral 2n=38 karyotype (table 1). One of these lineages, ‘Cape Range’, has since been formally described (Diplodactylus capensis; Doughty et al. 2008) and additional descriptions are in preparation.

Table 1

Summary of genetic and karyotypic data for taxa in the genus Diplodactylus. Unrecognized species are indicated with a U and candidate species are indicated by a C. Inter-mtDNA shows minimum uncorrected and corrected (brackets) genetic distance to nearest relatives. Intra-mtDNA shows maximum uncorrected and corrected (brackets) genetic distance within nominal species. Fixed D is the minimum number of fixed allozyme differences from other nominal species. Assignation of names to species should be considered preliminary pending formal description.

We identified a further five ‘candidate’ (Fouquet et al. 2007) species: highly divergent mtDNA lineages for which corroborating allozyme or karyotypic data are currently unavailable (table 1), and on which future studies will probably confer species status. All five of these lineages are diagnosable morphologically when examined post hoc (P. Couper & P. Oliver 2008, personal observation). Four of these are within the nominal species D. conspicillatus: lineage E is basal to a clade of three lineages, which can each be diagnosed on allozymes; and lineages F, G and H form a clade that is sister to all other D. conspicillatus forms. The ‘Yetna’ lineage, part of a species complex restricted to Western Australia, shows relatively shallow mtDNA divergence from its nearest relatives, but is geographically isolated and morphologically diagnosable.

Our data suggest that actual species diversity in Diplodactylus is at least 29 species (13 described species, 11 new species and 5 candidate species). However, this revised estimate of Diplodactylus species is still likely to be an underestimate for three reasons. First, we used only three sources of information in our search for cryptic taxa. With additional data sources (e.g. nuclear DNA sequences, detailed morphometrics), some ‘intraspecific’ mtDNA lineages may be diagnosable and warrant species-level recognition. There is already preliminary evidence of such differentiation within D. ‘southern’ and Diplodactylus granariensis (Oliver et al. 2007b; Doughty et al. 2008).

Second, additional taxon sampling could discover new, highly divergent lineages. Our sampling of large areas of Australia is very sparse, particularly in northern and northeastern areas (figure 1a). Seven species from northern and western areas are only known from a limited number of specimens or sites (table 1; figure 1b). Preliminary morphological analysis of D. conspicillatus has also revealed the existence of considerable morphological diversity in eastern Queensland. We have genetic samples from only a single locality in this area. Finer-scale genetic and morphological analysis of another group of diplodactyloid geckos (the leaf-tail geckos Phyllurus and Saltuarius) on the Australian east coast has revealed subtle but significant patterns of variation indicative of cryptic species (e.g. Couper et al. 2008). As data for Diplodactylus approach this fine level of resolution and understanding, diversity estimates will probably increase further. There is clearly a critical need for more, and more detailed, genetic sampling from across northern Australia.

Figure 1

(a) Map of Australia showing distribution of recognized species (grey squares) and new or candidate species (black triangles) in the genus Diplodactylus. Unrecognized taxa are found across the continent and show no clear patterns in environmental or geographical distribution. While Diplodactylus are absent from far southeastern and eastern Australia, the relative lack of samples from across the north reflects the remoteness and consequent lack of sampling effort in this region. (b) Maximum-likelihood tree from 797 bp of ND2 data calculated using the GTR+I+G model for the genus Diplodactylus showing recognized (italics), unrecognized (bold) and candidate species (bold with asterisk). Key nodes with high bootstrap support values (above 70) from 1000 repetitions using the -f a search function in RAxML are indicated by asterisks. GenBank numbers and locality details for individual specimens are given in appendix 1 of the electronic supplementary material.

Finally, mtDNA (ND2) distances within Diplodactylus are generally very large. Almost all species are separated from their nearest relatives by more than 20 per cent corrected pairwise divergence (table 1). Furthermore, eight taxa, here treated as single species (e.g. tessellatus, ‘eastern inland’, galeatus), themselves contain deep corrected pairwise divergences of at least 10 per cent (table 1). Precise molecular divergence dating was not possible owing to an absence of robust calibration points within Diplodactylidae. However, estimated pairwise divergence rates of ND2 in related taxa (agamid lizards) are 1.3–1.62% per million years: the lower rate was based on parsimony methods and thus approximates uncorrected divergence (0.65% per lineage; Macey et al. 1998), and the higher rate was based on likelihood models and thus approximates corrected divergence (0.81% per lineage; Shoo et al. 2008). If these rates characterize Diplodactylus, lineages within each of the above eight taxa would have diverged in the Late Miocene (over 6 Myr ago). Such deep divergences would suggest further species, but more rigorous dating and additional loci are required.
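The rough dating above follows from dividing pairwise divergence by the pairwise rate, which is twice the per-lineage rate. A minimal sketch of that arithmetic follows; it assumes a strict, constant molecular clock, which is a strong simplification, and the function name is hypothetical.

```python
def divergence_time_myr(pairwise_divergence_pct: float,
                        per_lineage_rate_pct_per_myr: float) -> float:
    """Approximate time since divergence in millions of years.

    Assumes a strict clock: both lineages accumulate change at the same
    constant rate, so the pairwise rate is twice the per-lineage rate.
    """
    pairwise_rate = 2.0 * per_lineage_rate_pct_per_myr
    return pairwise_divergence_pct / pairwise_rate


if __name__ == "__main__":
    # 10% corrected divergence at 0.81% per lineage per Myr
    # (1.62% pairwise), the likelihood-based rate quoted in the text.
    print(round(divergence_time_myr(10.0, 0.81), 2))
```

With 10 per cent corrected divergence and the likelihood-based rate, this gives a little over 6 Myr, matching the figure quoted in the text.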

Our conservative estimate of at least 16 additional Diplodactylus species increases overall diversity estimates for the Australian radiation of the family Diplodactylidae from 54 to 70 species (an increase of approx. 30%). Other studies suggest that similar high levels of cryptic diversity exist within other genera of Australian Diplodactylidae (e.g. Lucasium: Pepper et al. 2006). Clearly, diversity within this family has been significantly underestimated. This level of undescribed diversity has serious implications for all evolutionary, ecological and conservation studies on the group. For instance, a study of variable temperature tolerance in D. vittatus (Bustard 1968) is likely to have been confounded by including other species, such as D. ‘eastern inland’. Likewise, many ‘species’ formerly regarded as widespread actually comprise multiple species, each with a much more restricted range and potentially more vulnerable (e.g. Doughty et al. 2008).
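The figures quoted above can be checked with a few lines of arithmetic. The symbols below are introduced here purely for illustration and do not appear in the original analysis.

```latex
\begin{align*}
N_{\text{Diplodactylus}} &= 13 + 11 + 5 = 29 \\
\Delta N &= 29 - 13 = 16 \\
N_{\text{Diplodactylidae}} &= 54 + 16 = 70 \\
\frac{\Delta N}{54} &= \frac{16}{54} \approx 0.296
\end{align*}
```

That is, 13 described, 11 new and 5 candidate species give 29 in the genus, and the 16 additions raise the family total by roughly 30 per cent.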

(b) Phylogenetic and geographical patterning in unrecognized diversity

Our genetic data are typical of many recent studies in demonstrating how morphological data can be ambiguous and even misleading. Previously unrecognized Diplodactylus species are concentrated in three widespread, morphologically similar groups previously thought to contain a total of only four species: a distinctive clade currently referred to as D. conspicillatus (comprising nine species, one recognized) and two paraphyletic groups, ‘D. tessellatus’ (three species, one recognized) and the ‘D. vittatus complex’ (sensu Oliver et al. 2007b; seven species, two recognized).

The D. conspicillatus clade is perhaps the most distinctive and morphologically divergent Diplodactylus (Kluge 1967). Likewise, D. tessellatus is easily recognized and has at least two synapomorphies that distinguish it from its sister lineages in the D. vittatus group (Kluge 1967). The apparent paraphyly of this species complex (figure 1b) is unexpected and the precise relationships of the component species deserve further research. However, it is clear that D. tessellatus is polytypic as currently recognized. In both D. conspicillatus and D. tessellatus, the distinctiveness of each species complex may have overshadowed more subtle but highly significant internal variation, causing workers to uncritically assume the existence of single species.

Conversely, the D. vittatus group is not diagnosed by any unique characters and its paraphyly has been previously demonstrated (Oliver et al. 2007a,b). Within this group, what is perhaps most striking is that species which are readily diagnosed on morphology (and have been recognized for decades) are interspersed among numerous unrecognized allopatric, morphologically similar, but genetically highly distinctive taxa. The taxonomic resolution of this probable basal grade has likely been hindered by the relative lack of morphological differentiation.

There are few clear patterns in the geographical distribution of unrecognized species diversity. Diplodactylus is absent from, or poorly represented in, far southeastern and eastern Australia. Within the remaining range of the genus, unrecognized species were found across all habitats and ecological zones: both the northernmost and two southernmost species are undescribed. There is also no clear patterning in the distribution of lineages with respect to proximity to centres of human habitation, as exemplified by four unrecognized species within 500 km of the state capital of Adelaide. However, there were fewer unrecognized taxa around Perth (Western Australia) and Sydney (New South Wales); both areas have been historic centres of collection for specimens for taxonomic research.

(c) The importance of integrated approaches

While taxonomy has traditionally relied heavily on phenotypic traits, other data are often required in morphologically conservative groups to infer species boundaries. In our study, while there are subtle morphological differences concordant with species boundaries (Doughty et al. 2008; P. Oliver 2008, personal observation), these differences are often masked by other morphological variation (see above). Without the genetic and karyotypic data, it would be impossible to determine which morphological variation is taxonomically significant. It is widely recognized that good taxonomic practice attempts to integrate data from multiple datasets when inferring species boundaries (Sites & Marshall 2004; De Queiroz 2007). We can only underscore this sentiment; there is no short cut to good taxonomy.

In recent years, there has also been controversy around the potential to use mtDNA ‘barcodes’ to accelerate taxonomic discovery. mtDNA barcoding seems suited for both identifying deeply divergent (matrilineal) lineages and rapidly ascribing individuals to species after boundaries have been delimited based on integration of all relevant sources of information. However, using some minimum or average measure of mtDNA divergence to define species per se is highly problematic (e.g. Lee 2004; Moritz & Cicero 2004). Here, while mtDNA divergence coarsely correlates with species-level divergences, there does not appear to be any clear cut-off at which mtDNA alone indicates that species recognition is warranted. The lowest interspecific divergence we recovered (0.075 corrected) is below intraspecific divergence levels found in no fewer than 10 putative species (table 1). Admittedly, some of the latter may prove to be species complexes, but we will only know once additional datasets have been gathered. Only through appraisal of additional datasets can one infer the taxonomic significance of mitochondrial divergences. This proviso is particularly pertinent to poorly known groups for which barcoding is meant to be most useful; these are the groups with the least amount of available background information to aid interpretation of mtDNA divergences.
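The absence of a clear cut-off can be checked mechanically: a single threshold works only if every intraspecific maximum falls below the smallest interspecific distance. The sketch below shows that comparison. The 0.075 threshold comes from the text, but the taxon labels and intraspecific values are hypothetical placeholders, not the values in table 1.

```python
from typing import Dict, List


def taxa_exceeding_threshold(intra_max: Dict[str, float],
                             threshold: float) -> List[str]:
    """Taxa whose maximum within-taxon distance exceeds the threshold."""
    return sorted(name for name, dist in intra_max.items() if dist > threshold)


def has_barcode_gap(intra_max: Dict[str, float],
                    min_interspecific: float) -> bool:
    """True only if no within-taxon distance reaches the smallest
    between-species distance, i.e. one cut-off separates the two."""
    return max(intra_max.values()) < min_interspecific


if __name__ == "__main__":
    # Hypothetical placeholder values, for illustration only.
    intra = {"taxon_1": 0.02, "taxon_2": 0.11, "taxon_3": 0.09}
    min_inter = 0.075  # lowest corrected interspecific divergence (from text)
    print(taxa_exceeding_threshold(intra, min_inter))
    print(has_barcode_gap(intra, min_inter))
```

When the second function returns `False`, any fixed threshold will either split some putative species or lump some recognized ones, which is the situation described above.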

(d) The scale of the taxonomic impediment

If integrated taxonomic data can more than double estimates of species diversity in a highly studied group (vertebrates) and in a developed country (Australia), this raises grave questions about the scale of the taxonomic impediment. Geckos have a tendency to show high levels of cryptic species diversity (e.g. Pepper et al. 2006; Oliver et al. 2007b; Couper et al. 2008; and supplementary references). However, comprehensive multi-marker taxonomic assessments have been undertaken for relatively few geckos from Australia, and even fewer from developing countries (which contain the bulk of gekkonid diversity). Thus, most of the 1200+ nominal gecko species have not been rigorously assessed. If the levels of cryptic diversity found in recent studies are representative, this suggests that (at a minimum) several hundred gecko species remain unrecognized. Recent molecular studies of frogs from Latin America (Fouquet et al. 2007) and Southeast Asia (Stuart et al. 2006) also show that cryptic species complexes are common in other vertebrate groups. Clearly, worldwide diversity in many vertebrate groups remains seriously underestimated.

In the Australian context, our results emphasize the need for continuing detailed assessments of vertebrate diversity. It is becoming increasingly common for Australian wildlife researchers and monitoring authorities to discontinue the collection of vertebrate voucher specimens in the belief that such groups are well known, so that many opportunities to discover cryptic species are lost. All the ‘new’ species identified in this study were already represented in collections (although additional specimens for many were collected here). Many are also morphologically diagnosable in the light of the genetic results. However, without integrated data from multiple datasets, specifically including genetic samples with corresponding well-preserved voucher specimens, it is difficult to interpret conflicting and/or subtle patterns of morphological variation. A large number of widespread Australian vertebrate groups (particularly among fishes, small mammals and reptiles) have not been comprehensively assessed for unrecognized diversity, but show similar patterns of probable lineage diversity in the face of morphological conservatism. This suggests that species diversity may still be significantly underestimated in many groups that have a long history of study and are considered relatively well known.

4. Conclusions

While the challenges facing invertebrate taxonomy are profound, vertebrates and vascular plants remain important proxies for much biological survey, ecological and conservation work. However, although vertebrate taxonomy is generally more resolved than that of invertebrates, in many groups it clearly remains far from complete. Our results are particularly striking given that they come from a comparatively rich industrialized country with a long history of taxonomic research on the relevant group. Many vertebrate species therefore presumably remain to be discovered; elucidation of this unrecognized vertebrate diversity will require integrated data from multiple sources and continued support for basic taxonomic research. While such work entails considerable effort and expense, it is clear that it has the potential to reap rich returns.

Acknowledgments

All specimens included in this study were collected with ethics approval and under permit from the relevant state authorities.

We thank Steve Cooper, Adam Skinner, Andrew Hugall and Kate Sanders for their help with laboratory protocols and data analyses, and Rhonda Hutchinson for providing new karyotypic data. The following people collected specimens used in this study or provided access to specimens, data and tissue in their care: Patrick Couper; Andrew Amey; Matt Vucko; Lin Schwarzkopf; Brad Maryan; Scott Keogh; Mitzy Pepper; Ross Sadlier; Terry Bertozzi; and Stephen Donnellan. This work was funded by the Australian Biological Resources Survey, Australia Pacific Science Foundation and Mark Mitchell Foundation.

Footnotes

    • Received December 16, 2008.
    • Accepted January 30, 2009.

References
