Identifying units of biological diversity is a major goal of organismal biology. A growing literature has focused on the importance of cryptic diversity, defined as the presence of deeply diverged lineages within a single species. While most discoveries of cryptic lineages proceed on a taxon-by-taxon basis, rapid assessments of biodiversity are needed to inform conservation policy and decision-making. Here, we introduce a predictive framework for phylogeography that allows rapid identification of cryptic diversity. Our approach proceeds by collecting environmental, taxonomic and genetic data from codistributed taxa with known phylogeographic histories. We define these taxa as a reference set and categorize them as either harbouring or lacking cryptic diversity. We then build a random forest classifier that allows us to predict which other taxa endemic to the same biome are likely to contain cryptic diversity. We apply this framework to data from two sets of disjunct ecosystems known to harbour taxa with cryptic diversity: the mesic temperate forests of the Pacific Northwest of North America and the arid lands of Southwestern North America. The predictive approach presented here is accurate, with prediction accuracies ranging from 65% to 98.79% depending on the ecosystem. This suggests that our method can be used to address ecosystem-level questions about cryptic diversity. Further, applying the classifier to predict whether species of unknown status harbour cryptic diversity is straightforward and yields results that agree with recent discoveries from those systems. Our results demonstrate that the transition of phylogeography from a descriptive to a predictive discipline is possible and effective.
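The workflow described above (label a reference set of taxa, train a random forest, then predict the status of other taxa) can be illustrated with a minimal, from-scratch sketch. This is not the implementation used in the study, which is not specified here. It is a deliberately small random forest (bootstrap resampling, a random feature subset at each split, majority vote) written with the Python standard library only. The two predictors, their values, the labels and all function names are synthetic and hypothetical, chosen only to make the example run.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Best (feature, threshold) among `features`, or None if no split reduces impurity."""
    best, best_score = None, gini(labels)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, rng, n_try, max_depth=3):
    """Grow one tree; internal nodes are (feature, threshold, left, right) tuples."""
    split = None
    if max_depth > 0 and len(set(labels)) > 1:
        # Random feature subset at each split, as in a random forest.
        split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    f, t = split
    lo = [i for i, r in enumerate(rows) if r[f] <= t]
    hi = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            grow_tree([rows[i] for i in lo], [labels[i] for i in lo], rng, n_try, max_depth - 1),
            grow_tree([rows[i] for i in hi], [labels[i] for i in hi], rng, n_try, max_depth - 1))

def fit_forest(rows, labels, n_trees=25, n_try=1, seed=0):
    """Train each tree on a bootstrap resample of the reference set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.choices(range(len(rows)), k=len(rows))  # sample with replacement
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], rng, n_try))
    return forest

def predict(forest, row):
    """Majority vote over all trees for a single taxon."""
    votes = []
    for node in forest:
        while isinstance(node, tuple):
            f, t, left, right = node
            node = left if row[f] <= t else right
        votes.append(node)
    return Counter(votes).most_common(1)[0][0]

# Synthetic reference set: two hypothetical predictors per taxon (NOT real data).
reference_rows = [(1.0, 10.0), (1.5, 12.0), (2.0, 11.0), (2.5, 14.0), (3.0, 13.0), (3.5, 15.0),
                  (6.0, 30.0), (6.5, 32.0), (7.0, 31.0), (7.5, 34.0), (8.0, 33.0), (8.5, 35.0)]
reference_labels = ["non-cryptic"] * 6 + ["cryptic"] * 6

forest = fit_forest(reference_rows, reference_labels)
unknown_taxa = [(0.5, 5.0), (9.0, 40.0)]  # hypothetical taxa of unknown status
predictions = [predict(forest, row) for row in unknown_taxa]
```

In practice one would estimate accuracy on held-out or out-of-bag taxa before trusting predictions for unstudied species; the toy data here are perfectly separable and say nothing about real-world performance.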
Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3512460.
- Received July 7, 2016.
- Accepted September 27, 2016.
- © 2016 The Author(s)
Published by the Royal Society. All rights reserved.