Rarity is widely used to predict the vulnerability of species to extinction. Species can be rare in markedly different ways, but the relative impacts of these different forms of rarity on extinction risk are poorly known and cannot be determined through observations of species that are not yet extinct. The fossil record provides a valuable archive with which we can directly determine which aspects of rarity lead to the greatest risk. Previous palaeontological analyses confirm that rarity is associated with extinction risk, but the relative contributions of different types of rarity to extinction risk remain unknown because their impacts have never been examined simultaneously. Here, we analyse a global database of fossil marine animals spanning the past 500 million years, examining differential extinction with respect to multiple rarity types within each geological stage. We observe systematic differences in extinction risk over time among marine genera classified according to their rarity. Geographic range played a primary role in determining extinction, and habitat breadth a secondary role, whereas local abundance had little effect. These results suggest that current reductions in geographic range size will lead to pronounced increases in long-term extinction risk even if local populations are relatively large at present.
Rarity is widely used to assess extinction risk for conservation purposes and has been implicated as a risk factor in past extinctions [2–7]. There are many different ways to be rare, but their relative influences on long-term extinction dynamics are poorly known. Species may be rare because they have small geographic ranges, narrow habitat tolerances, small populations or any combination thereof. When each of these measures of rarity is dichotomized, there are seven unique forms of rarity (e.g. small range, narrow habitat, small population; large range, narrow habitat, small population). Studies of contemporary risk reveal links between some aspects of rarity and population decline [9–12], but whether or not these declines will ultimately lead to extinction is, of course, uncertain. Furthermore, investigating the associations between rarity and extinction risk in extant species can easily become circular because some predictor variables, such as geographic range, are used in defining risk [13–15]. The fossil record provides the only opportunity to directly assess the relationships between rarity and extinction risk.
Previous studies have considered the effects of geographic range [3,5,6,16–22], and to a lesser extent, abundance [3,4,23–26] and habitat breadth [7,19,27], on extinction risk in the marine fossil record. However, no study has attempted to evaluate simultaneously the relative effects of rarity in all its different forms on long-term patterns of extinction risk. Because different aspects of rarity often covary [3,28], and are measured in different units, it is impossible to assess their relative importance in extinction dynamics simply by comparing the results of univariate analyses, which make up the majority of the existing literature on extinction selectivity in the fossil record. The few multivariate studies that have been conducted illustrate clearly the impact that covariation can have on inferred causal relationships [3,6,29–31], but do not focus on rarity specifically. Moreover, all of these studies were limited in spatial, temporal and/or taxonomic scope.
In this study, we used the Paleobiology Database (http://paleodb.org) to investigate the associations between different forms of rarity and extinction risk for marine animals through the Phanerozoic (543 Ma–recent). Focusing on the three canonical aspects of rarity (geographic range size, habitat breadth and local abundance) that are the basis for assessing risk today, we quantified rarity and extinction risk within each geological stage for more than 6000 marine invertebrate genera representing more than 70 taxonomic classes and spanning a wide variety of functional groups. The complete dataset contains more than 13 000 stage-level observations of genera with associated ecological and extinction data and was used to examine the associations between rarity and extinction risk in a multivariate analytical framework. These analyses reveal systematic and persistent differences in extinction risk among different forms of rarity over the macroevolutionary history of marine animals.
2. Material and methods
(a) Data download
The data used for our study were downloaded from the Paleobiology Database (http://paleodb.org) on 26 September 2010. The download included all occurrences entered by the Marine Invertebrate Working Group, excluding vertebrates and genera listed in quotation marks or qualified as ‘?’, ‘cf.’ or ‘aff.’. Vertebrates were not included in this study because of their relatively limited abundance and occurrence data. The following data associated with each occurrence were also downloaded: palaeolatitude, palaeolongitude, palaeoenvironment, primary lithology and taxon abundance. All occurrences were filtered to include only genera that were classified to a higher taxon (family, order and/or class) and which were assigned to one of the 74 geologic stages. The resulting dataset consisted of 301 904 occurrences in 42 467 collections for a total of 6491 genera. Overall, 72 classes are represented, with 27 classes having more than 1 per cent mean proportional diversity in the Palaeozoic, Mesozoic and/or Cenozoic Eras (see the electronic supplementary material, table S1).
(b) Measuring rarity and risk
Geographic range was measured as the area occupied by a genus relative to the maximum occupancy possible in an interval to account for temporal variation in the quality of the fossil record [3,21]. The globe was divided into 10 000 cells, each 3.6° longitude × 1.8° latitude in size, and the range size was calculated as the number of cells in which a genus occurred relative to the total number of cells that contained fossils in the interval, using the palaeocoordinates of fossil occurrences. Although cell size varies with latitude, a previous study of the Paleobiology Database conducted at comparable spatial and temporal scales found little difference between results generated using equal area versus 5° × 5° cells because high-latitude fossil occurrences are relatively limited overall, and tend to be concentrated in only a few regions during those intervals in which they occur. Palaeocoordinates were derived from Scotese's palaeomap rotations, provided as part of the standard download protocol from the Paleobiology Database. Over the last 100 million years, palaeocoordinates are well constrained using magnetic sea floor anomaly data. Deeper in time, subduction of oceanic crust introduces greater uncertainty into palaeogeographic reconstructions, particularly palaeolongitude [33,34].
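The cell-occupancy calculation described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the input layout (tuples of genus, palaeolatitude, palaeolongitude) and the demonstration coordinates are all assumptions made for the example.

```python
from collections import defaultdict

CELL_LON = 3.6  # degrees of longitude per cell (from the text)
CELL_LAT = 1.8  # degrees of latitude per cell (from the text)

def cell_index(lat, lon):
    """Map palaeocoordinates to a (row, col) cell on the 100 x 100 grid."""
    row = min(int((lat + 90.0) // CELL_LAT), 99)   # clamp so lat = +90 stays on the grid
    col = min(int((lon + 180.0) // CELL_LON), 99)  # clamp so lon = +180 stays on the grid
    return row, col

def proportional_range(occurrences):
    """occurrences: iterable of (genus, lat, lon) for ONE interval.

    Returns {genus: cells occupied / all fossil-bearing cells in the interval}.
    """
    cells_by_genus = defaultdict(set)
    all_cells = set()
    for genus, lat, lon in occurrences:
        cell = cell_index(lat, lon)
        cells_by_genus[genus].add(cell)
        all_cells.add(cell)
    return {g: len(c) / len(all_cells) for g, c in cells_by_genus.items()}

# Hypothetical occurrences: genus A falls in two cells, genus B in one.
demo = [("A", 10.0, 20.0), ("A", 10.5, 20.5), ("A", -30.0, 100.0), ("B", 45.0, -60.0)]
ranges = proportional_range(demo)
```

Dividing by the number of fossil-bearing cells, rather than by all 10 000 cells, is what makes the measure relative to the maximum occupancy possible in the interval.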
Habitat breadth was measured using three environmental variables that are important determinants of species distributions in the oceans today and which can be readily identified in the marine geologic record: relative water depth (above and/or below storm wave base), substrate (carbonate, siliciclastic and/or mixed) and latitude (tropical and/or extratropical). We defined 12 possible habitats as all combinations of the states of these three variables (2 × 3 × 2). For each genus, we counted the number of habitats it occupied and divided that count by the total number of habitats containing fossil occurrences in the interval, to account for temporal variation in the preservation of environments. Storm wave base reflects the interactions between bathymetry, sedimentology and climate, and is best viewed as a relative, rather than absolute, measure of water depth. We used storm wave base to delimit the broad-scale depth tolerance of genera and refrained from finer subdivisions of palaeo-water depth because of greater uncertainty in defining these depth breaks across a diversity of sedimentary basins.
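As a rough sketch of the habitat-breadth calculation, the 12 habitats can be enumerated as the product of the three variables and a genus scored by the fraction of sampled habitats it occupies. The category labels, function name and input layout below are assumptions chosen for illustration.

```python
from itertools import product

# Category labels are illustrative; the text names the categories only in prose.
DEPTHS = ("above_swb", "below_swb")                 # relative to storm wave base
SUBSTRATES = ("carbonate", "siliciclastic", "mixed")
LATITUDES = ("tropical", "extratropical")
ALL_HABITATS = set(product(DEPTHS, SUBSTRATES, LATITUDES))  # 2 x 3 x 2 = 12

def habitat_breadth(occurrences):
    """occurrences: iterable of (genus, depth, substrate, latitude_zone) for ONE interval.

    Returns {genus: habitats occupied / habitats containing any fossils}.
    """
    by_genus = {}
    sampled = set()
    for genus, depth, substrate, zone in occurrences:
        habitat = (depth, substrate, zone)
        if habitat not in ALL_HABITATS:
            raise ValueError(f"unknown habitat: {habitat}")
        by_genus.setdefault(genus, set()).add(habitat)
        sampled.add(habitat)
    return {g: len(h) / len(sampled) for g, h in by_genus.items()}

# Hypothetical interval in which three of the 12 habitats preserve fossils.
demo = [
    ("A", "above_swb", "carbonate", "tropical"),
    ("A", "below_swb", "carbonate", "tropical"),
    ("A", "above_swb", "mixed", "extratropical"),
    ("B", "above_swb", "carbonate", "tropical"),
]
breadth = habitat_breadth(demo)
```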
Local abundance for each genus was measured as its mean proportional abundance in those collections in which it occurred within an interval; only collections containing 100 or more specimens were used for this calculation. Extinction risk was measured as the observed extinction or survival of a genus in an interval. Each measure of rarity was transformed to satisfy assumptions of normality using an arcsine square root transformation, which is commonly applied to proportional data that include zero values, and was then scaled to zero mean and unit variance so that relative effects on extinction risk could be assessed on a comparable scale.
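The abundance measure and the two-step transformation can be sketched as below. The function names and the input layout (one dictionary of specimen counts per collection) are assumptions; the text does not say whether the population or sample standard deviation was used for scaling, so the population form here is also an assumption.

```python
import math
from statistics import mean, pstdev

def mean_proportional_abundance(collections, genus, min_specimens=100):
    """collections: list of {genus: specimen count} dicts for ONE interval.

    Averages the genus's share of specimens over the collections in which it
    occurs, using only collections with at least `min_specimens` specimens.
    Returns None if no qualifying collection contains the genus.
    """
    shares = []
    for counts in collections:
        total = sum(counts.values())
        if total >= min_specimens and counts.get(genus, 0) > 0:
            shares.append(counts[genus] / total)
    return mean(shares) if shares else None

def arcsine_sqrt(p):
    """Arcsine square root transformation of a proportion in [0, 1]."""
    return math.asin(math.sqrt(p))

def standardize(values):
    """Scale to zero mean and unit variance (population SD; an assumption).

    Fails with ZeroDivisionError if all values are identical.
    """
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

# Hypothetical collections; the second has fewer than 100 specimens and is skipped.
demo = [{"A": 50, "B": 150}, {"A": 10, "B": 30}, {"A": 100, "B": 100}]
abundance_a = mean_proportional_abundance(demo, "A")
scaled = standardize([arcsine_sqrt(p) for p in (0.0, 0.25, 0.5, 1.0)])
```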
These measures of rarity describe the summed geographic range, habitat breadth and local abundance of all congeneric species within each genus. Genus extinction similarly reflects the extinction of all the populations of its congeneric species. To the extent that many genera are monospecific, associations observed between rarity and extinction risk at the genus-level may be informative for understanding species-level dynamics.
(c) Modelling the relationships between rarity and extinction risk
We examined the association between rarity and extinction risk using two multivariate approaches. First, we assigned each genus to one of eight classes following the classification scheme of Rabinowitz. We used the median values of abundance, range size and habitat breadth to delimit rare versus common taxa in each interval. We compared the odds of extinction (odds of extinction = q/(1−q), where q is the probability of extinction) of genera in the seven classes characterized by one or more aspects of rarity (e.g. small range, broad habitat, large population; small range, narrow habitat, large population) to the odds of extinction for genera in the eighth class, which were considered common by all three rarity measures (i.e. large range, broad habitat, large population). Because each rarity measure is continuous, the choice of the median rather than a different quantile to distinguish rare from common genera is arbitrary. Because most genera exhibit values that are less than half of the maximum observed, the median provides a conservative estimate of differential risk. Secondly, we used multiple logistic regression to assess the associations between each rarity measure and extinction risk within a continuous framework. A multiple logistic regression model was fit to each interval containing data for 50 or more genera.
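The first approach, a median split into eight classes followed by a comparison of odds against the all-common class, might look like the sketch below. The dictionary keys, function names and demonstration values are hypothetical, and treating values strictly below the median as rare is an assumption, since the text does not say how ties with the median were handled.

```python
import math
from statistics import median

KEYS = ("range", "habitat", "abundance")

def rarity_class(genus, medians):
    """Return (small_range, narrow_habitat, small_population) flags.

    Assumption: values strictly below the interval median count as rare.
    """
    return tuple(genus[k] < medians[k] for k in KEYS)

def odds(extinct_flags):
    """Odds of extinction q / (1 - q) for a list of 0/1 outcomes."""
    q = sum(extinct_flags) / len(extinct_flags)
    return math.inf if q == 1 else q / (1 - q)

def relative_odds(genera):
    """Odds of extinction in each rare class relative to the all-common class.

    Returns None when the common class is empty or when none or all of its
    genera went extinct, because the ratio is then undefined.
    """
    medians = {k: median(g[k] for g in genera) for k in KEYS}
    groups = {}
    for g in genera:
        groups.setdefault(rarity_class(g, medians), []).append(g["extinct"])
    common = groups.pop((False, False, False), None)
    if not common or sum(common) in (0, len(common)):
        return None
    base = odds(common)
    return {cls: odds(flags) / base for cls, flags in groups.items()}

# Hypothetical interval: four genera rare on every axis (3 extinct) and
# four common on every axis (1 extinct).
rare = [dict(range=0.1, habitat=0.1, abundance=0.1, extinct=e) for e in (1, 1, 1, 0)]
common = [dict(range=0.9, habitat=0.9, abundance=0.9, extinct=e) for e in (1, 0, 0, 0)]
result = relative_odds(rare + common)
```

In the demonstration data the rare class has odds of 3 and the common class odds of 1/3, so the relative odds are about 9.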
There are strengths and limitations to each of these approaches. Analysis of rarity classes can reveal synergistic effects resulting from the interactions between different aspects of rarity but imposes discrete breaks on what are really continuous variables. Relative odds of extinction can also only be compared in intervals in which some common genera go extinct. Using continuous multiple logistic regression addresses these two limitations, allowing us to examine these associations without having to define arbitrary cut-offs between ‘rare’ and ‘common’ genera. However, in most intervals the sample size of genera with associated rarity data is too small to estimate interaction terms as well as additive effects: the median sample size of genera in an interval is 178, with a minimum of 51 and maximum of 800; moreover, stages containing many genera may still yield little statistical power if the extinction rate was either very high or very low and as a result few genera survived or went extinct.
To address the potential confounding effects of incomplete sampling [37–40], the number of occurrences was used as a measure of per-genus sampling probability in the continuous multiple logistic regression model. The number of genus occurrences in each interval was logarithmically transformed prior to analysis.
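For readers who want to see the mechanics of the second approach, the following is a minimal, dependency-free sketch of logistic regression fitted by Newton-Raphson. It is not the software used in the study, and the function names are assumptions. In the study's setting each predictor row would hold the three scaled rarity measures plus the log-transformed number of occurrences; the demonstration uses a single binary predictor only so that the fitted coefficients can be checked by hand.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_logistic(X, y, iterations=25):
    """Fit log-odds(extinction) = b0 + b1*x1 + ... by Newton-Raphson.

    X: list of predictor rows (without intercept); y: list of 0/1 outcomes.
    Returns [b0, b1, ...]. Does not converge if the data are perfectly separable.
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for r, yi in zip(rows, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, r))))
            for i in range(k):
                grad[i] += (yi - p) * r[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * r[i] * r[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

# Hypothetical data: x = 0 (small range), 3 of 4 extinct; x = 1 (large range), 1 of 4.
X = [[0], [0], [0], [0], [1], [1], [1], [1]]
y = [1, 1, 1, 0, 1, 0, 0, 0]
beta = fit_logistic(X, y)
```

Here the intercept is the log odds in the small-range group, log(3), and the slope is the log odds ratio between groups, log(1/9); a negative slope corresponds to lower extinction risk at larger range size.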
3. Results and discussion
Our analysis revealed systematic differences in the odds of extinction among genera classified according to their rarity (figures 1 and 2). Geographic range had the strongest effect on extinction risk, with habitat breadth contributing secondarily. The median odds of extinction among genera with small ranges and narrow habitat breadth were six times greater than for common genera (i.e. those with large populations, large ranges and broad habitat breadth). Because little extinction occurred among common genera in most stages, this finding indicates an approximately sixfold difference in the probability of extinction between these groups. Among genera with small ranges, broad habitat breadth was associated with a 30 per cent reduction in the median odds of extinction, yet these genera were still four times more likely to go extinct than common genera. In contrast, local abundance contributed remarkably little. The associations between rarity types and extinction risk were fairly consistent over geological time; although variation occurred among stages (e.g. weaker extinction selectivity during the Permian–Triassic and Cretaceous–Palaeogene mass extinctions), there is no evidence for systematic changes in the relationships between forms of rarity and extinction risk over the last 500 million years (figure 1).
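The step from a sixfold difference in odds to an approximately sixfold difference in probability holds only when the baseline probability is small. The short sketch below illustrates this; the function names and the baseline probabilities (0.01 and 0.3) are hypothetical values chosen for illustration and are not estimates from the study.

```python
def odds_to_probability(odds):
    """Invert odds = q / (1 - q)."""
    return odds / (1.0 + odds)

def risk_ratio(baseline_q, odds_ratio):
    """Ratio of extinction probabilities implied by an odds ratio,
    given the extinction probability in the reference (common) group."""
    base_odds = baseline_q / (1.0 - baseline_q)
    return odds_to_probability(base_odds * odds_ratio) / baseline_q

low_baseline = risk_ratio(0.01, 6.0)   # about 5.7: close to the odds ratio
high_baseline = risk_ratio(0.3, 6.0)   # 2.4: far below the odds ratio
```

With a rare baseline outcome the two ratios nearly coincide, which is the situation described for common genera in most stages.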
The pattern of correlation between rarity measures and extinction risk is insensitive to analytical approach. A multiple logistic regression model in which the three aspects of rarity were treated as continuous additive predictors of extinction risk (see figure 3 and electronic supplementary material, table S2) yields comparable results to the analysis using discrete rarity classes. In this case, the odds ratio indicates the change in the odds of extinction associated with a one-unit increase in the value of the predictor variable (one standard deviation, because each rarity measure was scaled to unit variance). Geographic range had a strong negative association with extinction, with more broadly distributed genera consistently at lower risk throughout the Phanerozoic. On average, habitat tolerance was also inversely associated with extinction risk, although this relationship was much more variable in the multiple logistic regression analysis. Local abundance showed little association with extinction risk, with genera that occurred at lower abundance at comparable risk to those that occurred at greater abundance.
Despite secular changes in our ability to resolve the geographic distributions of marine organisms in the fossil record (see §2b), this variation in the nature of the fossil record has no identifiable effect on the patterns of extinction selectivity presented here. There is no significant difference in the mean log odds of extinction according to the geographic range size between intervals before or after 100 Ma (t-test, p > 0.05), nor any secular trend in the association between the range size and extinction risk through the Phanerozoic (figure 3).
The associations between rarity and extinction risk revealed in the analysis that pooled data for all animal genera are also present when the analysis is conducted within more restricted taxonomic groupings. Bivalves, gastropods and brachiopods together comprise 72 per cent of the full dataset and each clade had sufficient data to examine the associations between the three continuous rarity measures and extinction risk. The multiple logistic regression model described above was fit separately to the data for each clade using stages in which more than 50 genera in the clade had associated rarity data. Results for each clade (figure 4) were comparable to those generated using the full dataset (figure 3), and there were no significant differences among clades in their distribution of log odds values for each rarity measure (p > 0.05 for each Kolmogorov–Smirnov test before Bonferroni correction for multiple comparisons).
Geographic range is the principal aspect of rarity associated with extinction risk in our multivariate analyses of marine invertebrates as a whole (figures 1–3), as well as multivariate analyses of diverse clades within the overall dataset (bivalves, gastropods and brachiopods). However, abundance data are too limited for most phyla and classes in most intervals to fit the multivariate model. To investigate variability in the association between geographic range and extinction risk among clades, we fit a logistic regression model with range size as the sole predictor for seven animal phyla and for marine protists. The association between range size and extinction risk may bear the indirect contributions of habitat breadth and population size, but should be dominated by the direct effect of geographic range. All eight groups exhibit a consistent, inverse relationship between geographic range size and extinction risk (figure 5). The two groups that contain some pelagic taxa (Protozoa and Mollusca) exhibit positive log odds (i.e. a positive association between geographic range and extinction risk) in a limited number of intervals. These results indicate remarkably consistent patterns of extinction risk with respect to different forms of rarity across both time and higher taxa.
Could the persistent correlation between rarity and extinction risk across geological time and higher taxa result from sampling biases rather than biological processes? Incomplete sampling could generate apparent differences in extinction risk among rare versus common genera by spuriously shortening the durations of rare taxa [29,37–40]. We examined this potential bias in two ways and found our results to be robust to these alternative treatments of the data. First, we added occurrence frequency as a measure of per-interval sampling probability for each genus to our multiple logistic regression model (figure 3) and found little change in the estimated associations between the three aspects of rarity and extinction risk. Parameter estimates were strongly correlated between the two models (abundance: Spearman ρ = 0.95, p < 0.0001; geographic range: Spearman ρ = 0.67, p < 0.0001; habitat breadth: Spearman ρ = 0.95, p < 0.0001) and there was equivocal support for the model that included this additional parameter (electronic supplementary material, figure S1). Furthermore, models that included geographic range or all three rarity measures in addition to occurrence frequency tended to have much greater support than a model of extinction risk based on number of occurrences alone (electronic supplementary material, figure S2). Secondly, we excluded genera observed in only one interval (singletons) from our multiple logistic regression analysis and found that this also had little effect on the estimated associations between rarity and extinction risk. Parameters estimated using the data for all genera were strongly correlated with those estimated using the data excluding singleton genera (abundance: Spearman ρ = 0.86, p < 0.0001; geographic range: Spearman ρ = 0.84, p < 0.0001; habitat breadth: Spearman ρ = 0.90, p < 0.0001). 
These two approaches are conservative; the frequency of singletons and the number of occurrences of genera in an interval result from the interplay between rarity and sampling yet are assumed in these two treatments to be exclusively sampling artefacts. These results are also congruent with several previous studies [3,5,21,23], which have shown that the associations between individual aspects of rarity and extinction risk could not be attributed simply to sampling biases.
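The comparison of per-interval parameter estimates between two model variants relies on the Spearman rank correlation, which is the Pearson correlation of the ranks. A minimal sketch follows; the function names are assumptions, tied values receive their average rank, and the significance test reported in the text is not reproduced here.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation between two sets of parameter estimates."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical log-odds estimates for the same four intervals under two models.
model_a = [-1.2, -0.8, -0.3, -2.0]
model_b = [-1.0, -0.9, -0.1, -1.5]
rho = spearman(model_a, model_b)
```

Because the statistic depends only on rank order, it measures whether intervals with stronger selectivity under one model also show stronger selectivity under the other, regardless of any shift in the absolute size of the estimates.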
More broadly, there is little reason to expect that sampling biases can explain the long-term associations between rarity and extinction risk that we observe here. First, we are not attempting to evaluate absolute changes in a given rarity metric over time; rather, we are asking whether the values of rarity metrics for genera within a given time interval are associated with either extinction or survival at the end of that same interval. If many rare taxa in one interval (t) appeared to go extinct, but in fact survived into the next interval (t + 1), this could enhance the apparent extinction selectivity in interval t. However, these taxa should then be added to the list of unsampled victims for some subsequent time interval (likely t + 1) and so result in an underestimation of extinction selectivity in that interval due to the failure to sample many rare victims. Thus, while variation in sampling intensity could create artificially strong or weak selectivity within any given interval, it could not produce the consistent interval-after-interval selectivity observed in this study unless most of the unsampled survivors persist to the present day, which is unrealistic.
The strong association between geographic range and extinction risk documented here has been observed previously in studies focused solely on geographic range (e.g. [18,21]), and in the few multivariate analyses conducted at primarily finer spatial, temporal and taxonomic scales [3,6,16,29–31]. Among early Cenozoic bivalves in North America, for example, geographic range was the only biological factor consistently associated with species duration after accounting for covariation with local abundance and body size. Similarly, among Cenozoic mollusks in South America and New Zealand, geographic range was strongly and consistently associated with species duration even after factoring out the effects of life habit and other biological characteristics. At the global scale, geographic range was significantly associated with survivorship among bivalve genera during the Cretaceous–Palaeogene mass extinction, and among skeletonized marine invertebrate genera through the Phanerozoic, and in both of these studies the association between geographic range and survivorship remained after accounting for variation in species richness. These previous studies corroborate the strong effect of geographic range on extinction risk, yet the current study is the first to establish the importance of range size relative to all other forms of rarity.
One widely accepted explanation for the pervasive association between geographic range size and extinction risk is that large ranges buffer taxa from biotic and abiotic stresses affecting more limited geographic areas. The critical factor is the size of a taxon's range and only secondarily how that range is distributed across habitats (this study, [18,19]). Despite the importance of geographic range in macroevolution, relatively little is known about the factors that generate variation in geographic range size over deep time. To what extent can the substantial variation in range size observed across taxa be explained by differences in fecundity, dispersal, competition and environmental preference [3,7,30,41]? Have these same attributes, in conjunction with the expansion and contraction of environments over geologic time, given rise to the long-term changes in the range size observed over the histories of individual marine taxa [42–44]? And what role, if any, has extinction selectivity played in shaping long-term changes in the average range size of marine faunas (e.g. [45, fig. 1c])?
The lack of association between local abundance and extinction risk over geologic time is surprising given the extensive literature documenting effects of population size on extinction risk among extant species [11,12,46–48]. Yet our results are in accord with other analyses of the fossil record that have found no association between abundance and extinction risk, or a negative association that was due entirely to covariation between abundance and geographic range size. More broadly, positive, negative [4,50] and non-monotonic relationships have all been reported in palaeontological studies. This degree of heterogeneity contrasts markedly with the consistent negative association between geographic range size and extinction risk, and strongly suggests that taxa observed at low abundance in the fossil record had population sizes that were considerably greater than the minimum size below which the effects of demographic stochasticity become critical [51,52] and/or possessed traits that allowed them to counteract the problems of reproduction and recruitment at low densities. Actualistic studies comparing the abundance of species in contemporary living communities with their associated time-averaged death assemblages may help to identify threshold population sizes below which species are so rare they are unlikely to be preserved and subsequently sampled.
Short- and long-term monitoring data are lacking for most marine species today [54–56]. As a result, assessments of extinction risk for living marine animal species are typically made on the basis of qualitative or semi-quantitative measures of overall abundance and geographic range size. The fossil record shows that variation in geographic range size has the dominant effect on extinction risk over long timescales, and that this association is due to the buffering effects of range size and not the greater habitat tolerance or larger local population size of broadly distributed taxa. This pattern of extinction selectivity through the Phanerozoic marine fossil record fingerprints regional-scale environmental perturbations and not demographic stochasticity as the principal driver of past extinctions, thereby highlighting the utility of the fossil record for identifying the traits that lead to elevated risk.
Although extinction occurs when all the populations of a taxon decline to zero, the primary biological predictor of this process over the last 500 million years has been geographic range size followed by habitat breadth, but not local abundance. These results are robust to varying analytical approach and to alternative treatments of the data that address the potential biasing effects of incomplete sampling. Moreover, this pattern has held consistent throughout the history of marine animal life and within diverse animal phyla and classes. Therefore, it appears unlikely to reflect either the circumstances of any particular time in Earth history or clade-specific ecological or physiological traits. Taken together, these results suggest that contemporary reductions in the range size observed in many groups will be accompanied by a pronounced increase in long-term extinction risk even if population sizes remain relatively large at present.
We thank the researchers who collected these data and the members of the Paleobiology Database who compiled them, M. Elmore, M. Foote, J. A. Meachen, C. R. McClain, J. L. McGuire, and V. L. Roth for discussion, and D. L. Rabosky, P. J. Wagner and two anonymous reviewers for thoughtful comments on the present manuscript. This work was partially funded by NESCent (NSF grant EF-0905606) and DFG grant KI 806/7-1. This article is Paleobiology Database Publication no. 170. Data deposited at Dryad: doi:10.5061/dryad.0mq69.
- Received August 14, 2012.
- Accepted October 2, 2012.
- This journal is © 2012 The Royal Society