Coleoptera (beetles) is the most species-rich metazoan order, with approximately 380 000 species. To understand how they came to be such a diverse group, we compile a database of global fossil beetle occurrences to study their macroevolutionary history. Our database includes 5553 beetle occurrences from 221 fossil localities. Amber and lacustrine deposits preserve most of the beetle diversity and abundance. All four extant suborders are found in the fossil record, with 69% of all beetle families and 63% of extant beetle families preserved. Considerable focus has been placed on beetle diversification overall; however, for much of their evolutionary history it is the clade Polyphaga that is most responsible for their taxonomic richness. Polyphaga had an increase in diversification rate in the Early Cretaceous, but instead of being due to the radiation of the angiosperms, this was probably due to the first occurrences of beetle-bearing amber deposits in the record. Perhaps most significant is that polyphagan beetles had a family-level extinction rate of zero for most of their evolutionary history, including across the Cretaceous–Palaeogene boundary. Therefore, the factors that have inhibited beetle extinction, and not solely the mechanisms that may promote speciation, should be examined as important determinants of their great diversity today.
There are more described species of beetles (Order Coleoptera) than any other animal group, with over 380 000 named species. While many have remarked on the vast numbers of beetles, how they came to be such a species-rich group is still under investigation. Our understanding of the historical pattern of beetle diversification is currently dominated by estimates based on molecular phylogenies of exclusively extant taxa [3–5]. The role of the beetle fossil record in understanding their historical patterns of diversification has been reduced to merely supplying a handful of calibration points for these analyses. This relatively insignificant role of the fossil record may be due, in part, to a general misperception that insects do not have a high potential for preservation, or that their record is biased by exceptional deposits (e.g. ambers) and is otherwise hopelessly incomplete (e.g. [8,9]). While these claims may be true of some insect clades (e.g. lepidopterans; ), beetles, in particular, might be an exception. Their robust exoskeleton, including the hardened elytra (modified forewings), makes the Coleoptera the most readily preserved insect group in the fossil record.
Other than the work of Labandeira & Sepkoski  and Nicholson et al. , which focused on the fossil record of insects overall, there have been no quantitative analyses of the beetle fossil record. Furthermore, the pattern of beetle diversification documented in the fossil record and its correspondence to molecular-phylogenetic estimates remain unclear (but see  for a more ecological approach). We therefore compiled coleopteran fossil occurrence data from the international palaeontological and entomological literature into a new, comprehensive database (available through the DRYAD repository; doi:10.5061/dryad.s8kv6), and used it to quantitatively evaluate the coleopteran fossil record. We then used sample-standardization techniques to infer historical patterns of taxonomic richness and rates of family origination and extinction.
2. Material and methods
(a) Occurrence data
We compiled a database of fossil beetle species occurrences from localities older than the Pliocene (approx. 5 Ma) from the international palaeontological and entomological literature, including works published from the early to middle 1800s  through to early 2014. In addition, we incorporated data from other open access database projects, including the EDNA Fossil Insect Database (http://edna.palass-hosting.org/) and the Catalogue of Fossil Coleoptera (http://www.zin.ru/animalia/coleoptera/eng/paleosys.htm). Relying on early publications describing fossil insects used to be a cause for concern because our understanding of taxonomy has changed so much in the past 150 years. However, a resurgence of modern phylogenetic work on beetles and the discovery of new fossil specimens have brought renewed attention to these early specimens and a greater understanding of how they fit into a modern framework [14–16]. We standardized the nomenclature of all taxa using a classification of extinct beetle taxa above the genus rank . We obtained richness of extant taxa from Slipinski et al. . Note that our database was constructed at the species level, whereas our analyses were conducted at the family level. As a result, where a locality included more than one species of a family, that family was represented in that particular collection by more than one family-level occurrence.
(b) Time scale and temporal resolution
We divided coleopteran evolutionary history into twelve 25 Myr time intervals spanning 0–300 Myr ago. We used intervals of uniform duration rather than non-uniform intervals (e.g. epochs or stages) because intervals with unequal durations can distort estimates of richness and taxonomic rates (; but see [19,20]). We chose 25 Myr intervals as a balance between the time scale over which major evolutionary events occur (e.g. rate shifts) and the amount of data (i.e. sample size) required to infer those events. We experimented with alternative durations (results not shown), but longer intervals tend to obscure rapidly changing biological patterns by integrating them within intervals, whereas shorter intervals leave some intervals with very few occurrences, rendering the sampling quota (see ‘Sample standardization’) insufficient to reveal biological patterns spanning intervals. Also, the median stratigraphic range of beetle families is approximately 78 Myr (see ‘Taxonomic ranges’), which means that most families span at least two of our 25 Myr time intervals.
For analyses that were not sample-standardized, we assigned collections and, by definition, their constituent species occurrences to intervals based on the midpoint of their age uncertainty (i.e. the average of their maximum and minimum possible ages). During each pseudoreplicate analysis of the sample-standardization analyses (see below), we assigned to occurrences the ages of their respective localities, which we drew from a uniform random distribution between their maximum and minimum possible ages, then assigned to intervals accordingly. For example, if a collection age was resolved no finer than the Miocene in the literature, the uncertainty in collection age was taken as the boundary dates of the Miocene (5–23 Myr ago), and we correspondingly used the midpoint age (i.e. 14 Ma) for unstandardized analyses, and a different random date drawn uniformly between 5 and 23 Ma for each pseudoreplicate sample-standardization analysis.
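The two age-assignment rules can be illustrated with a minimal Python sketch. The function names, the interval numbering (0 for the youngest interval) and the treatment of ages that fall exactly on a boundary are assumptions made for illustration; they are not taken from the analysis code used in this study.

```python
import random

INTERVAL_MYR = 25   # interval duration stated in the text
MAX_AGE_MYR = 300   # twelve intervals spanning 0-300 Myr ago


def interval_index(age_ma):
    """Return 0 for 0-25 Ma, 1 for 25-50 Ma, ..., 11 for 275-300 Ma.

    Assumption: an age exactly on a boundary is placed in the older
    interval, except 300 Ma, which stays in the oldest interval.
    """
    if not 0 <= age_ma <= MAX_AGE_MYR:
        raise ValueError("age outside 0-300 Ma")
    return min(int(age_ma // INTERVAL_MYR), MAX_AGE_MYR // INTERVAL_MYR - 1)


def midpoint_age(min_ma, max_ma):
    """Age used for analyses that are not sample-standardized."""
    return (min_ma + max_ma) / 2


def random_age(min_ma, max_ma, rng):
    """Age drawn uniformly within the uncertainty, once per pseudoreplicate."""
    return rng.uniform(min_ma, max_ma)


# The Miocene example from the text: uncertainty of 5-23 Myr ago.
rng = random.Random(0)
miocene_mid = midpoint_age(5, 23)
miocene_draw = random_age(5, 23, rng)
print(miocene_mid, interval_index(miocene_mid))
print(interval_index(miocene_draw))
```

Under these assumptions, every possible Miocene age falls in the youngest interval, so the random draw changes the interval only for collections whose age uncertainty straddles an interval boundary.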
(c) Taxonomic ranges
We determined stratigraphic ranges of families as the maximum and minimum ages of constituent species occurrences. The stratigraphic ranges of beetle families vary from groups with 0 Myr ranges (i.e. singletons) to the Cupedidae, which spans approximately 267 Myr (not including the Pull-of-the-Recent). The overwhelming majority of families that appear in the fossil record are extant. It is possible, therefore, analytically to treat the Recent as a stratigraphic interval, and include occurrences of these extant taxa. Such inclusion extends the observed range of taxa that were not sampled in the youngest fossil interval (or more). Methods that use entire ranges as data (e.g. range-through-richness (RT) and per capita rates; see below) can be biased by the inclusion of Recent occurrences (e.g. ), as the ranges of some (i.e. extant taxa), but not all taxa are extended beyond their last appearance in the fossil record. However, most such methods typically interpret the observed stratigraphic range (i.e. the dates of first- and last-appearance in the fossil record) as the taxonomic duration (i.e. the dates of true taxonomic origination and extinction; but see [22,23], which do not assume this equality). Therefore, the accuracy of any estimates of richness or rates using these methods is dependent on the accuracy of the taxonomic durations used as data. In this study, we show estimates of richness and rates both including and excluding Recent occurrences (see the electronic supplementary material, figure S1), but suggest that those using the Recent occurrences most accurately reflect the taxonomic evolution of beetles. Although we acknowledge the potential bias the Pull-of-the-Recent can induce for the 35 extinct families in our database, we suggest this bias is minor in comparison with the underestimation of durations of the 113 extant families due to low sampling in our most recent time interval.
(d) Sample standardization
To mitigate potential bias owing to temporal variation in sampling, we separately used two sample-standardization strategies. First, we performed rarefaction of occurrences, with 1000 pseudoreplicate analyses, in which we subsampled occurrences from each interval to a quota. We determined this quota value dynamically in each pseudoreplicate analysis as the greatest observed family richness in any single interval. Because dates of localities were assigned randomly in each replicate analysis (see above), this maximum richness in an interval varied from replicate to replicate, and ranged between 86 and 96 families (median = 90). We chose the maximum richness in an interval as the quota, as this theoretically permitted the face-value patterns to be recovered after subsampling; any lower quota necessarily would analytically distort the patterns of taxonomic richness .
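A single rarefaction pseudoreplicate of this kind can be sketched in Python as follows. This is an illustrative sketch only: the function names and the toy occurrences are hypothetical, and the handling of intervals with fewer occurrences than the quota (all of their occurrences are kept) is an assumption that the text does not specify.

```python
import random
from collections import defaultdict


def max_family_richness(occurrences):
    """Quota rule from the text: greatest family richness in any interval.

    occurrences: list of (family, interval) tuples, one per occurrence.
    """
    families = defaultdict(set)
    for family, interval in occurrences:
        families[interval].add(family)
    return max(len(fams) for fams in families.values())


def rarefy(occurrences, quota, rng):
    """Draw up to `quota` occurrences per interval without replacement.

    Returns {interval: set of families present in the subsample}.
    Assumption: intervals with fewer occurrences than the quota keep all.
    """
    pool = defaultdict(list)
    for family, interval in occurrences:
        pool[interval].append(family)
    return {interval: set(rng.sample(fams, min(quota, len(fams))))
            for interval, fams in pool.items()}


# Hypothetical toy data: three occurrences in interval 0, four in interval 1.
toy = [("Cupedidae", 0), ("Cupedidae", 0), ("Carabidae", 0),
       ("Carabidae", 1), ("Staphylinidae", 1),
       ("Curculionidae", 1), ("Curculionidae", 1)]
rng = random.Random(1)
quota = max_family_richness(toy)
sampled = rarefy(toy, quota, rng)
print(quota, {interval: len(fams) for interval, fams in sampled.items()})
```

In a full analysis this step would be repeated for each pseudoreplicate after redrawing the locality ages, with stratigraphic ranges then rebuilt from the subsampled occurrences.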
We also applied shareholder quorum subsampling (SQS; electronic supplementary material, figure S3; ), which has some clear theoretical advantages over simple rarefaction. However, we are cautious about the application of SQS to our dataset for theoretical (with respect to our dataset) and empirical reasons. Specifically, in SQS, collections within intervals are drawn until a particular ‘coverage’ level is reached. A taxon's contribution to the coverage quota is the proportion of occurrences in the interval that belong to it. Essentially, this proportion is a measure of the taxon's ‘commonness’ among localities in the interval. However, the vast majority of published records of insect species are original descriptions, and subsequent finds of the same species rarely find their way into print. This is often true also of the extant beetle literature, where new species are described, but subsequent finds are seldom documented. Therefore, a beetle species that is a relatively common fossil and one that is extraordinarily rare would both probably appear as single occurrences, thus diminishing the correspondence between the true ‘commonness’ of species and their representation as published occurrences. The use of SQS under these conditions might serve to distort evolutionary patterns, rather than reveal them. In practice, the macroevolutionary patterns found using SQS do not differ qualitatively from those using rarefaction by occurrences (results not shown), although confidence intervals on estimates of richness and rates are greater when using SQS.
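The concern about single occurrences can be made concrete with a deliberately simplified Python sketch of the coverage idea. It omits the refinements of the published SQS method and draws individual occurrences rather than collections; the function names and toy data are hypothetical.

```python
import random
from collections import Counter


def taxon_shares(taxa):
    """Proportion of an interval's occurrences that belong to each taxon."""
    counts = Counter(taxa)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def quorum_richness(taxa, quorum, rng):
    """Draw occurrences in random order until the summed shares of the
    distinct taxa encountered reach `quorum`; return the number of taxa."""
    shares = taxon_shares(taxa)
    order = list(taxa)
    rng.shuffle(order)
    seen = set()
    coverage = 0.0
    for taxon in order:
        if taxon not in seen:
            seen.add(taxon)
            coverage += shares[taxon]
            if coverage >= quorum:
                break
    return len(seen)


rng = random.Random(2)
# Uneven record: commonness differs among taxa, so shares are informative.
uneven = ["A"] * 6 + ["B"] * 3 + ["C"]
# Record of single occurrences only: every taxon has the same share.
singletons = ["A", "B", "C", "D"]
print(taxon_shares(uneven), quorum_richness(uneven, 0.6, rng))
print(taxon_shares(singletons), quorum_richness(singletons, 0.5, rng))
```

When every taxon is represented by one published occurrence, all shares are equal and the quorum simply fixes the number of taxa drawn, which is the loss of information about commonness described above.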
(e) Taxonomic richness and rates
We estimated patterns of taxonomic richness using RT, in which we assume taxa were present in intervals between their first and last appearance, whether or not they were actually preserved in those intervals. Patterns of taxonomic richness using alternative metrics are shown in the electronic supplementary material, figures S1–S3. We calculated per capita rates of family origination and extinction  using the entire stratigraphic ranges determined using the subsampled occurrences (i.e. between intervals of first and last subsampled occurrence). These rate estimates have a number of favourable properties including a relatively low sensitivity to interval-to-interval variation in sampling and ‘edge-effects’ . We calculated net-diversification rates as the difference between per capita origination and extinction rates.
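For readers unfamiliar with these metrics, the following Python sketch shows one way to compute them. It assumes the widely used boundary-crosser formulation of per capita rates, in which the origination rate is -ln(N_bt/N_t)/Δt and the extinction rate is -ln(N_bt/N_b)/Δt, where N_bt counts taxa crossing both interval boundaries, N_t those crossing the top (younger) boundary and N_b those crossing the bottom (older) boundary. The function names and toy ranges are hypothetical, and intervals are indexed forward in time here (0 is the oldest).

```python
import math


def range_through_richness(ranges, n_intervals):
    """ranges: {family: (first, last)} interval indices with first <= last.

    A family is counted in every interval from first to last inclusive,
    whether or not it was actually sampled there.
    """
    richness = [0] * n_intervals
    for first, last in ranges.values():
        for i in range(first, last + 1):
            richness[i] += 1
    return richness


def per_capita_rates(ranges, interval, duration_myr):
    """Return (origination, extinction, net diversification) per Myr,
    or None when no family crosses both boundaries of the interval."""
    spans = list(ranges.values())
    n_bt = sum(1 for f, l in spans if f < interval < l)   # cross both
    n_b = sum(1 for f, l in spans if f < interval <= l)   # cross bottom
    n_t = sum(1 for f, l in spans if f <= interval < l)   # cross top
    if n_bt == 0:
        return None
    origination = -math.log(n_bt / n_t) / duration_myr
    extinction = -math.log(n_bt / n_b) / duration_myr
    return origination, extinction, origination - extinction


# Hypothetical ranges for five families over four 25 Myr intervals.
toy_ranges = {"A": (0, 3), "B": (0, 1), "C": (1, 3), "D": (1, 1), "E": (2, 3)}
rt = range_through_richness(toy_ranges, 4)
rates = per_capita_rates(toy_ranges, 1, 25)
print(rt, rates)
```

Families confined to a single interval, such as D in the toy data, contribute to range-through richness but not to either rate, which is one reason these rates are relatively insensitive to interval-to-interval variation in sampling.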
3. Results and discussion
Our database includes 5553 beetle species occurrences from 221 fossil localities. Overall sampling of beetle families varies through time, with peaks in the number of collections and specimen occurrences found in the Late Jurassic through to the Early Cretaceous, and the mid-Cenozoic (figure 1). The intervening middle through to the Late Cretaceous is a period of particularly low sampling, both in terms of number of collections and number of occurrences (figure 1). A similar pattern has been documented in the fossil record of continental vertebrates [26,27] and there is still debate as to whether this is attributable to larger global processes, such as marine transgressive and regressive cycles [28,29]. Our occurrence database permits an additional quantitative description of the variation in sampling over time, obtained as the proportion of families that range through an interval (i.e. have a first appearance before, and last appearance after) that are actually preserved in (i.e. have an occurrence within) that interval (dashed line in figure 2a). This estimate of the probability of preservation within an interval generally follows the same pattern as the number of occurrences. Specifically, preservation probability peaks around 0.7 in the Early Cretaceous and Eocene, and is around 0.2 at its lowest point in the early Late Cretaceous.
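The preservation-probability estimate described above can be expressed as a short Python sketch. The function name and toy data are hypothetical; the calculation follows the definition in the text, counting only families whose first appearance precedes and whose last appearance follows the interval.

```python
def preservation_probability(occupied, interval):
    """occupied: {family: set of interval indices containing an occurrence}.

    Returns the proportion of families ranging through `interval` that
    also have an occurrence within it, or None if no family ranges through.
    """
    ranging = [ivals for ivals in occupied.values()
               if min(ivals) < interval < max(ivals)]
    if not ranging:
        return None
    preserved = sum(1 for ivals in ranging if interval in ivals)
    return preserved / len(ranging)


# Hypothetical toy data: families A, B and D range through interval 1,
# but only A has an occurrence there; C first appears in interval 1.
toy_occupied = {"A": {0, 1, 3}, "B": {0, 3}, "C": {1, 2}, "D": {0, 2}}
prob = preservation_probability(toy_occupied, 1)
print(prob)
```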
Beetles are preserved in a number of depositional settings including amber/resins, fluvial, lacustrine, lagoonal, marine, swamp and pond environments. Most significant in terms of both diversity and abundance are the amber and lacustrine deposits, as has been noted also for other fossil insect groups (figure 2b; [31,32]). The non-amber record, comprising mainly lacustrine and marginal marine settings, preserves substantially more beetle occurrences (approx. 85%), and amber deposits first appear in the Early Cretaceous, failing to capture the first 165 million years of coleopteran evolution. Insect specimens in amber tend to be preserved in three dimensions, and although amber often has superior preservation quality of individual specimens, these deposits do not disproportionately influence the overall macroevolutionary patterns in the record. In fact, of all beetle families preserved in the fossil record, only 87 (approx. 41%) are found in amber compared with the 114 (approx. 54%) that are found in lacustrine deposits. Moreover, only four families (approx. 2%) are known exclusively from amber deposits. In other words, most families found in amber deposits are also found in non-amber deposits (see figure 2c). Furthermore, in no interval are amber deposits more likely, on average, to preserve a family of beetles than non-amber deposits (figure 2a). Additionally, within a particular interval, families preserved in non-amber settings tend to have occurrences in multiple localities, whereas families preserved in amber typically occur in only one amber locality (figure 2d). One factor contributing to this pattern is that there are simply many more lacustrine localities than amber localities in any particular interval, so there may be a greater opportunity for capturing representatives of the same family from several lake deposits relative to one or a few amber deposits. That is not to say that ambers are an insignificant component of the coleopteran fossil record.
Indeed, 38 beetle families (approx. 18%) have their first occurrences as inclusions within amber deposits. While biases in the record do exist, the ability to capture such a high percentage of extant beetle groups in the fossil record suggests that future studies that use phylogenetic approaches to study beetle biodiversity would indeed benefit from the inclusion of fossil occurrences.
(b) Taxonomic richness
Coleoptera first appear in the Permian [33,34], although there is some discussion as to whether basal members of the group may have first appeared even earlier, in the Carboniferous [34–36]. There is a steady accumulation of families throughout their evolutionary history (figure 3a), and this result is consistent with the nearly monotonic increase in family richness observed for all insects . All extant suborders (Archostemata, Myxophaga, Adephaga and Polyphaga) and extinct stem groups are preserved in the fossil record. Currently, there are 214 recognized families of beetles (excluding ichnofamilies), with 179 being extant and 35 entirely extinct [17,37]. Of these, 148 (approx. 69%) are preserved in the fossil record (see the electronic supplementary material, table S1 for family occurrence data), including 113 (approx. 63%) of the extant families. Nearly all the families absent from the fossil record (e.g. Aspidytidae, Cneoglossidae and Meruidae) are groups that have relatively few species (less than 300 species), many of which are endemic to more geographically limited regions and/or inhabit environments that have a low potential for preserving fossil insects [32,38–41]. Others are small, fungivorous and found in association with the bark of trees or in leaf litter and these beetles (e.g. Hobartiidae, Phloiophilidae and Sphaeritidae) are unlikely to be preserved unless they were to come in contact with sap and eventually become preserved as inclusions within amber [42,43].
It is useful to compare the macroevolution of the suborder Polyphaga to that of the other coleopteran suborders and stem groups, because Polyphaga differ from the others in two important ways. First, Polyphaga is substantially more species-rich today than the others, including approximately 90% of described extant species (figure 3b; ). In fact, while considerable research has focused on the reasons for exceptional beetle diversification (e.g. [3,45]), it may be more appropriate to focus on why the suborder Polyphaga is so much more diverse than the other suborders. Second, this disproportionate richness has been attributed to the degree of dietary variation within suborders. Of the extant non-polyphagan suborders, Archostemata are primarily wood boring, the Myxophaga are primarily algae feeding and Adephaga are primarily predaceous, although there are notable exceptions. All three of these suborders have several morphological characteristics attributable to their specialization on these habits [46,47]. By contrast, the Polyphaga, which includes species-rich groups like the Staphylinidae and Scarabaeidae, includes a full range of dietary preferences (e.g. algae and fluid feeders, xylophages, folivores, carnivores). Despite this dietary breadth, previous research suggests instead that it was dietary specialization of subgroups (e.g. Phytophaga) that promoted the diversification of the Polyphaga (and members within this group) [3,5,45].
We partitioned our analysis to examine the accumulation of these families, grouped by whether they are members of the suborder Polyphaga or one of the non-polyphagan suborders or stem groups. The non-polyphagan groups were the first to appear in the fossil record, and reach their peak in sample-standardized family richness in the early part of the Triassic (figure 3c). Polyphagans first appear in the same time interval, but do not surpass the family richness of the non-polyphagans until the Jurassic. After this time, the rapid monotonic diversification of the Polyphaga strongly contrasts with the slow decline of the non-polyphagan groups, which is characterized by both relatively low origination rates and slightly higher extinction rates (see below). This decline culminates in the eventual loss of the remaining stem clades, five adephagan families and four archostematan families.
(c) Rates of diversification
The overall net diversification rate for all beetle groups combined is highest at the beginning of the group's history (figure 4a). After this initial period of high rates, diversification rates remain relatively low, but positive, with small peaks in the Early Jurassic and Early Cretaceous (figure 4a). To understand diversification, we once again separate the Polyphaga from the non-Polyphaga. Net diversification rate is relatively low, but positive throughout the history of the Polyphaga, and low to negative for the non-polyphagan families (figure 4b). Non-polyphagans had their highest rates of origination in the Permian, represented primarily by extinct stem taxa. This was followed by declining rates through the Triassic and then consistently low levels of origination throughout their subsequent history (figure 4c). Polyphagan beetles have their highest levels of origination in the Early Jurassic and a slight increase in the middle Cretaceous, with origination rates being fairly low throughout the rest of their history (figure 4c). This pattern of origination corroborates the findings of Hunt et al. , who used molecular-phylogenetic techniques to show most coleopteran lineages above the family level diverged in the Triassic and Jurassic. This relatively early peak in diversification for the clade further demonstrates that these lineages were established early and are especially long-lived.
Polyphagan beetles show a modest increase in family-level origination rate during the middle Cretaceous (figure 4c). Interestingly, this increased rate of origination coincides with the timing of the angiosperm radiation . While it might be tempting to attribute this to coevolutionary diversification between beetles and flowering plants, this apparent pulse of origination is better explained by the first occurrence of beetle-bearing amber deposits in the fossil record. In fact, the middle Cretaceous peak in family-level origination rates disappears if amber deposits are excluded from the analysis (see the electronic supplementary material, figure S4). Early amber deposits include the Cretaceous Albian amber , Burmese amber , Lebanese amber  and Yantardakh/Taimyr amber . Families that have their first occurrences in these ambers include groups that are often associated with wood or with leaf litter and tend to have smaller body sizes, as has been documented previously in the insect taphonomy literature [31,42,43]. However, even this potential taphonomic bias cannot be attributed to the diversification of angiosperms, as all ambers from this time interval were probably produced by Araucariaceae and not by members of the emerging angiosperms .
Non-polyphagan beetles have their highest rates of extinction in the Permian through to the Early Triassic (figure 4d), mostly owing to extinction of members of the stem Coleoptera. This is followed by consistently low levels of extinction throughout the remainder of their history. Similar changes in rates of extinction have also been demonstrated in some other insect clades, with ancient groups having higher extinction rates earlier in their history, followed by lower extinction rates and low richness today . By contrast, very few polyphagan families have gone extinct, so the family-level extinction rate of Polyphaga is effectively zero for most of their history (figure 4d). When compared with family-level extinction rates for other fossil animals [11,26,54], the Polyphaga have some of the lowest levels of extinction.
No beetle groups show elevated levels of extinction at the Cretaceous–Palaeogene (KPg) boundary. Labandeira et al.  hypothesized that herbivorous insects should have experienced high levels of extinction at the KPg boundary based on their strong associations with host plants, which underwent high levels of extinction at this time. Others have found little disturbance at the KPg boundary and suggest that insect extinction may be more of a regional than global phenomenon [56,57]. Our results for the Coleoptera corroborate this latter conclusion, along with other studies of fossil insects [10,36,58]. However, it is possible that the KPg extinction may have had different impacts at the family level, as analysed here, and at lower taxonomic ranks, also suggested by Labandeira et al. . Currently, sampling of the beetle fossil record is not sufficiently dense at taxonomic ranks below the family, so testing this potential inequality awaits future sampling of the beetle fossil record, or alternative approaches.
While these analyses were performed at the family level, low extinction rates are likely to be a factor that contributed to the great species richness of polyphagan beetles today (cf. ). Therefore, instead of focusing on what promotes the speciation and radiation of Coleoptera, it might be more appropriate to focus on why beetles, polyphagans in particular, are less susceptible to extinction. Work focused on the Quaternary and Late Neogene has demonstrated that beetles are quite resilient to extinction, owing to their ability to change their geographical distributions in response to climate change . Ecology, morphology and geographical range size [54,60,61] are thought to contribute to extinction risk, and many of these traits may be phylogenetically constrained [62,63]. Focusing on extinction and on the life-history characteristics that make some beetles more likely to go extinct than others may provide a greater understanding of why polyphagan beetles are so diverse. Alternatively, it may be that origination rates at the species level are sufficiently elevated to keep a family from going extinct. It is possible that diversification dynamics at the level of family and species are quite different, but it currently is not possible to test this using the fossil record, as the species-level data required are not readily available. In fact, most of the insect fossil record that has been collected remains in museum cabinets and undescribed in the literature. Thus, it will become possible to further compare macroevolutionary dynamics below the level of family as more effort is made to describe these taxa and provide access to these abundant collections.
Occurrence and collection databases are deposited in DRYAD (doi:10.5061/dryad.s8kv6).
This project was supported by the National Evolutionary Synthesis Center (NESCent), NSF no. EF-0905606 (sabbatical fellowship to D.M.S. and visiting scholar funding to J.D.M.).
The authors thank Suzanne Larsen and the library staff at the University of Colorado for their assistance in accessing the vast fossil Coleoptera literature, especially rare publications that were only available from limited resources or via microfiche. We thank Drs L. Qi-bin, A. Rasnitsyn, E. Jarzembowski, V. Blagoderov and B. Archibald for assistance in interpreting the temporal and depositional settings of several localities in the database. In addition, A. Cook-Hatfield, J. Daniels, A. P. Moe-Hoffman, J. Hodgkins and E. Leckey assisted with data entry. We thank the National Evolutionary Synthesis Center (NESCent) staff and associated scholars for support and fruitful discussion of this work. Dr K. Cranston provided advice on how to facilitate broader accessibility of the database. We greatly appreciate Drs C. Caruso, P. Donoghue, P. Harnik, T. Karim, C. Labandeira, H. Mahereli, C. Nufio, R. Plotnick, R. Smith and our anonymous reviewers for their insightful suggestions regarding this work.
- Received January 12, 2015.
- Accepted February 20, 2015.
- © 2015 The Author(s) Published by the Royal Society. All rights reserved.