Royal Society Publishing

Inner ear anatomy is a proxy for deducing auditory capability and behaviour in reptiles and birds

Stig A Walsh, Paul M Barrett, Angela C Milner, Geoffrey Manley, Lawrence M Witmer

Abstract

Inferences of hearing capabilities and audition-related behaviours in extinct reptiles and birds have previously been based on comparing cochlear duct dimensions with those of living species. However, the relationship between inner-ear bony anatomy and hearing ability or vocalization has never been tested rigorously in extant or fossil taxa. Here, micro-computed tomographic analysis is used to investigate whether simple endosseous cochlear duct (ECD) measurements can be fitted to models of hearing sensitivity, vocalization, sociality and environmental preference in 59 extant reptile and bird species, selected based on their vocalization ability. Length, rostrocaudal/mediolateral width and volume measurements were taken from ECD virtual endocasts and scaled to basicranial length. Multiple regression of these data with measures of hearing sensitivity, vocal complexity, sociality and environmental preference recovered positive correlations between ECD length and hearing range/mean frequency, vocal complexity, the behavioural traits of pair bonding and living in large aggregations, and a negative correlation between ECD length/rostrocaudal width and aquatic environments. No other dimensions correlated with these variables. Our results suggest that ECD length can be used to predict mean hearing frequency and range in fossil taxa, and that this measure may also predict vocal complexity and large group sociality given comprehensive datasets.

1. Introduction

Sensory adaptations in living species are relatively easy to observe and measure, but evidence of sensory capabilities in extinct taxa is seldom preserved in fossil material. Structures such as the regions of the brain cavity associated with sensory function and the osseous conduits for nerves can provide some information, but are often inaccessible in fossils owing to the presence of surrounding skull bones or the presence of lithified matrix. The rarity of these observational data is a major impediment to understanding the evolutionary pathways by which extant taxa achieved their sensory and behavioural abilities, as well as undermining a fuller understanding of the palaeobiology of extinct taxa.

Fortunately, technical advances in radiographic imaging, coupled with new neurological studies of extant taxa, have opened up new opportunities. The endocranium of mammals, birds and reptiles records to a varying degree the relative expansion of specific sense-related brain regions (e.g. the olfactory and optic lobes; Gittleman 1991; Kalisinska 2005; Iwaniuk et al. 2008; Steiger et al. 2008), allowing some measure of sensory adaptation to be inferred in extinct taxa (e.g. Witmer et al. 2003, 2008; Domínguez Alonso et al. 2004; Kundrát 2007; Sampson & Witmer 2007; Milner & Walsh 2009). Likewise, the inner ear, which was previously observable only in damaged fossils (as a series of voids and tubes within the wall of the endocranium), can now be fully visualized with micro-computed tomography (μCT).

In extant reptiles and birds, the length and other dimensions of the basilar papilla (the hearing organ of the inner ear, analogous to the organ of Corti in mammals) are closely correlated with hearing frequency sensitivity (Manley 1973).

As the basilar papilla is housed within the endosseous cochlear duct (ECD; sensu Witmer et al. 2008; lagena of some authors), some estimate may be made of its maximal dimensions in fossil material (Gleich et al. 2005). In theory, these ECD dimensions should provide a basis for inferring hearing sensitivity, which in turn has behavioural implications. For example, vocalizing vertebrates generally produce vocal frequencies within the range of their hearing (Konishi 1970; Brown & Waser 1984; Endler 1992; Narins et al. 2004), so estimates of hearing frequency range may be informative about the presence of vocalization and likely vocalization frequencies in extinct taxa (Evans 1936; Manley 1973). These estimates may also provide information about sociality and vocal complexity, since vocal communication tends to be more complex in species that form large, socially intricate aggregations (Evans 1936; Blumstein 1997). Vocality and hearing may even provide some indication of preferred habitat in extinct taxa, as species inhabiting closed environments where visual communication is ineffective often possess more complex (Garrick & Lang 1977) or lower frequency (Brown & Waser 1984) vocalizations than sister taxa that do not.

Potentially, a great deal of information can be gained from ECD morphology, and several authors have used this structure to investigate the evolution of hearing, notably from basal archosauromorph reptiles, through non-avian dinosaurs to modern birds (e.g. Weishampel 1981; Rogers 1998; Clarke 2005; Gleich et al. 2005; Sanders & Smith 2005; Sampson & Witmer 2007; Witmer et al. 2008). The validity of these studies depends largely on how accurately the dimensions of the ECD reflect those of the basilar papilla and, therefore, auditory-related attributes. However, earlier accounts were based on measurement of the bony wall (Elzanowski & Galton 1991) or physical endocasts (e.g. Clarke 2005; Carabajal et al. 2008) of the otic capsule, and probably overestimated the dimensions of the internal soft-tissue hearing apparatus, or introduced inaccuracies related to casting damaged material. Modern μCT imaging avoids these problems by providing reconstructions accurate to several micrometres of the internal space of the inner ear in intact specimens (e.g. Witmer et al. 2003, 2008; Domínguez Alonso et al. 2004; Sanders & Smith 2005; Sampson & Witmer 2007; Sereno et al. 2007; Witmer & Ridgely 2008; Milner & Walsh 2009), thus representing the maximal dimensions of the original soft tissue. However, the ECD encloses soft-tissue structures other than the basilar papilla (e.g. the lagenar macula and perilymphatic space; Wever 1978; Gleich et al. 2005), and thus also cannot be regarded as a completely accurate representation of the basilar papilla itself. For example, structures such as the helicotrema form prominent outgrowths in the wall of the cochlear duct in squamates, radically altering the shape of endocast reconstructions. Given these concerns, it is necessary to evaluate explicitly the relationship between inner-ear bony anatomy and hearing sensitivity, a relationship that has frequently been assumed but that has never been tested quantitatively.

Here, we provide the first quantitative analysis of three-dimensional ECD dimensions and their relationship to auditory capability and behaviour in extant reptiles and birds. We use μCT analysis to test the hypotheses that ECD dimensions correlate with generalized measures of known hearing sensitivity and that those dimensions can also predict habitat selection, social complexity and vocalization complexity. We then examine the implications of the results for the robust inference of audition-related behaviours in extinct taxa.

2. Material and methods

(a) Data collection

Fifty-nine species in 52 genera representing Testudines, Crocodylia, Rhynchocephalia, Squamata and eight avian orders (see the electronic supplementary material) were chosen on the basis of the known presence and relative complexity of vocalizations. Selection criteria also included availability of data on hearing sensitivity, environmental requirements and sociality. μCT scan data for selected taxa were acquired either from museum collections or through access to existing datasets; information on scanning procedures is provided in the electronic supplementary material.

Using Materialise Mimics 9.0, three-dimensional inner ear morphology for the left and right ears in all specimens was digitally segmented from the internal void space within the endosseous labyrinth to create virtual endocasts of the otic capsule (figure 1). This approach avoids inconsistencies in reconstruction resulting from uneven otic capsule wall thickness or where the capsule is difficult to differentiate from the rest of the skull. Segmented endosseous labyrinths were exported as pairs in high-resolution binary STL format, and then separated and saved as separate files using Konica-Minolta Polygon Editing Tool. Further post-processing involved removal of the pars vestibularis (saccule and semicircular canals) to leave only the ECD (pars cochlearis; figure 1). This procedure ensures consistent maximum length and volume measurements. In most taxa, a defined constriction is present in the region where the ECD contacts the saccule, and was used to demarcate a plane that divides the former from the latter. Structures of the pars vestibularis dorsal to this plane, including the saccule, were deleted. In those taxa that lack this constriction (e.g. Testudines), the line of separation was made directly beneath the horizontal semicircular canal. Projections from the wall of the ECD representing the helicotrema or perilymphatic sac in squamates (Wever 1978) were not removed for this test, as it is not possible to distinguish these structures on the basis of osteological material alone (which is usually the only material available in palaeontological specimens).

Figure 1

Representative examples of segmented endosseous labyrinths in lateral view from the reptile and bird groups included in this study (see text). Not to scale. Dashed lines represent the approximate line of separation of the pars vestibularis from the ECD (f,g); dotted ellipses mark the approximate position of the fenestra vestibuli (fv). (a) Testudines (Chelydra serpentina), (b) Crocodylia (Caiman crocodilus), (c) Rhynchocephalia (Sphenodon punctatus), (d) Squamata (Gambelia wislizenii) showing the projection of helicotrema on the left, (e) Aves (Aythya fuligula), (f) isolated ECD of Gambelia wislizenii in proximal and (g) lateral views. Linear measurement variables: RCW, rostrocaudal width; MLW, mediolateral width; ML, maximum length.

ECD measurements (figure 1; see the electronic supplementary material) comprised total length, maximum rostrocaudal and mediolateral width (distance measurements made orthogonally using the horizontal semicircular canal as the reference plane) and total volume (extracted using Inus Technologies RapidForm 2006). Total length in markedly curved ECDs was measured using three-point (single-arc) curve fitting tools in RapidForm 2006. All measurements were scaled to skull basicranial axis length to reduce size effects (see the electronic supplementary material). Mean values for the measurements of left and right ECDs in the same specimen were calculated and log transformed to meet the assumptions of multiple regression. Information on the (independent) variables (e.g. hearing sensitivity, vocalization, environment and sociality) was collated from published and online sources and simplified to provide a tractable number of variables (see the electronic supplementary material for sources).
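The preprocessing described above (scaling to basicranial axis length, averaging left and right ECDs, then log transforming) can be sketched as follows. This is only an illustrative sketch: the function name is hypothetical, scaling is assumed to be a simple ratio, and a base-10 logarithm is assumed, since the exact procedure is given only in the electronic supplementary material.

```python
import math


def scaled_log_measure(left, right, basicranial_length):
    """Return a log-transformed, size-scaled ECD measurement.

    Assumptions (not confirmed by the text): scaling is a plain ratio
    to basicranial axis length, and the log transform is base 10.
    """
    if min(left, right, basicranial_length) <= 0:
        raise ValueError("measurements must be positive")
    # Scale each side to basicranial axis length to reduce size effects.
    left_scaled = left / basicranial_length
    right_scaled = right / basicranial_length
    # Average the left and right ECDs of the same specimen.
    mean_scaled = (left_scaled + right_scaled) / 2
    # Log transform to help meet the assumptions of multiple regression.
    return math.log10(mean_scaled)
```

For instance, hypothetical left and right lengths of 4.0 and 6.0 with a basicranial axis length of 50.0 (all in the same units) give a mean scaled value of 0.1 before transformation.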

(b) Quantitative analysis 1: scaled and transformed ECD dimensions and hearing sensitivity

Hearing sensitivity data were available for only 24 out of the 59 taxa. This dataset was subjected to multiple regression analysis (simultaneous entry) twice using SPSS v. 15.0, with best hearing range and mean hearing frequency alternating as the dependent variable, and with the scaled ECD measurements as independent variables. Best hearing frequency range (defined here as occurring below the 30 dB sensitivity threshold following Wever 1978) was calculated as Fmax−Fmin, and mean hearing frequency as (Fmax+Fmin)/2, where Fmax and Fmin represent maximum and minimum frequency above the 30 dB threshold, respectively. Note that these values do not refer to absolute hearing range.
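The two dependent variables defined above are simple functions of the frequency limits at the 30 dB threshold. A minimal sketch follows; the function name is hypothetical and the inputs are assumed to be in Hz.

```python
def best_hearing_summary(f_min_hz, f_max_hz):
    """Return (best hearing range, mean hearing frequency) in Hz.

    f_min_hz and f_max_hz are the minimum and maximum frequencies at
    the 30 dB sensitivity threshold, as defined in the text.
    """
    if f_max_hz < f_min_hz:
        raise ValueError("f_max_hz must not be smaller than f_min_hz")
    hearing_range = f_max_hz - f_min_hz          # Fmax - Fmin
    mean_frequency = (f_max_hz + f_min_hz) / 2   # (Fmax + Fmin) / 2
    return hearing_range, mean_frequency
```

With hypothetical limits of 200 Hz and 4000 Hz, the function returns a range of 3800 Hz and a mean of 2100 Hz.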

(c) Quantitative analysis 2: scaled and transformed ECD dimensions, vocalization, sociality and environment

All variables were subjected to four multiple regression analyses (simultaneous entry) using SPSS v. 15.0. Each multistate category was recoded as a set of dummy independent variables (greatest vocal complexity: no vocalization versus simple single-phrase vocalizations, simple multiple-phrase vocalizations and complex multiple-phrase vocalizations; environment: open terrestrial (e.g. deserts) versus closed terrestrial (e.g. jungles), aquatic/marine and fossorial (underground) environments; and sociality (normal group size): solitary versus pairs, groups of less than 20 and groups of 20 or more). Each of the four scaled and transformed ECD measurements was sequentially treated as the dependent variable, with the remaining three as further independent variables.
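The dummy recoding of the multistate categories can be illustrated with a short sketch. The level labels and the function name below are hypothetical; only the reference categories (no vocalization, open terrestrial, solitary) and the remaining levels are taken from the text, and the analysis itself was run in SPSS rather than with this code.

```python
# Hypothetical labels; the first entry of each list is the reference level.
VOCAL_LEVELS = ["none", "simple_single", "simple_multiple", "complex_multiple"]
ENVIRONMENT_LEVELS = ["open_terrestrial", "closed_terrestrial", "aquatic", "fossorial"]
SOCIALITY_LEVELS = ["solitary", "pairs", "groups_lt_20", "groups_ge_20"]


def dummy_code(value, levels):
    """Recode one categorical value as 0/1 indicators.

    The reference level (levels[0]) is encoded as all zeros, so a
    four-state category yields three dummy variables.
    """
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {level: int(value == level) for level in levels[1:]}
```

A species scored as producing simple multiple-phrase vocalizations would be coded 1 for that indicator and 0 for the other two, while a non-vocal species would be coded 0 throughout.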

3. Results

(a) Scaled and transformed ECD dimensions and hearing sensitivity

Strong significant correlations were found between maximum scaled and transformed ECD length and mean hearing frequency and best hearing range (table 1; figure 2), indicating that ECD length is predictive of these auditory capacities. No other ECD measurement variables were significant in either model.

Table 1

Significant hearing sensitivity and behavioural correlations recovered in this study.

Figure 2

Correlations between maximum scaled and transformed ECD length and (a) best hearing frequency range (y=6104.3x+6975.2; R²=0.547) and (b) mean best hearing (y=3311.3x+4000.8; R²=0.566). Frequency value estimates for Archaeopteryx lithographica and Odontopteryx toliapica are provisional, as damage precluded accurate measurement of the basicranial axis length scaling factor on both specimens.
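The two fitted lines in the figure 2 caption can be applied to a new specimen as sketched below. The function name is hypothetical, and the input is assumed to be an ECD length that has already been scaled and log transformed exactly as in the study; because those details are given only in the electronic supplementary material, this sketch is not guaranteed to reproduce the estimates reported in the Discussion.

```python
def predict_hearing(x):
    """Apply the figure 2 regression lines.

    x: maximum ECD length, scaled and transformed as in the study
       (assumed to be prepared upstream).
    Returns (best hearing frequency range, mean best hearing) in Hz.
    """
    best_range_hz = 6104.3 * x + 6975.2   # figure 2a
    mean_best_hz = 3311.3 * x + 4000.8    # figure 2b
    return best_range_hz, mean_best_hz
```

Since both slopes are positive, a longer relative ECD yields both a wider predicted best hearing range and a higher predicted mean frequency.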

(b) Scaled and transformed ECD measurements, vocalization and environment

Significant positive correlations were found between scaled and transformed ECD length and multiple-phrase vocalizations (with complex multiple-phrase vocalizations approaching significance) and with pair bonding, groups of less than 20 and groups of more than 20 individuals, indicating a possible link between ECD length, sociality and vocal complexity. Negative correlations were found between scaled and transformed ECD length and scaled and transformed rostrocaudal width and aquatic environments (table 1), suggesting a tendency for comparatively short and narrow cochlear ducts in aquatic taxa.

4. Discussion

Our aim was to test whether ECD measurements could be used to infer auditory capability and behaviours in extinct birds and reptiles, as has been attempted by previous authors, particularly for non-avian dinosaurs (e.g. Weishampel 1981; Rogers 1998; Clarke 2005; Sanders & Smith 2005). The results of this first quantitative attempt to test inferences of hearing and behaviour from osteological and fossil specimens show that maximum ECD length is indeed predictive of best hearing range and particularly of mean hearing frequency.

Our hearing sensitivity regressions also allow the first quantitative estimations of best hearing frequency range and mean frequency in fossil taxa on the basis of maximum ECD length (figure 2). Our data indicate that the predicted sensitivity values for the Lower Eocene (55 Ma) neognath seabird Odontopteryx toliapica (Natural History Museum (NHM) 44096) are 2100 Hz for mean best hearing and 3800 Hz for best hearing range (approx. 200–4000 Hz), while the contemporaneous seabird Prophaethon shrubsolei (NHM A683) was much closer to the modern neognaths in our dataset with a mean best hearing of 2600 Hz, and best hearing range of 4300 Hz (approx. 450–4750 Hz). The earliest known avialan, the Late Jurassic (147 Ma) Archaeopteryx lithographica (NHM 37001), had a mean best hearing frequency of approximately 2000 Hz, with a range of best hearing within a 2800 Hz band (approx. 600–3400 Hz). These predictions place Archaeopteryx close to the emu (Dromaius novaehollandiae) at the lower end of the sensitivity range of living birds. Our mean best hearing frequency estimate is therefore in good agreement with the best hearing range of 2400–3200 Hz predicted for Archaeopteryx by Gleich et al. (2005) based on body mass data.
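The approximate frequency bands quoted above follow from the definitions of mean frequency and range: the lower limit is the mean minus half the range and the upper limit is the mean plus half the range. A short sketch of that arithmetic, using a hypothetical function name and the values reported in the text:

```python
def band_from_mean_and_range(mean_hz, range_hz):
    """Recover (Fmin, Fmax) in Hz from mean frequency and range.

    Inverts range = Fmax - Fmin and mean = (Fmax + Fmin) / 2.
    """
    half = range_hz / 2
    return mean_hz - half, mean_hz + half


# (mean best hearing, best hearing range) in Hz, as reported in the text.
reported = {
    "Odontopteryx toliapica": (2100, 3800),
    "Prophaethon shrubsolei": (2600, 4300),
    "Archaeopteryx lithographica": (2000, 2800),
}

for taxon, (mean_hz, range_hz) in reported.items():
    f_min, f_max = band_from_mean_and_range(mean_hz, range_hz)
    print(f"{taxon}: {f_min:.0f}-{f_max:.0f} Hz")
```

The resulting bands (200 to 4000 Hz, 450 to 4750 Hz and 600 to 3400 Hz) match the approximate ranges given in the text for the three fossil taxa.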

The significant relationship between sociality (measured by broad boundaries in aggregation numbers) and ECD maximum length is concordant with observed relationships between increased complexity of vocal communication and larger social aggregations in mammals and birds (Blumstein 1997). Although this result may have been affected by the presence in the dataset of vocally complex birds that form large flocks, it suggests that the longer the ECD in a given fossil taxon, the more likely that taxon is to have lived in larger aggregations. The negative correlation between ECD length/rostrocaudal width and aquatic environments is difficult to explain since, apart from turtles, most aquatic species in this dataset (Crocodylia and some birds) possess relatively long cochlear ducts. It seems possible that these correlations relate to untested aspects of ECD shape in these taxa. Since no ECD dimensions other than length and rostrocaudal width correlate with the broad environmental variables used here, habitat does not appear to influence significantly the functional morphology of the cochlear duct. Further testing with more effective exemplar species for each category might help to further explore such a relationship, although other regions of the inner ear may be more informative (e.g. the relative size of the saccule with respect to fossorial lifestyles).

Extant bird species are well known to possess relatively longer cochlear ducts than living reptiles (e.g. Manley 1990), to the extent that there is little overlap in length between birds and their living sister group, Crocodylia (figure 3). Since birds are more closely related to dinosaurs than to crocodiles (e.g. Padian & Chiappe 1998), the information they can provide about the relationship between ECD length and auditory sensitivity is crucial for the estimation of non-avian dinosaur auditory capability. We are aware that the disjuncture in relative length will have an effect on behavioural correlations in living taxa, but note that information from birds is likely to be particularly relevant in the context of inferring behaviour in dinosaurs. Nonetheless, the broad correlations recovered in this first quantitative investigation suggest that more variable categories capable of providing finer-grained distinctions may provide additional useful information in conjunction with a larger dataset.

Figure 3

Scaled ECD length versus scaled ECD rostrocaudal width (untransformed), illustrating the separation between Aves and most non-avian sauropsids with respect to ECD length.

Overall, our results indicate that it is possible to infer hearing and possibly sociality in both extinct and extant taxa using simple measurements gathered from the endosseous inner ear. Although the case is less clear when inferring vocalization on the basis of our data, these results suggest that larger datasets may be able to recover these and other relationships.

Acknowledgements

We thank S. D. Chapman and J. Cooper (Natural History Museum, London), M. Jones (University College London), M. Lowe (University College Museum of Zoology), J. Maisano (University of Texas, Austin) and N. Triche (Houston) for access to specimens or scan data. We also thank G. Dermody, D. Bate and A. Ramsey (Metris X-Tek, Tring), J. R. Hutchinson (Royal Veterinary College, Hatfield) and R. Abel (NHM, London) for access to scanning facilities and discussion of CT techniques. S. E. Evans (University College London) and M. A. Knoll (University of Portsmouth) are thanked for their useful technical discussion. R. C. Ridgely and D. L. Dufeau (Ohio University, Athens) provided technical assistance. This manuscript was greatly improved by useful comments from three anonymous reviewers. This work was supported by a NERC small grant NE/E008380/1 awarded to P.M.B. and A.C.M., as well as National Science Foundation grants IBN-0343744 and IOB-0517257 to L.M.W.

Footnotes

  • Present address: Department of Mineralogy, The Natural History Museum, London SW7 5BD, UK.

    • Received October 26, 2008.
    • Accepted December 12, 2008.
