## Abstract

Animals produce a tremendous diversity of sounds for communication to perform life's basic functions, from courtship and parental care to defence and foraging. Explaining this diversity in sound production is important for understanding the ecology, evolution and behaviour of species. Here, we present a theory of acoustic communication that shows that much of the heterogeneity in animal vocal signals can be explained based on the energetic constraints of sound production. The models presented here yield quantitative predictions on key features of acoustic signals, including the frequency, power and duration of signals. Predictions are supported with data from nearly 500 diverse species (e.g. insects, fishes, reptiles, amphibians, birds and mammals). These results indicate that, for all species, acoustic communication is primarily controlled by individual metabolism such that call features vary predictably with body size and temperature. These results also provide insights regarding the common energetic and neuromuscular constraints on sound production, and the ecological and evolutionary consequences of producing these sounds.

## 1. Introduction

For over a century, scientists have puzzled over the great diversity of sounds produced for communication, from the chirp of a cricket to the moan of a whale. The importance of this sound production to an organism's fitness is reflected in the variety of sound-producing mechanisms that have evolved and the generally high energetic cost of producing sounds (Ryan 1988; Prestwich 1994; Hauser 1998; Oberweger & Goller 2001). To better understand this diversity, many informative biophysical and evolutionary models have been developed (Morton 1977; Gerhardt 1994; Bradbury & Vehrencamp 1998; Bass & McKibben 2003), and many intriguing patterns of animal communication have been described (Brackenbury 1979; Wallschläger 1980; Fletcher 2004). In particular, there is a rich literature showing that some properties of acoustic signals may be correlated with metabolic rate, body size or temperature among closely related species (Wallschläger 1980; Ryan 1986, 1988; Prestwich *et al*. 1989; Sanborn 1997). However, a more general quantitative theory of acoustic communication has remained elusive, in part because acoustic signals are often considered to be governed by taxon-specific traits (Dawkins 1993; Hasson 1997).

Here, we present and test a series of related models, based on well-established principles of animal energetics, which represent the first step towards a more general theory of acoustic communication. The models build on a recent model describing the body size and temperature dependence of metabolic rate, defined as the rate at which an organism takes up and uses energy for survival, growth and reproduction. This model proposes that whole-organism metabolic rate, *B*, scales to the ¾ power of body mass and exhibits a predictable, exponential temperature dependence described by the term e^{−E/kT}.

Thus, the metabolic rate per unit mass, *B*/*M*, is related to body size *M* (in grams) and temperature as
$$\frac{B}{M} = b_0 M^{-1/4} e^{-E/kT} \tag{1.1}$$
where *b*_{0} is a taxon-specific normalization constant (W g^{−3/4}) that varies about 10-fold between endotherms and ectotherms (Gillooly *et al*. 2001). The −¼ power scaling of mass-specific metabolic rate with body mass shown in equation (1.1) is consistent with theory that predicts this quarter-power scaling based on how distribution networks (e.g. circulatory systems in animals and vascular systems in plants) impose a constraint on the delivery of energy and materials to cells (e.g. West *et al*. 1997). It is also supported by considerable empirical evidence showing that, on average, metabolic rate shows the mass dependence described by equation (1.1) in most taxonomic groups (Savage *et al*. 2004). The Boltzmann–Arrhenius factor in equation (1.1), e^{−E/kT}, describes the exponential increase with temperature of the biochemical reactions that govern metabolism, whereby *E* is the average activation energy of the respiratory complex (approx. 0.6–0.7 eV), *k* is Boltzmann's constant (8.62 × 10^{−5} eV K^{−1}) and *T* is the absolute temperature in kelvin (K) (Gillooly *et al*. 2001). As previously described, this temperature dependence corresponds to a commonly observed *Q*_{10} value of about 2.5, which indicates a 2.5-fold change in rate for a change in temperature of 10°C (see Allen & Gillooly 2007 for review). Both the body size and temperature dependence of metabolic rate described by equation (1.1) build upon decades of physiological research (see Allen & Gillooly 2007 and references therein). This equation has been shown to explain considerable variation in metabolic rates among ectothermic and endothermic animals (Gillooly *et al*. 2001).
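The link between the activation energy *E* and the quoted *Q*_{10} can be checked numerically. The following is a minimal sketch, assuming the form B/M = b_0 M^{−1/4} e^{−E/kT} for equation (1.1); the function names and the placeholder value `b0 = 1.0` are illustrative assumptions, not fitted quantities from the study.

```python
import math

K_BOLTZMANN_EV = 8.62e-5  # Boltzmann's constant in eV/K, as quoted in the text


def mass_specific_rate(mass_g, temp_c, b0=1.0, e_act=0.65):
    """Mass-specific metabolic rate, assuming B/M = b0 * M^(-1/4) * exp(-E/kT).

    b0 = 1.0 is a placeholder normalization (hypothetical), so only ratios
    of the returned values are meaningful.
    """
    temp_k = temp_c + 273.15
    return b0 * mass_g ** -0.25 * math.exp(-e_act / (K_BOLTZMANN_EV * temp_k))


def q10(temp_c, e_act=0.65):
    """Fold change in rate for a 10 degree C rise starting at temp_c."""
    warm = mass_specific_rate(1.0, temp_c + 10.0, e_act=e_act)
    cool = mass_specific_rate(1.0, temp_c, e_act=e_act)
    return warm / cool


if __name__ == "__main__":
    # Q10 implied by the 0.6-0.7 eV range, evaluated between 20 and 30 degrees C
    for e_act in (0.6, 0.65, 0.7):
        print(e_act, round(q10(20.0, e_act), 2))
```

Under these assumptions the implied *Q*_{10} falls between roughly 2 and 2.5 over the 0.6–0.7 eV range, and it declines slightly as the starting temperature rises because the Boltzmann–Arrhenius factor is not a pure exponential in Celsius temperature.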

We extend equation (1.1) to derive predictions on acoustic communication by making three general assumptions that relate sound production to the underlying energetics of these processes. First, we assume that heterogeneity in two basic temporal features of acoustic signals (i.e. call frequency and call rate) is driven primarily by variation in the rates of muscular activity that produce sound (Skoglund 1961; Suthers *et al*. 1999) rather than by morphological features that can alter sounds after production, e.g. resonators (Bradbury & Vehrencamp 1998). Second, we assume that the rate of muscular activity in a species, which is under neuronal control, is proportional to individual metabolic rate. This second assumption is consistent with data showing that, in general, both neuron firing rates and rates of muscle contraction are proportional to mass-specific metabolic rate (Prestwich 1994; Medler 2002; Hempleman *et al*. 2005; Zhantiev *et al*. 2006). It is also consistent with data showing that sound frequency is approximately equal to the vibration frequency of the muscle producing the sound (fishes: Skoglund 1961; Fine *et al*. 2001, 2004; Connaughton *et al*. 2002; Connaughton 2004; frogs: Martin 1971, 1972; birds: Elemans *et al*. 2004; snakes: Rome *et al*. 1996), which in turn is proportional to the amount of energy fluxing through the muscle (e.g. Martin 1972; Girgenrath & Marsh 1997; Elemans *et al*. 2004). Together, these two assumptions imply that call frequency (i.e. pitch), *f* (cycles s^{−1}), should show the same size and temperature dependence as mass-specific metabolic rate such that
$$f = f_0 \frac{B}{M} \propto M^{-1/4} e^{-E/kT} \tag{1.2}$$
where *f*_{0} is a normalization constant that represents the number of cycles per joule of metabolic energy flux through a gram of tissue (cycle J^{−1} g). Moreover, these two assumptions imply that call rate, *r* (calls s^{−1}), defined as the inverse of call period, should also show the same body size and temperature dependence as metabolic rate such that
$$r = r_0 \frac{B}{M} \propto M^{-1/4} e^{-E/kT} \tag{1.3}$$
where *r*_{0} is a normalization constant that represents the number of calls per joule of metabolic energy flux through a gram of tissue (call J^{−1} g). Equation (1.3) is consistent with the data showing that call rates are governed by the rates of activity that produce the sound (e.g. wing closure rates), which in turn are governed by muscle contraction rates (Josephson 1973; Schneider 1977; Brozovich & Pollack 1983; Mitchell *et al*. 2008) and ultimately metabolic rate (e.g. Pough *et al*. 1992; Bailey *et al*. 1993). Equation (1.3) also implies that inter-call intervals are fixed fractions of the total calling bout. As such, call duration (s call^{−1}) is predicted to scale inversely with call rate when defined as the difference between the call period and the inter-call interval. Thus, call duration should scale inversely with mass-specific metabolic rate as follows:
$$d = d_0 \left(\frac{B}{M}\right)^{-1} \propto M^{1/4} e^{E/kT} \tag{1.4}$$
where *d*_{0} is a normalization constant that represents the number of joules of metabolic energy flux per gram of tissue per call (J g^{−1} call^{−1}).

The third assumption is that the power (W) of a single call, *p*, is constrained by whole-organism metabolic rate such that the fraction of metabolic energy invested in a single call is approximately invariant. This assumption implies that call power should exhibit the body size and temperature dependence described above for whole-organism metabolic rate, *B*, as follows:
$$p = p_0 B \propto M^{3/4} e^{-E/kT} \tag{1.5}$$
where *p*_{0} is a normalization constant that represents the amount of sound energy produced per joule of metabolic energy flux at the level of the organism. This assumption is consistent with the empirical data showing that acoustic communication is energetically demanding for many species (Ryan 1988; Prestwich 1994; Hauser 1998; Oberweger & Goller 2001), and that, on average, animals often devote an approximately constant fraction of their total energy budget to a specific activity (Brown *et al*. 2004). Note, also, that equations (1.3)–(1.5) combined imply that the rate of sound energy production during a calling bout, which is the product of call rate, call duration and call power (i.e. *rdp* in watts), should also show the body size and temperature dependence described by equation (1.5) because *r* and *d* scale with body mass and temperature in exactly opposite directions.
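The four scaling relations can be collected into a single function to make their joint behaviour concrete. This is a sketch under stated assumptions: it takes equations (1.2)–(1.5) to have the forms f = f_0 B/M, r = r_0 B/M, d = d_0 (B/M)^{−1} and p = p_0 B, and it sets every normalization constant to 1.0 as a placeholder, so only ratios between outputs are meaningful. The function and key names are hypothetical.

```python
import math

K_BOLTZMANN_EV = 8.62e-5  # eV/K, as quoted in the text
E_ACT_EV = 0.65           # midpoint of the 0.6-0.7 eV range


def call_features(mass_g, temp_c, f0=1.0, r0=1.0, d0=1.0, p0=1.0, b0=1.0):
    """Predicted call features under the assumed forms of equations (1.2)-(1.5).

    All normalization constants default to 1.0 (placeholders, not fitted
    values), so absolute outputs carry no physical units.
    """
    temp_k = temp_c + 273.15
    boltzmann = math.exp(-E_ACT_EV / (K_BOLTZMANN_EV * temp_k))
    rate_per_gram = b0 * mass_g ** -0.25 * boltzmann  # B/M
    whole_rate = rate_per_gram * mass_g               # B
    return {
        "frequency": f0 * rate_per_gram,  # scales as M^(-1/4) e^(-E/kT)
        "call_rate": r0 * rate_per_gram,  # scales as M^(-1/4) e^(-E/kT)
        "duration": d0 / rate_per_gram,   # scales as M^(1/4) e^(E/kT)
        "power": p0 * whole_rate,         # scales as M^(3/4) e^(-E/kT)
    }


if __name__ == "__main__":
    small = call_features(1.0, 20.0)
    large = call_features(1e4, 35.0)
    # Calling effort r*d cancels mass and temperature, so it is the same for both
    print(small["call_rate"] * small["duration"])
    print(large["call_rate"] * large["duration"])
```

Because *r* and *d* are reciprocal functions of B/M, their product reduces to r_0 d_0 regardless of mass or temperature, which is why the product *rdp* inherits the scaling of *p* alone.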

Equations (1.2)–(1.5) yield testable, quantitative predictions on basic properties of animal acoustic signals. First, equations (1.2) and (1.3), respectively, predict that the natural logarithms of temperature-corrected call frequency and call rate (i.e. ln (*f*e^{E/kT}) and ln (*r*e^{E/kT})) are linear functions of the natural logarithm of body mass with slopes of −0.25, reflecting the size dependence of metabolic rate. Second, equations (1.2) and (1.3), respectively, predict that the logarithms of mass-corrected call frequency and call rate (i.e. ln (*fM*^{1/4}) and ln (*rM*^{1/4})) will be linear functions of inverse absolute temperature (i.e. 1/*kT*) with slopes of −0.65 (the midpoint of the 0.6–0.7 eV range), reflecting the temperature dependence of metabolic rate. Third, equation (1.4) predicts that the natural logarithm of temperature-corrected call duration will be a linear function of the logarithm of body mass with a slope of 0.25, and that the natural logarithm of mass-corrected call duration will be a linear function of 1/*kT* with a slope of 0.65. Fourth, combining equations (1.3) and (1.4) yields the prediction that the fraction of time spent calling during a calling bout, *rd*, should be a constant independent of body size and temperature. Fifth, equation (1.5) predicts that the natural logarithm of temperature-corrected call power will be a linear function of ln (body mass) with a slope of 0.75, and that the logarithm of mass-corrected call power will be a linear function of 1/*kT* with a slope of −0.65. Finally, combining equations (1.2) and (1.5) and solving for power (i.e. *p* ∝ *f*^{−3} at a given temperature) predicts that a log–log plot of call power versus call frequency should have a slope of −3. This final prediction exemplifies how the call parameters considered here can be related to each other since they are all similarly related to metabolic rate.
Moreover, note that in all cases, equations (1.2)–(1.5) predict substantial changes in basic call properties: call frequency, call rate and call duration are predicted to vary by approximately one order of magnitude (and call power by approximately three orders of magnitude) for every four orders of magnitude change in body size at a given temperature, and all four properties are predicted to vary by approximately 35-fold across a 40°C temperature range at a given body size.
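The power-frequency exponent and the magnitudes quoted above follow from a few lines of algebra. The working below is a sketch that assumes the proportional forms f ∝ M^{−1/4} e^{−E/kT} and p ∝ M^{3/4} e^{−E/kT}, a value of E = 0.65 eV, and a temperature range running from 0°C to 40°C (the endpoints are an assumption; the text specifies only the 40°C width).

$$
\begin{aligned}
f \propto M^{-1/4} e^{-E/kT} \;&\Rightarrow\; M^{3/4} \propto f^{-3} e^{-3E/kT},\\
p \propto M^{3/4} e^{-E/kT} \;&\Rightarrow\; p \propto f^{-3} e^{-4E/kT},\\
\frac{f(M)}{f(10^{4} M)} &= \left(10^{4}\right)^{1/4} = 10, \qquad
\frac{p(10^{4} M)}{p(M)} = \left(10^{4}\right)^{3/4} = 10^{3},\\
\frac{f(T_2)}{f(T_1)} &= \exp\!\left[\frac{E}{k}\left(\frac{1}{T_1} - \frac{1}{T_2}\right)\right]
= \exp\!\left[\frac{0.65}{8.62 \times 10^{-5}}\left(\frac{1}{273.15} - \frac{1}{313.15}\right)\right] \approx 34 .
\end{aligned}
$$

At a fixed temperature the exponential factor in the second line is constant, which gives the slope of −3 on a log–log plot of power against frequency. The temperature ratio depends weakly on where the 40°C window is placed, so the value of about 34 obtained here is consistent with the approximately 35-fold change stated in the text.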

## 2. Material and methods

We evaluated model predictions using field and laboratory data on species' calls for a diverse suite of endotherms and ectotherms (species number: fishes = 27, amphibians = 79, reptiles = 15, invertebrates = 45, aquatic and terrestrial mammals = 46 and birds = 285). Species varied in size from about 10^{−3} g for the water bug *Micronecta poweri* to 10^{8} g for the blue whale *Balaenoptera musculus*, and in temperature from 6°C for the frog *Hyla labialis* to 42.5°C for the cicada *Okanagana vanduzeei*. We included all known data (497 species), with exceptions made to avoid arbitrary sounds or species violating model assumptions. Exclusions were: domesticated or echolocating endotherms, measures based on playback experiments, data pre-corrected for temperature, invertebrates producing substrate-borne signals or those exhibiting asynchronous muscle dynamics (e.g. dipterans, coleopterans; see Dudley 2000), and the fish family Gobiidae, for which the mechanism of sound production is unclear. Additionally, marine mammal sounds were generally restricted to ‘typical’ calls (e.g. moans in cetaceans). For amphibians exhibiting multiple dominant frequencies, the second lowest frequency was used as this is the frequency that more often acts as the primary signal during courtship. Given differences in terminology, efforts were also made to standardize acoustic measures based on descriptions and sonograms. In particular, maximum call rates during courtship were used in endotherms, assuming these were most comparable to rates reported for ectotherms.

Most of the calls considered here were produced for courtship. Call frequency measures were restricted to the ‘dominant’ frequency, i.e. the frequency with the most sound energy. The rate of calling and the duration of calls, which typically consist of a series of syllables or notes, were measured during a calling bout.

Maximum call power measurements (dB) were standardized to root-mean-square values by multiplying values by 0.707. Power measurements were included and corrected for differences in reference pressure when noted (1 µPa versus 20 µPa; approx. 26 dB) and for medium density (air and water) and sound speed (approx. 35.5 dB). Sound pressure level was converted to milliwatts using a standard equation that accounts for measurement distance and assumes hemispherical spreading (Bass & Clark 2003). One outlier was excluded from power analyses (*Palmacorixa nana*; Aiken 1982).
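The two decibel corrections and the pressure-to-power conversion can be illustrated with textbook acoustics. The sketch below is not the equation from Bass & Clark (2003), which is not reproduced here; it assumes the plane-wave intensity relation I = p_rms² / (ρc), hemispherical spreading over an area of 2πr², and nominal medium properties (air: 1.2 kg m^{−3}, 343 m s^{−1}; water: 1000 kg m^{−3}, 1500 m s^{−1}). All function names and the 94 dB example value are hypothetical.

```python
import math

# Nominal characteristic impedances (rho * c); assumed values, not from the study
RHO_C_AIR = 1.2 * 343.0         # kg m^-2 s^-1
RHO_C_WATER = 1000.0 * 1500.0   # kg m^-2 s^-1


def reference_shift_db(p_ref_old, p_ref_new):
    """dB to add when re-expressing a level given re p_ref_old as re p_ref_new."""
    return 20.0 * math.log10(p_ref_old / p_ref_new)


def impedance_shift_db(rho_c_a=RHO_C_WATER, rho_c_b=RHO_C_AIR):
    """dB difference in intensity between two media at equal pressure amplitude."""
    return 10.0 * math.log10(rho_c_a / rho_c_b)


def spl_to_milliwatts(spl_db, distance_m, p_ref_pa=20e-6, rho_c=RHO_C_AIR):
    """Acoustic power (mW) from an RMS sound pressure level.

    Assumes plane-wave intensity I = p_rms^2 / (rho * c) and hemispherical
    spreading, i.e. power = I * 2 * pi * r^2.
    """
    p_rms = p_ref_pa * 10.0 ** (spl_db / 20.0)
    intensity = p_rms ** 2 / rho_c                        # W m^-2
    power_w = intensity * 2.0 * math.pi * distance_m ** 2
    return power_w * 1e3


if __name__ == "__main__":
    print(reference_shift_db(20.0, 1.0))  # 20 uPa versus 1 uPa reference
    print(impedance_shift_db())           # water versus air
    print(spl_to_milliwatts(94.0, 1.0))   # hypothetical 94 dB re 20 uPa at 1 m
```

With these nominal constants the reference-pressure shift is about 26 dB and the impedance term is close to the 35.5 dB quoted in the text; the exact impedance value depends on the temperature and salinity assumed for each medium.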

Unless reported otherwise, temperatures of mammals and birds were taken to be 37°C and 40°C, respectively (Gillooly *et al*. 2001). Ectotherm body temperatures were estimated based on ambient temperatures, except when body core temperatures were reported. For field studies, data were restricted to those taken under conditions varying by less than 5°C, and mean values were used. For laboratory studies, mean values were included in analyses for species measured over temperature ranges of less than 10°C, but minimum and maximum values were included for ranges greater than 10°C. However, in two species of insects, where call frequency was relatively invariant with temperature, average temperature was used in frequency analyses. Studies reporting body mass as a range varying by more than 10-fold were excluded, whereas ranges of less than 10-fold were averaged and included. Multiple measures were reported for species with data on individuals varying in mass by more than 10-fold.

## 3. Results and discussion

Data are largely supportive of model predictions. First, with respect to call frequency, the natural logarithm of temperature-corrected call frequency is a linear function of the natural logarithm of body mass with a fitted slope (−0.21) close to, though slightly shallower than, the predicted slope of −0.25 (figure 1*a*; 95% CI: −0.20 to −0.23). This relationship accounts for about 68 per cent of the variation in call frequency across species. The logarithm of mass-corrected call frequency is also a linear function of inverse absolute temperature as predicted, though the observed slope, −*E*, is significantly shallower than the predicted value of −0.65 eV (−0.53 eV; *r*^{2} = 0.50; 95% CI: −0.48 to −0.58).
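The two-step fitting procedure (temperature-correct and regress on ln mass, then mass-correct and regress on 1/*kT*) can be sketched with ordinary least squares. The example below uses synthetic data generated from the model itself with added noise; it does not use or reproduce the study's data, and the intercept of 30.0, the noise level and all function names are arbitrary assumptions chosen for illustration.

```python
import math
import random

K_BOLTZMANN_EV = 8.62e-5  # eV/K, as quoted in the text


def ols(xs, ys):
    """Ordinary least-squares slope and intercept of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def synthetic_calls(n=200, e_act=0.65, noise_sd=0.3, seed=0):
    """Synthetic (ln mass, 1/kT, ln frequency) rows drawn from the model.

    Mass and temperature ranges mimic those described in the methods;
    the values themselves are simulated, not measured.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        ln_mass = rng.uniform(math.log(1e-3), math.log(1e8))
        temp_k = rng.uniform(6.0, 42.5) + 273.15
        inv_kt = 1.0 / (K_BOLTZMANN_EV * temp_k)
        ln_freq = 30.0 - 0.25 * ln_mass - e_act * inv_kt + rng.gauss(0.0, noise_sd)
        rows.append((ln_mass, inv_kt, ln_freq))
    return rows


def fit_slopes(rows, e_act=0.65, mass_exponent=-0.25):
    """Return (mass slope, temperature slope) from the two corrected regressions."""
    ln_mass = [row[0] for row in rows]
    inv_kt = [row[1] for row in rows]
    # ln(f * e^{E/kT}) regressed on ln M
    temp_corrected = [row[2] + e_act * row[1] for row in rows]
    # ln(f * M^{1/4}) regressed on 1/kT
    mass_corrected = [row[2] - mass_exponent * row[0] for row in rows]
    mass_slope, _ = ols(ln_mass, temp_corrected)
    temp_slope, _ = ols(inv_kt, mass_corrected)
    return mass_slope, temp_slope


if __name__ == "__main__":
    print(fit_slopes(synthetic_calls()))
```

Note that each correction uses the predicted exponent for the other variable, so the two regressions are not independent tests; a multiple regression of ln *f* on ln *M* and 1/*kT* together would estimate both slopes at once.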

Second, the natural logarithm of temperature-corrected call rate is a linear function of the logarithm of body mass with a slope of −0.23, which is statistically indistinguishable from the predicted value of −0.25 (95% CI: −0.18 to −0.28; *r*^{2} = 0.72; figure 2*a*). This relationship accounts for 72 per cent of the variation in call rates across species. The logarithm of mass-corrected call rate is also a linear function of inverse absolute temperature, with a slope (*b* = −0.82) that is statistically indistinguishable from the predicted value of −0.65 (95% CI: −0.53 to −1.12; *r*^{2} = 0.51; figure 2*b*).

Third, the natural logarithm of temperature-corrected call duration is a linear function of the logarithm of body mass with a slope of 0.23, which is close to the predicted value of 0.25 (95% CI: 0.20–0.27; *r*^{2} = 0.70; figure 3*a*). The logarithm of mass-corrected call duration is also a linear function of inverse absolute temperature, with a slope of 0.56, close to the predicted value of 0.65 (95% CI: 0.39–0.74; *r*^{2} = 0.37; figure 3*b*). However, for species with calls described as ‘trills’, many of which ‘reuse’ air during calling, call duration was substantially higher than for other species, and call rate was significantly lower, for reasons we cannot explain (see appendix S1, electronic supplementary material). Still, for all species, in agreement with prediction 4, the fraction of time spent calling during a calling bout (i.e. calling effort = product of call rate, *r* and call duration, *d*) is approximately constant across species (approx. 25%).

Finally, we observed that the natural logarithm of temperature-corrected call power is a linear function of the logarithm of body mass with a slope of 0.72, which is not significantly different from the predicted value of 0.75 (95% CI: 0.60–0.84; *r*^{2} = 0.75; figure 4*a*). The observed temperature dependence was stronger than predicted; however, the correlation was relatively weak and thus the observed slope was not significantly different from the predicted value of −0.65 (ln (*P*/*M*^{0.75}) = −1.18(1/*kT*) + 36.75; 95% CI = −0.23 to −2.13; *p* < 0.05; *n* = 49; *r*^{2} = 0.12). Still, the log–log plot of call power versus call frequency yielded a linear relationship with a slope, *b* = −3.15, which is not significantly different from the predicted value of −3.0 (95% CI: −1.79 to −4.15; *r*^{2} = 0.46; figure 4*b*).

Together, these results indicate that body mass and temperature, through their effects on individual metabolism, account for considerable heterogeneity in basic features of animal acoustic signals across species and environments. Indeed, the rates and times of these signals show the same relationship to mass-specific metabolic rate as many other biological rates and times (e.g. growth rates, rates of molecular evolution and lifespan; Brown *et al*. 2004). Thus, despite the seemingly vast differences in the proximate mechanisms that produce sound, first-order predictions of basic features of animal calls are possible using the models presented here. In other words, after standardizing for size and temperature, acoustic signals from organisms as diverse as crickets, fishes and whales should sound similar (i.e. similar loudness, pitch, call rate and call duration).

However, in light of these results, three caveats should be mentioned. First, we are not implying that body size and temperature alone influence acoustic communication, nor that this model applies equally well to all species. Many factors probably contribute to the substantial variation in figures 1–4, including species-specific adaptations to optimize vocal performance in particular habitats (e.g. air sacs and sonic muscle). Moreover, there are sound-producing species not considered here that are probably inconsistent with model predictions (tail rattling, substrate-borne acoustic signals, etc.).

Second, we recognize that the mechanistic links proposed here between individual metabolic rate and sound production require further research. Our hope is that these models provide a new perspective on animal communication that will inform fine-scale experimentation aimed at linking the complex muscle dynamics that produce sound and individual energetics. In some cases, our models appear inconsistent with previously proposed hypotheses for one or more of the relationships shown here. For example, the hypothesis that call frequency scales to the −1/3 power with body mass owing to morphological constraints (Bradbury & Vehrencamp 1998) is not supported by our results, in part because it does not account for the exponential temperature dependence of call frequency.

And third, we wish to briefly address the issue of ‘phylogenetic correction’ that is often raised in comparative studies. It is not at all clear that such an analysis is appropriate in this case given the assumptions underlying phylogenetic analyses, and the questions addressed here (see Weathers & Siegel 1995; Westoby *et al*. 1995; Ricklefs & Starck 1996; Bjorklund 1997). Moreover, given the taxonomic breadth of the data, such an analysis is likely infeasible using a well-supported tree. Still, we note that estimates of the slope and strength of correlation in log–log plots of dependent variables on body mass within taxonomic groups using traditional regression techniques yield estimates that are very similar to those obtained using phylogenetically independent contrasts (Ricklefs & Starck 1996). If we could perform such an analysis, this would also probably be the case with our data given that we examine relationships across taxonomic groups.

In conclusion, we wish to point out how this theoretical framework may provide useful insights into current research on acoustic communication and related biological phenomena. At present, much research in the area of acoustic communication addresses the demonstrably important effects of various taxon-specific morphological features (e.g. resonator volume), or environmental features (e.g. habitat type), on acoustic performance. However, interspecific comparisons that quantify the relative importance of such factors are limited since no theoretical ‘baseline’ exists against which to compare these effects (Dawkins 1993; Hasson 1997). The models presented here may provide just such a baseline as they describe average values, which are independent of taxon-specific sound-producing adaptations. Similarly, this theoretical framework may provide a baseline for understanding how acoustic signals vary across gradients in body size and environmental temperature (for ectotherms) and suggests how other related features of communication systems that depend on these signals (e.g. sound attenuation distance and receiver morphology) may also vary. Lastly, these results could lead to first-order predictions on rates of acoustically driven behaviours (e.g. courtship), alternative tactics (e.g. satellite behaviour or eavesdropping) or the spatial structure of populations. As such, these results may be useful in evaluating hypotheses related to aspects of natural selection on acoustic signalling (e.g. the good genes hypothesis and the honest signalling hypothesis; Dawkins & Krebs 1978; Hamilton & Zuk 1982; Maynard Smith & Harper 1988). To the extent that evolutionary fitness can be described in energetic terms, this framework may provide a means of quantifying the fitness tradeoffs associated with call production. This is because, for all species, acoustic communication is primarily controlled by individual metabolism.

## Acknowledgements

We thank A. Allen, A. Hein, C. Hou, M. Moses and V. Savage for helpful comments that improved this manuscript. We also thank the numerous authors who have generously contributed data to this study.

## Footnotes

- Received November 23, 2009.
- Accepted December 10, 2009.

- © 2010 The Royal Society