The concept of a macroevolutionary trade-off among sexual signals has a storied history in evolutionary biology. Theory predicts that if multiple sexual signals are costly for males to produce or maintain and females prefer a single, sexually selected trait, then an inverse correlation between sexual signal elaborations is expected among species. However, empirical evidence for what has been termed the ‘transfer hypothesis’ is mixed, which may reflect different selective pressures among lineages, evolutionary covariates or methodological differences among studies. Here, we examine interspecific correlations between song and plumage elaboration in a phenotypically diverse, widespread radiation of songbirds, the tanagers. The tanagers (Thraupidae) are the largest family of songbirds, representing nearly 10% of all songbirds. We assess variation in song and plumage elaboration across 301 species, representing the largest scale comparative study of multimodal sexual signalling to date. We consider whether evolutionary covariates, including habitat, structural and carotenoid-based coloration, and subfamily groupings influence the relationship between song and plumage elaboration. We find that song and plumage elaboration are uncorrelated when considering all tanagers, although the relationship between song and plumage complexity varies among subfamilies. Taken together, we find that elaborate visual and vocal sexual signals evolve independently among tanagers.
Trade-offs among important life-history traits, such as fecundity and survival, constitute a prominent paradigm of evolutionary biology . The idea of a generalized, macroevolutionary trade-off between the elaboration of different sexual signals, such as song and plumage in birds, was first conceptualized by Darwin  and has been termed the ‘transfer hypothesis’ . The theoretical rationale underlying this hypothesis is that if two sexual signals are both costly for males to develop or express, then an inverse, interspecific correlation, or evolutionary trade-off, is expected among related taxa [1,4]. This prediction also relies on the assumption that female choice favours the expression of a single sexual signal rather than multiple elaborate traits [4,5]. Evidence for an evolutionary trade-off between sexual signals of different modalities has been found in arthropods , African cichlids [7,8] and certain avian lineages ; however, recent studies on birds have uncovered conflicting evidence for this prediction.
Badyaev et al.  found an inverse correlation between male plumage brightness and song complexity in cardueline finches; in contrast, Shutler & Weatherhead  identified a positive correlation between the degree of dichromatism and time spent singing among wood warblers. More recently, Ornelas et al.  found no correlation between dichromatism and song complexity among trogons, whereas Gonzalez-Voyer et al.  found a positive correlation between song length and the number of coloured patches among Asian barbets. This inconsistency may be due to methodological and/or biological differences among studies and focal taxa, including variation in the strength and targets of sexual selection among different lineages of birds, differences in the lineages' capacity to evolve sexual signals, discrepancies among studies in terms of how signal elaboration is quantified or differences in statistical frameworks. Moreover, other life-history traits, such as habitat, impart selective pressures on both plumage  and vocal signals , which may conflate inferred associations (or lack thereof) between sexual signals if these covariates are ignored. The presence of certain coloration mechanisms, such as carotenoids or structural coloration, may also influence the relationship between song and plumage elaboration [9,15,16]. Taken together, it is still unclear whether a generalized macroevolutionary association between sexual signals of different modalities exists.
Here, we undertake a large-scale, macroevolutionary analysis of the relationship between song and plumage elaboration in a large clade of songbirds, the tanagers (Thraupidae). More specifically, we consider correlations between multiple indices of song and plumage elaboration using an avian visual model . Thraupidae is an exemplary clade to study evolutionary associations among phenotypic characters due to its large size and phenotypic diversity. Quantifying interspecific correlations requires a phylogeny of the taxa at hand [17–19]; recently, Burns et al.  inferred a well-supported, multi-locus phylogeny that identifies 371 species within a monophyletic Thraupidae. As now composed, Thraupidae is the largest family of songbirds, representing nearly 10% of all songbird species.
Tanagers were previously considered ‘poor songsters’, due in part to the typically simple vocalizations of the large and widespread genus Tangara. However, the recent finding that tanagers include species with more complex songs, such as the flowerpiercers, the warbling-finches [13,20] and the seed-eaters and seed-finches, has greatly expanded the known range of vocal abilities within the clade. Tanagers are also remarkably variable in terms of coloration mechanisms, patterning, UV reflectance, sexual dichromatism and plumage complexity; plumage colour diversity in tanagers encompasses the full range seen within songbirds. This previously underappreciated variation in both song and plumage elaboration provides an excellent system to test the predictions of the transfer hypothesis. The recently published phylogeny of tanagers identifies several major clades that share similar stem ages near the root of the thraupid phylogeny [20,25]. These clades vary markedly in important life-history characters, such as habitat, feeding morphology, vocal behaviours and coloration mechanisms, suggesting that evolutionary associations among sexual signals may operate differently in these diverse groups. Here, we examine the relationship between song and plumage complexity across all tanagers, and also consider the effect of potentially important covariates, such as habitat, the presence of carotenoid or structural coloration, and subfamily grouping, to test whether a generalized pattern exists within a methodologically robust statistical framework.
2. Material and methods
(a) Taxonomic sampling
We used the recently published molecular phylogeny of Burns et al.  to delimit a monophyletic Thraupidae for taxonomic sampling. We followed the species-level taxonomy of Clements et al.  with two exceptions. We included Sicalis luteoventris and Poospiza whitii because they were included in the phylogeny of Burns et al. . Clements et al.  treats these two taxa as subspecies of Sicalis luteola and Poospiza nigrorufa, respectively, although other authorities have treated them as full species [27,28].
(b) Song measurements
We used RavenPro (v. 1.4; Cornell Laboratory of Ornithology, Ithaca, NY, USA) sound analysis software to generate and extract measurements from spectrograms. Within Raven, we used a Hann spectrogram window with 300 samples, a DFT size of 512 samples, a hop size of 3.4 ms, a sampling frequency of 44.1 kHz and a time resolution of 11.6 ms. We followed Price et al.  in defining ‘songs’ as any vocalization that included tonal elements, exceeded 0.5 s in duration, and was preceded and followed by intervals greater than 1 s. We distinguished note types by eye and delimited notes as vocalizations that were temporally separated by at least 10 ms. Measurements were taken by cross-referencing spectrogram and waveform windows by eye to determine when vocalizations began and ended. We measured 20 temporal and frequency-related song characters and averaged measurements across individuals for each species. Detailed descriptions of each song character are available in the electronic supplementary material, table S1. We measured all available recordings for each species at the time of the study, up to a maximum of 38 individuals. In total, we measured 2737 songs (each from a different recording) from 321 species of tanagers (mean = 8.5, s.e. = 0.48 individuals per taxon).
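The spectrogram settings above are mutually consistent, which a short unit conversion makes visible. The following is a minimal sketch, not the authors' procedure: it assumes that the reported time resolution corresponds to the DFT size divided by the sampling frequency, and that the hop size is expressed as a duration at the same sampling frequency. The function and variable names are illustrative.

```python
# Illustrative arithmetic only; the interpretation of each setting is an assumption.
SAMPLE_RATE_HZ = 44_100  # sampling frequency stated in the text


def samples_to_ms(n_samples: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Convert a length in samples to a duration in milliseconds."""
    return 1000.0 * n_samples / sample_rate_hz


def ms_to_samples(duration_ms: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Convert a duration in milliseconds to a length in samples."""
    return duration_ms * sample_rate_hz / 1000.0


window_ms = samples_to_ms(300)    # Hann window of 300 samples
dft_ms = samples_to_ms(512)       # DFT size of 512 samples
hop_samples = ms_to_samples(3.4)  # hop size of 3.4 ms
bin_hz = SAMPLE_RATE_HZ / 512     # spacing between DFT frequency bins

print(f"window length : {window_ms:.2f} ms")
print(f"DFT duration  : {dft_ms:.2f} ms")
print(f"hop size      : {hop_samples:.1f} samples")
print(f"bin spacing   : {bin_hz:.1f} Hz")
```

Under these assumptions, the 512-sample DFT spans roughly 11.6 ms, matching the stated time resolution, and the 3.4 ms hop is close to 150 samples, that is, half of the 300-sample window.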
We quantified song complexity with three different indices: (i) we performed a phylogenetic principal component analysis (PPCA ) on correlation matrices using the ‘lambda’ model with all 20 song characters included. Prior to conducting the PPCA, we examined the distributions of each character and applied either a square root transformation or a log transformation as appropriate (electronic supplementary material, table S2). Individual character loadings for principal component 1 (PC1) ranged between −0.87 and 0.61 (table 1). Taken together, the PC1 axis accounts for 25.5% of the total variation and the character loadings describe an axis along which more positive values represent vocal displays that occupy a greater frequency range, are longer in length and have more note types. This index of vocal complexity is similar to those used by Badyaev et al.  and Ornelas et al. ; (ii) we also used song length as an index of song complexity as described by Gonzalez-Voyer et al.  and (iii) we used the number of note types as an additional index of song complexity as described by Cardoso & Hu . We decided not to examine correlations among principal component axes other than PC1 because the loadings of other principal component axes were not interpretable in terms of song complexity and were thus unsuitable for testing the predictions of the transfer hypothesis.
(c) Plumage measurements
Plumage data were taken from Shultz , which used reflectance spectra of museum specimens captured with a reflectance spectrophotometer. All reflectance data were collected as in . From these data, male plumage characters were quantified in an avian tetrahedral colour space framework  for 303 of the 321 species that had vocal measurements. Detailed descriptions of each of the nine plumage characters used in this study are included in the electronic supplementary material, table S3.
We quantified plumage complexity with three different indices: (i) we performed a PPCA on all colour measurements using correlation matrices and the ‘lambda’ model . We examined the distribution of each variable for normality and log-transformed variables as necessary (electronic supplementary material, table S2). Individual character loadings for PC1 varied between −0.88 and 0.01 (table 1). Taken together, this axis accounts for 60.6% of the total variation and the character loadings describe an axis along which more negative scores indicate species that occupy greater expanses of tetra colour space (i.e. larger minimum convex polygons), have greater contrast between plumage patches (i.e. colour span) and have more saturated colours (i.e. chroma). To ease interpretation of this principal component axis, we reversed the sign of these loadings so that more positive scores reflect greater plumage elaboration; (ii) we also considered brilliance as a separate character, which has been used to describe plumage complexity in previous studies ; and (iii) we also included colour volume, which reflects the distribution of species’ plumage patches in avian tetrachromatic colour space. While previous studies of the transfer hypothesis have used sexual dichromatism as an index of plumage elaboration [10,11], we chose not to include sexual dichromatism here because it can be affected by changes in both male and female ornamentation or crypsis [33,34].
For each patch of each species, we classified carotenoids or structural colour as being present or absent. We classified a patch as containing carotenoids if it had a red, orange, yellow or green hue, and if the shape of the reflectance curve had a distinctive minimum in the shorter wavelengths (less than 500 nm) followed by a steep increase in higher wavelengths [35,36]. We classified structural colour as present if the patch was green, blue, violet or changed in colour depending on the angle of observation, and the reflectance curve contained one or two distinct visual peaks in the avian visual spectrum from 300 to 700 nm [35,37]. For each species, if any patches had carotenoid or structural coloration, that mechanism was considered present for that species. We identified 142 species with carotenoids and 159 without carotenoids, while 116 possess structural coloration and 185 species lack structural coloration.
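The species-level rule described above (a mechanism is present for a species if any of its patches shows it) can be sketched as follows. This is an illustration under stated assumptions: the species labels, patch flags and function name are hypothetical, and the patch-level classification itself, which relied on hue and reflectance-curve shape, is taken as given.

```python
# Hypothetical patch-level flags; True means the mechanism was scored as present in that patch.
patch_scores = {
    "species_A": {"carotenoid": [False, True, False], "structural": [False, False, False]},
    "species_B": {"carotenoid": [False, False, False], "structural": [True, True, False]},
    "species_C": {"carotenoid": [False, False, False], "structural": [False, False, False]},
}


def species_level_presence(scores: dict) -> dict:
    """Mark a mechanism as present for a species if any patch has it."""
    return {
        species: {mechanism: any(flags) for mechanism, flags in mechanisms.items()}
        for species, mechanisms in scores.items()
    }


presence = species_level_presence(patch_scores)
for species, mechanisms in presence.items():
    print(species, mechanisms)
```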
(d) Habitat categorization
To examine the effect of habitat as a covariate of song and plumage evolution, we assigned species to different habitat types. Each species was assigned to a ‘closed’ (forest) or ‘open’ (non-forest habitats, including grassland, marshland and scrub) habitat group according to the database in . Many species occur in multiple habitats; thus, we focused on the primary habitat designation as indicated by Parker et al.  and ignored intraspecific variation in habitat. In total, 169 species were assigned to ‘closed’ and 132 were assigned to ‘open’.
(e) Comparative analyses
Using phylogenetic generalized least squares (PGLS) [18,39,40], we evaluated nine different models for each pairwise combination of song and plumage elaboration indices across the full thraupid phylogeny. For each model, we implemented Pagel's λ model within a PGLS framework [40,42], which simultaneously estimates phylogenetic signal and transforms internal branch lengths accordingly, to test for significant effects. The models we considered include: (i) a base model with no covariates; (ii) habitat type as a main effect; (iii) habitat type as an interaction effect; (iv) subfamily as a main effect; (v) subfamily as an interaction effect; (vi) carotenoid plumage as a main effect; (vii) carotenoid plumage as an interaction effect; (viii) structural coloration as a main effect; and (ix) structural coloration as an interaction effect. For models that include subfamily as a main or interaction effect, Thraupinae is used as the comparison group because it has the most species. Two of the 15 tanager subfamilies are monotypic (Catamblyrhynchinae and Charitospizinae) and result in convergence issues when fitting models where major clade is included as a main or interaction effect; thus, we omitted these taxa from comparative analyses, which were subsequently run on 301 species. Taxa from two subfamilies (Saltatorinae and Emberizoidinae) were treated as a single group because they form a strongly supported clade. The relationships among the remaining subfamilies remain unresolved and they were therefore treated separately. We considered the model with the lowest AICc score the best-fit model and regarded any models within ΔAICc ≤ 2 as competing models. We also evaluated models that included multiple interaction effects and higher order, three-way interaction effects, but these models always performed poorly and are not presented here for simplicity.
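The model-selection rule above (lowest AICc is the best fit; models within ΔAICc ≤ 2 are competing) can be sketched as follows. This is a minimal illustration using the standard small-sample formula AICc = −2 ln L + 2k + 2k(k + 1)/(n − k − 1). The log-likelihoods and parameter counts are hypothetical values chosen for illustration, not results from this study; only the sample size of 301 species comes from the text.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Small-sample corrected AIC for a model with k parameters fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    return -2.0 * log_likelihood + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def rank_models(fits: dict, n: int) -> list:
    """Return (name, AICc, delta AICc) tuples sorted from best to worst."""
    scores = {name: aicc(ll, k, n) for name, (ll, k) in fits.items()}
    best = min(scores.values())
    return sorted(
        ((name, score, score - best) for name, score in scores.items()),
        key=lambda row: row[1],
    )


# Hypothetical fits: model name -> (log-likelihood, number of parameters).
hypothetical_fits = {
    "base": (-500.0, 3),
    "carotenoid_main": (-499.5, 4),
    "habitat_interaction": (-499.9, 5),
}

ranked = rank_models(hypothetical_fits, n=301)
competing = [name for name, _, delta in ranked if delta <= 2.0]

for name, score, delta in ranked:
    print(f"{name:22s} AICc = {score:8.2f}  dAICc = {delta:5.2f}")
print("best-fit and competing models:", competing)
```

With these made-up inputs, the base model is the best fit and the carotenoid main-effect model falls within ΔAICc ≤ 2, while the extra parameters of the interaction model are not repaid by its small gain in likelihood.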
To discuss the effect of possible covariates on the relationship between song and plumage elaboration, we also consider models that include habitat type, carotenoid or structural pigmentation even if they provide a worse fit to the data (ΔAICc > 2). For each pairwise comparison of song and plumage, we visually inspected the distribution of residuals of the best-fit model and log-transformed three variables (song length, colour volume and note types) to ensure that residuals conformed to a normal distribution. We also removed species with studentized residuals greater than or equal to 3 as outliers (electronic supplementary material, table S4), following the recommendations of Jones & Purvis and Garland et al. We applied the Bonferroni correction to control the type I error rate across the nine comparisons of song and plumage complexity considered here.
To account for phylogenetic uncertainty and variation in branch length estimations, we ran PGLS analyses over a set of 50 trees randomly sampled from a post-burn-in distribution of phylogenies from Burns et al. and extracted mean values from the resulting distributions of each parameter for all models. Thus, the standard errors that we report are averaged for each term in our PGLS models across the 50 trees. This procedure was performed using functions included in the ape, nlme and geiger packages within the R programming environment.
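The averaging step described above can be sketched as follows. This is an illustration in Python rather than the R workflow used in the study; it assumes that each tree yields one estimate and one standard error per model term, and the three hypothetical fits below stand in for the 50 sampled trees. All names and numbers are made up for illustration.

```python
from statistics import mean


def summarize_across_trees(per_tree_fits: list) -> dict:
    """Average the estimate and standard error of each model term over all trees.

    per_tree_fits is a list with one dict per tree, mapping term name to
    an (estimate, standard_error) pair.
    """
    terms = list(per_tree_fits[0].keys())
    summary = {}
    for term in terms:
        estimates = [fit[term][0] for fit in per_tree_fits]
        std_errors = [fit[term][1] for fit in per_tree_fits]
        summary[term] = (mean(estimates), mean(std_errors))
    return summary


# Hypothetical output from fitting the same model on three sampled trees.
hypothetical_fits = [
    {"intercept": (0.10, 0.05), "colour_PC1": (0.20, 0.10)},
    {"intercept": (0.20, 0.05), "colour_PC1": (0.30, 0.12)},
    {"intercept": (0.30, 0.05), "colour_PC1": (0.40, 0.14)},
]

summary = summarize_across_trees(hypothetical_fits)
for term, (estimate, std_error) in summary.items():
    print(f"{term:12s} mean estimate = {estimate:.3f}, mean s.e. = {std_error:.3f}")
```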
After comparing nine different PGLS models for each pair of song and plumage characters, we found that two models were consistently favoured (table 2). More specifically, a model wherein subfamily was included as an interaction term was the best-fit model for five of the nine pairwise comparisons (song PC1 and colour PC1; song PC1 and colour volume; song PC1 and brilliance; song length and brilliance; note types and brilliance; table 2). A model with no covariates was favoured among the remaining four pairwise comparisons (song length and colour PC1; song length and colour volume; note types and colour PC1; note types and colour volume; table 2). Among all pairwise comparisons, only two combinations (song length and colour PC1; song length and colour volume) had multiple competing models within ΔAICc ≤ 2; in both of these cases, a model with carotenoid plumage included as a main effect was the sole competing model (table 2).
Under best-fit models that include subfamily as an interaction effect, the interaction terms that affected the relationship between song and plumage complexity varied markedly among different subfamilies (figure 1 and electronic supplementary material, table S5). However, after correcting for multiple hypothesis testing (n = 9), none of these interaction terms is significant (all adjusted p > 0.05), indicating that these differences are at most marginally significant. Moreover, among best-fit models that included no covariates, song and plumage elaboration were uncorrelated (p > 0.05; electronic supplementary material, table S5). In the two instances where carotenoid plumage as a main effect was considered a competing model, there was no difference in ln(song length) between lineages with and without carotenoid plumage (p > 0.05; electronic supplementary material, table S5). Taken together, these results suggest that song and plumage elaboration largely evolve independently, although the relationship between vocal and visual signal elaboration can vary among subfamilies.
We also determined whether a significant main or interaction effect of habitat, carotenoids or structural coloration exists when considering the relationship between song and plumage elaboration despite these models performing relatively poorly in terms of AICc scores. We found that the presence of carotenoid coloration was marginally significant as an interaction effect when considering song PC1 and colour PC1 (β = −1.337 ± 0.639; p = 0.037; electronic supplementary material, table S6) as well as song PC1 and colour volume (β = −1.931 ± 0.872; p = 0.028; electronic supplementary material, table S6), which suggests an inverse correlation between song and plumage elaboration only in those taxa with carotenoid patches. However, we uncovered the opposite pattern when considering song length and colour PC1 (β = 0.095 ± 0.043; p = 0.028; electronic supplementary material, table S6) as well as song length and colour volume (β = 0.115 ± 0.058; p = 0.050; electronic supplementary material, table S6), wherein song and plumage are positively correlated only in lineages with carotenoids. Again, all of these effects are marginally significant and do not persist when the type I error rate is corrected for multiple hypothesis testing. Finally, structural coloration (electronic supplementary material, table S7) and habitat type (electronic supplementary material, table S8) were not considered significant interaction terms for any of the characters considered here.
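The effect of the multiple-testing correction on the four carotenoid interaction terms reported above can be illustrated with a short calculation. This is a sketch that assumes the Bonferroni adjustment is applied as the unadjusted p-value multiplied by the nine pairwise comparisons and capped at 1; the dictionary labels are illustrative.

```python
N_COMPARISONS = 9  # nine pairwise comparisons of song and plumage indices
ALPHA = 0.05


def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p-value: multiply by the number of tests and cap at 1."""
    return min(1.0, p_value * n_tests)


# Unadjusted p-values for the carotenoid interaction terms reported in the text.
reported_p = {
    "song PC1 x colour PC1": 0.037,
    "song PC1 x colour volume": 0.028,
    "song length x colour PC1": 0.028,
    "song length x colour volume": 0.050,
}

adjusted_p = {pair: bonferroni_adjust(p, N_COMPARISONS) for pair, p in reported_p.items()}

for pair, p_adj in adjusted_p.items():
    verdict = "significant" if p_adj <= ALPHA else "not significant"
    print(f"{pair:28s} adjusted p = {p_adj:.3f} ({verdict})")
```

Under this assumption, every adjusted value exceeds 0.05, consistent with the statement that these effects do not persist after correction.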
We found that song and plumage elaboration are largely uncorrelated across the evolutionary history of more than 300 species of tanagers using nine different pairwise comparisons of auditory and visual signal complexity. The large-scale taxonomic and phenotypic sampling of our dataset combined with the phylogeny of Burns et al.  enabled us to test the predictions of the transfer hypothesis while considering numerous potentially important covariates. Overall, we find that models wherein subfamily is included as an interaction term provide the best fit to the data in the majority (five out of nine) of comparisons. However, the statistical significance of patterns among models that include subfamily as an interaction effect depends on an accepted type I error rate in the presence of multiple hypothesis testing. Whether or not an adjusted p-value is used, uncorrelated evolution between song and plumage complexity is still the prevailing pattern among the songbirds considered here.
Previous studies have documented the presence of an evolutionary trade-off between song complexity and carotenoid, but not melanin, coloration, suggesting that the relationship between song and plumage complexity may operate differently when certain coloration mechanisms are present. In concordance with this finding, we found a marginally significant interaction term of the presence of carotenoids when considering four pairs of song and plumage characters (electronic supplementary material, table S6). However, these models performed relatively poorly in terms of ΔAICc scores and the directionality of this interaction was inconsistent, suggesting that there is not a generalized relationship between song and plumage elaboration among lineages with carotenoid pigmentation that is distinct from lineages that lack carotenoids. Furthermore, habitat has been shown to impart selective pressures on both plumage and vocalizations. However, our findings suggest that the relationship between song and plumage elaboration is not affected by broad-scale differences in habitat (electronic supplementary material, figure S3 and table S8).
If no correction for multiple hypothesis testing is applied, a complex mixture of positive, negative and non-significant correlations appears that depends on both the characters and the clade in question. This heterogeneity may be unsurprising given that negatively correlated , positively correlated  and uncorrelated  evolution between secondary sexual signals have all been documented in birds. We found the strongest correlations appeared in relatively species-poor subfamilies of tanagers, such as Dacninae (n = 9) and Nemosiinae (n = 4). However, small sample sizes can lead to inflated effect sizes , suggesting that these correlations should be interpreted with caution.
Differences in methodological approaches make it difficult to directly compare our findings to those of earlier studies. For example, previous studies [11,52] have used phylogenetic independent contrasts, which assume a model of Brownian motion (BM) that may be inappropriate when considering traits that are evolutionarily labile, such as song or plumage. Under a model of BM, we would conclude that colour PC1 and song PC1 are positively correlated among all tanagers (β = 0.64 ± 0.24, p = 0.009). However, the BM model provides a substantially worse fit to the data compared with Pagel's λ (ΔAICc = 165.7), suggesting that comparative studies should not rely solely on independent contrasts. Moreover, our study demonstrates that different components of visual and acoustic signal complexity may display different correlational patterns; therefore, it behoves future studies to consider multiple elaboration indices, as we have done here, or provide strong a priori justification for using a single character.
Despite methodological differences, the pattern of uncorrelated evolution between song and plumage elaboration found here has also been recovered by previous studies . Similar to the rationale provided by Ornelas et al. , the observed lack of an evolutionary trade-off between song and plumage elaboration in tanagers is consistent with multiple, non-exclusive interpretations. One underlying assumption of the transfer hypothesis is that song and plumage investment both rely on individual condition . Support for bird song as an honest indicator of individual condition has come mostly from experimental studies that demonstrate the effect of developmental stress during the ontogenetically early phase of song learning on adult vocal performance [54,55]. Furthermore, bouts of prolonged singing also take away from time that could be spent foraging  and induce increased thermoregulatory costs and predation risks . However, the metabolic cost of song production does not appear to scale with song complexity among species . Thus, while it is generally accepted that singing is a costly behaviour at some level, the precise physiological pathway and the specific condition-dependent song feature under sexual selection appear to vary greatly among lineages [54,59,60]. This variation could mask any large-scale patterns among species involving song characters that rely on individual investment. Moreover, quantifying signal complexity is a complex process in itself; the song and plumage characters considered here almost certainly do not encapsulate all aspects of signal complexity. Thus, we may have missed ‘true’ correlations among levels of signal elaboration because of the inherently complicated nature of quantifying multifaceted phenotypes, such as song and plumage.
Tanager coloration is the result of carotenoid and melanin pigmentation and feather microstructure . There is longstanding evidence that carotenoids act as condition-dependent signals that are linked to dietary intake in certain avian systems [61–64]. However, the evidence for condition dependence in melanin and structural colour signals is far more contentious. A trade-off between resource allocation in melanin-based pigmentation and other physiological processes, such as resistance to oxidative stress or immunocompetence, has been documented in certain species [65–69] but not others [70–73]. Additionally, a recent meta-analysis on the relationship between eumelanin-based coloration and fitness in birds concluded that the directionality and magnitude of selection exerted on melanin-based ornamentation varies substantially among species and the plumage patch in question . If a consistent link between individual condition and plumage brilliance, colour volume or a composite index of plumage complexity does not exist across taxa, then it is perhaps unsurprising that we were unable to recover evidence of evolutionary trade-offs between song and plumage at the macroevolutionary scales considered here.
Another key assumption of the transfer hypothesis is that sexual selection, both in terms of male–male competition and female choice, favours the expression of a single trait in males. However, there are many avian systems where different sexual signals serve to convey important information to separate receivers, such as red-collared widowbirds (Euplectes ardens) and common yellowthroats (Geothlypis trichas). Thus, song and plumage elaboration may be maintained, rather than inversely correlated, across tanagers if investment in these signals reflects different sources of sexual selection [5,77]. If both signal types convey information to conspecifics, variation in the strength of sexual selection among taxa could result in either uncorrelated or positively correlated patterns of elaborate sexual signal evolution across species.
Moreover, interspecific variation in mating systems and patterns of extrapair paternity may account for variation in inferred correlations between song and plumage elaboration among clades . Theory predicts that if mating opportunities (or other resources that affect male fitness) follow a negative binomial distribution among species, wherein a minority of species possess disproportionately more extrapair copulations than the majority, then song and plumage complexity may be positively correlated to reflect variation in the strength of sexual selection . Unfortunately, the mating systems and levels of extrapair paternity in tanagers remain largely unknown. Thus, we are unable to comment on whether variation in breeding systems has any influence on the diversity of correlations observed here.
In sum, our study uncovered a broad-scale pattern of independent evolution of song and plumage complexity in a large clade of phenotypically diverse songbirds. These patterns are consistent across different indices of signal elaboration, although the strength and directionality of correlations vary among subfamilies. Moreover, the relationship between song and plumage elaboration is not affected by variation among other life-history traits, such as habitat and the presence of carotenoids or structural coloration. Thus, elaborate sexual signals of differing modalities evolve independently in tanagers, the largest radiation of Neotropical songbirds.
Additional supporting data are available via the Dryad data repository: doi:10.5061/dryad.7mj61. Through Dryad, we have provided raw vocalization data, song and plumage elaboration data used for comparative analyses and the set of 50 phylogenies used in comparative analyses as a nexus file.
This research was funded in part by the National Science Foundation (IBN-0217817, DEB-0315416, and DEB-1354006), the National Geographic Society, a Ralph W. Schreiber Ornithological Research Award from the Los Angeles Audubon Society (A.J.S.) and an American Ornithologists’ Union Research Award (A.J.S.). For additional financial support, we are grateful to the CSU Sally Casanova Predoctoral Fellowship (N.A.M.), the Crouch Scholarship for Avian Behavior (N.A.M.), the Mabel Meyers Memorial Scholarship (N.A.M.) and the National Science Foundation Graduate Research Fellowship (A.J.S.; 2008074713).
We are very grateful to M. Young and T. Bishop at the Macaulay Library for facilitating the transfer of thousands of recordings. We would like to thank A. Bernabe, D. Emmerson, J. L. Espinoza, H. Macdonald and C. Threlkeld for their assistance with collecting song data. Additionally, we also thank the curators and staff at the following museums for access to specimens in their care for colour measurements: Louisiana State University Museum of Natural Science, Natural History Museum of Los Angeles County, Academy of Natural Sciences of Philadelphia, Museum of Vertebrate Zoology at the University of California, Berkeley, California Academy of Sciences and American Museum of Natural History. We thank R. Clark, P. Pryde, I. Lovette, R. Harrison, P. Title, L. Klicka, members of the Lovette and Harrison lab groups, and two anonymous reviewers for valuable feedback on earlier versions of this manuscript.
- Received April 22, 2014.
- Accepted May 20, 2014.
- © 2014 The Author(s) Published by the Royal Society. All rights reserved.