We ask whether rates of evolution in traits important for reproductive isolation vary across a latitudinal gradient, by quantifying evolutionary rates of two traits important for pre-mating isolation: avian syllable diversity and song length. We analyse over 2500 songs from 116 pairs of closely related New World passerine bird taxa to show that evolutionary rates of song length doubled with latitude in both of the two main groups of passerines, oscines and suboscines. For syllable diversity, oscines (which transmit song culturally) evolved more than 20 times faster at high latitudes than at low latitudes, whereas suboscines (whose songs are innate in most species and which possess very simple songs with few syllable types) show no clear latitudinal gradient in rate. Evolutionary rates in oscines and suboscines were similar at tropical latitudes for syllable diversity as well as for song length. These results suggest that evolutionary rates in traits important to reproductive isolation and speciation are influenced by latitude and have been fastest, not in the tropics where species diversity is highest, but towards the poles.
One explanation for the well-known latitudinal gradient in species diversity is that rates of phenotypic change in traits important to speciation and the build-up of biodiversity are accelerated in the tropics [1,2], but there have been few direct measurements. One such trait is the advertisement vocalizations used by birds, amphibians, insects and other groups to communicate with conspecifics. Bird song is regularly invoked as an important species recognition trait [4–6]. Demonstration of faster rates of song divergence in the tropics would support the role of accelerated tropical evolution in contributing to the build-up of high biodiversity there. In this paper, we study rates of avian song evolution over the past few million years, across the entire New World latitudinal gradient.
Selection acting on song variants could drive different rates of song evolution in tropical and temperate regions in at least three ways [6,7, ch. 12]: (i) adaptation to various ecological factors, including acoustic adaptation to local environmental conditions [8–11], (ii) sexual selection, in which new mutations might be more effective in stimulating conspecific receivers of the signal, and (iii) interspecific interactions, which may lead to both convergence and divergence of song (e.g. [6,8, ch. 14; 12]). If any of these selective forces varied with latitude, latitudinal gradients in rates of song evolution could result. Sexual selection by female choice, for example, has been proposed to be more intense in seasonally variable environments, which could drive greater divergence, assuming that mutational input generates a variety of alternative attractive signals.
Mutational input may be both cultural and genetic. Song transmission in most non-passerines and in suboscine passerines is primarily genetic. However, in some groups, notably the oscine passerines, hummingbirds and parrots, vocal transmission has a strong cultural component [14–18], and song is usually learnt from one or more unrelated individuals. Cultural transmission of song is subject to imprecise copying or incorporation of novel elements during learning, which can be accelerated by inadequate tutoring [19–22]. Cultural mutations are thought to arise far more frequently than genetic mutations, and as a result, song evolution is often proposed to be faster in groups with cultural song transmission (e.g. [15,23,24]), and has been proposed by some to drive the high species diversity of oscines (e.g. , but see ).
Here, we use a comparative approach to quantify evolutionary rates of syllable diversity and song length across the New World to determine whether rates of song evolution vary with latitude, and between oscines and suboscines.
(a) Sister pairs
A total of 116 phylogenetically independent sister pairs were identified from molecular phylogenetic trees. We included New World passerine sister pairs for which we could obtain genetic distance estimates (for data available in GenBank up to September 2009) and songs from at least three individuals. These consisted of 95 pairs representing sister species (i.e. most closely related species to the exclusion of all other recognized species) and 21 pairs of phylogroups (phylogenetically differentiated clades of one or more subspecies) within species. No phylogroup contrasts were nested within sister species contrasts, which means each data point is statistically independent. We excluded sister pairs from Mimidae owing to the high rate of mimicry of other species' songs in this family. Given the stochastic nature of sequence evolution, estimates of genetic distances for very young sister pairs are subject to error, so we excluded sister pairs whose genetic distances were less than 0.75 per cent. Sister pairs did not differ significantly in genetic distance between tropical and temperate latitudes (p = 0.14, two-tailed t-test; mean tropical = 6.6%, n = 71; mean temperate = 5.8%, n = 45).
Sister species and phylogroup splits within species represent a young set of taxa and their current midpoint latitudes (measured as the mean of the two absolute midpoint latitudes for each member of the sister pair) provide a rough approximation of their placement on the latitudinal gradient. We excluded species with latitudinal ranges greater than 45°, and species pairs whose absolute midpoint latitudes differed by more than 25°. While it is hard to compress the concept of geographical range into a single number (i.e. midpoint latitude), we note that this measure does uncover meaningful latitudinal patterns in rates of evolution (see below). Latitudes were measured from GIS shapefiles of each species' geographical range. Latitudes may change through time and while this will have some effect on the ranges of sister species and phylogroups, it will have a much larger effect on deeper nodes in phylogenetic trees. For this reason, we restrict our analysis to sister species pairs or to phylogenetic splits within species.
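The latitude assignment and the two exclusion rules described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each range has already been reduced to its southern and northern limits in signed degrees (the text only says latitudes were measured from GIS shapefiles), and the function names are hypothetical.

```python
def absolute_midpoint(lat_south, lat_north):
    """Absolute midpoint latitude of a range given its limits in signed degrees."""
    return abs((lat_south + lat_north) / 2.0)


def pair_latitude(range_a, range_b, max_extent=45.0, max_gap=25.0):
    """Mean absolute midpoint latitude of a sister pair, or None if excluded.

    range_a, range_b: (southern limit, northern limit) tuples; this
    representation of a range is an assumption made for illustration.
    """
    # Exclude taxa whose latitudinal range spans more than 45 degrees.
    for south, north in (range_a, range_b):
        if north - south > max_extent:
            return None
    mid_a = absolute_midpoint(*range_a)
    mid_b = absolute_midpoint(*range_b)
    # Exclude pairs whose absolute midpoints differ by more than 25 degrees.
    if abs(mid_a - mid_b) > max_gap:
        return None
    return (mid_a + mid_b) / 2.0
```

For example, a pair with ranges spanning 10°S to 10°N and 0° to 20°N has absolute midpoints of 0° and 10°, giving a pair latitude of 5°.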
Parameters of the GTR-γ model of sequence evolution were estimated using maximum likelihood in PAUP 4.0b10 from a neighbour-joining tree rooted for the whole dataset with Struthio. These parameter estimates were used to calculate genetic distances under the GTR-γ model from cytochrome b sequences. Importantly, mutation rate estimates for cytochrome b do not appear to vary with latitude in birds [30,31]. Thus, we used genetic distance as a rough approximation for evolutionary time. In addition, we included three sister pairs for which cytochrome b sequences were not available, but for which sequences from similarly evolving protein-coding mitochondrial genes were. These were included to increase coverage of temperate southern South America. Genetic distance measures haplotype coalescence, which generally predates population divergence by a few hundred thousand years in birds at both temperate and tropical latitudes [32,33].
Each sister pair was categorized into data subsets (see §2c) based on geographical overlap (allopatric or sympatric) or taxonomic placement within the passerines (oscine or suboscine as well as major divisions within the suboscines; see below), and we calculated evolutionary rates of song divergence for each subset (see below). Sister pairs were classified as allopatric if breeding ranges of each member of the pair did not come into geographical contact, were separated by a barrier (e.g. Amazonian rivers), or were parapatric, but not overlapping (except locally along the contact zone); or sympatric if the pair overlapped in geographical range (typically by 100 km or more). New World suboscines are further divided into non-tracheophone suboscines (e.g. flycatchers, manakins, cotingas and relatives) and tracheophone suboscines (e.g. antbirds, antpittas, ovenbirds, woodcreepers and relatives), which differ in the musculature of the syrinx used to vocalize. We analysed song evolution in each of these groups separately (see below).
(b) Song analysis
Songs were obtained from the Macaulay Library of Natural Sounds (http://macaulaylibrary.org), the Xeno-canto database (http://www.xeno-canto.org), commercially available recordings, and from field recordings by the first author. Whenever possible, we analysed songs from five or more individuals per species or phylogroup (but included species with as few as three individuals provided songs were not highly variable between individuals in those species; total individuals = 1073; average = 4.6 individuals per species or phylogroup; 35 species had three individuals, 54 had four, 113 had five and 28 had six or more individuals). Except for recordings that possessed only a single song, we measured two or more songs per individual (total number of songs = 2546; average per species = 10.5; average per individual = 2.3). In-depth studies of song in individual species generally include much larger sample sizes of individuals and songs than used here. However, for the song measures we use, sampling a few songs from several individuals provides a reasonable estimate of mean trait values; correction for small sample size in our maximum likelihood analysis of rates makes little difference (see below).
To our knowledge, all songs used were recorded from adult birds with established songs (e.g. subsongs, often produced by juvenile birds, were not included in our analysis). However, we generally lacked information regarding sex. Female song is more common in the tropics and in some species may differ from male song in complexity and duration [9,15,35–37]. Our sample may therefore include a higher proportion of female songs in the tropical versus temperate dataset. Comparing male and female songs more often in the tropical dataset should bias our analysis towards greater song divergence in the tropics. Despite this potential bias, we still find greater song divergence at high latitudes in most comparisons. Males and females of some species pairs duet or countersing. We avoided inclusion of duets where male and female songs overlapped temporally (some duets may have been included for some wren species where male and female songs blend so well that it is often difficult to determine if a duet is involved), but included countersinging birds where male and female songs did not overlap. In such cases, we measured the first (i.e. longer) song of each countersinging episode, assuming it to be the male song. We also made no distinction between songs from solo males and songs from countersinging males on adjacent territories.
We measured song duration (in seconds), the number of syllables per song, and the number of distinct types of syllables per song for each species (figure 1). In most species, song comprises a discrete sequence of one or more syllables followed by a period of silence before another song is given. In a small number of passerine species song comprises a long string of syllables without obvious breaks that may last several minutes at a time. We analyse only species with discrete songs (the longest song included was 62 s). We define syllables as the shortest repeated cluster of notes within individual songs. Syllables consist of either a single note (single trace on a spectrogram) or multiple notes when different note types are clustered in a repeat sequence, as shown in figure 1. In some species, a syllable may be repeated multiple times, gradually changing from one type to another. We classified such cases as possessing two syllable types. We used the number of syllable types per song as a measurement of syllable diversity after correcting for covariance with song length (see principal component analysis below). We consider songs with many syllable types to be complex (figure 1c) while those with few syllable types are simple (figure 1a). We note that complexity may also occur in other aspects of song that we have not measured. For example, our measure treats all syllable types equally, while syllables themselves may be composed of one or multiple notes.
Songs were visualized in RAVEN 1.4b (website: www.birds.cornell.edu/raven) by the lead author. The order in which sister pairs were measured was randomized with respect to taxonomy and latitude, but songs for each pair were analysed sequentially. For each song measurement, values were first averaged across songs within an individual, and then across individuals in each taxon. A principal component analysis based on the covariance matrix of all taxa was performed on log-transformed measurements of song duration, number of syllables per song and number of syllable types per song. Euclidean distances between values from the first two principal components were used as our measure of evolutionary divergence.
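A covariance-matrix PCA of log-transformed song measurements can be sketched as below. This is an illustrative outline only: it assumes natural logarithms (the base is not stated in the text), a taxa-by-measurement array as input, and hypothetical function names; it is not the authors' code.

```python
import numpy as np


def pca_covariance(measurements):
    """PCA on the covariance matrix of log-transformed measurements.

    measurements: array of shape (n_taxa, n_traits) holding taxon means,
    e.g. duration, syllables per song, syllable types per song.
    Returns (scores, loadings, proportion of variance explained).
    """
    logged = np.log(np.asarray(measurements, dtype=float))  # natural log assumed
    centred = logged - logged.mean(axis=0)
    cov = np.cov(centred, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder so PC1 explains the most variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centred @ eigvecs
    return scores, eigvecs, eigvals / eigvals.sum()


def pair_distance(scores, i, j, component):
    """Distance between taxa i and j along a single principal component."""
    return abs(scores[i, component] - scores[j, component])
```

Because the analysis uses the covariance rather than the correlation matrix, traits with larger variance on the log scale contribute more to the leading components. The sign of each component is arbitrary, which does not affect distances between taxa.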
(c) Likelihood analysis
We modelled evolutionary change in trait divergence between sister species pairs under two models: a random walk model (modelled as Brownian motion, BM hereafter) and a random walk model within a constrained trait space such that departures in trait values away from the starting value (i.e. when sister species last shared a common ancestor) become more difficult as distance from the starting values increases (modelled as an Ornstein–Uhlenbeck process, OU hereafter). The expected squared difference between trait values for two species (V) for each model is [38,39]

$$V_{\mathrm{BM}} = \beta T_i \tag{2.1}$$

$$V_{\mathrm{OU}} = \frac{\beta}{2\alpha}\left(1 - e^{-2\alpha T_i}\right) \tag{2.2}$$

where β is the rate parameter (i.e. it measures the intensity of random fluctuations in trait value), $T_i$ is sequence divergence (i.e. twice the relative time since divergence from a common ancestor) and α is the constraint parameter. The expected absolute divergence in trait values for a pair of species $\mu_{i1}$ and $\mu_{i2}$ is derived from a half-normal distribution as

$$E\left(\lvert \mu_{i1} - \mu_{i2} \rvert\right) = \sqrt{\frac{2V}{\pi}} \tag{2.3}$$

Variance and absolute divergence in trait values increase through time indefinitely under the BM model, but approach an asymptote set by α under the OU model. The value of the asymptote increases as α decreases. As α approaches zero the OU model reverts to the BM model. The probability that sister species pair i of age $T_i$ with Euclidean distance $E_i$ evolved under evolutionary rate β for both the BM and OU models is derived here from a half-normal distribution as

$$p_i = \frac{2}{\sqrt{2\pi V}} \exp\left(-\frac{E_i^2}{2V}\right) \tag{2.4}$$

The likelihood computed across all sister pairs of taxa is the product of all $p_i$

$$L = \prod_i p_i \tag{2.5}$$
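The likelihood calculation can be sketched as follows. The sketch assumes the standard forms V = βT for BM and V = (β/2α)(1 − e^(−2αT)) for OU, together with a half-normal density for the observed distance; these are consistent with the verbal description here (an asymptote that grows as α shrinks, and OU reverting to BM as α approaches zero) but should be checked against the published equations. Function names are hypothetical.

```python
import math


def expected_variance(beta, T, alpha=0.0):
    """Expected squared difference V between sister taxa after divergence T."""
    if alpha == 0.0:
        return beta * T                                  # Brownian motion
    return beta / (2.0 * alpha) * (1.0 - math.exp(-2.0 * alpha * T))  # OU


def expected_abs_divergence(beta, T, alpha=0.0):
    """Mean of a half-normal distribution whose squared scale is V."""
    return math.sqrt(2.0 * expected_variance(beta, T, alpha) / math.pi)


def log_density(E, beta, T, alpha=0.0):
    """Log of the half-normal density of an observed distance E."""
    V = expected_variance(beta, T, alpha)
    return math.log(2.0) - 0.5 * math.log(2.0 * math.pi * V) - E * E / (2.0 * V)


def log_likelihood(distances, times, beta, alpha=0.0):
    """Sum of log densities across sister pairs (log of the product of p_i)."""
    return sum(log_density(E, beta, T, alpha) for E, T in zip(distances, times))
```

Working with the log-likelihood rather than the raw product avoids numerical underflow when many pairs are multiplied together, and it matches the later use of log-likelihood units for confidence regions.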
The squared distance $(\mu_{i1} - \mu_{i2})^2$ is generally biased upwards by sampling and measurement error within species, and this bias can become important when few individuals are measured and the true distance between species is small. The bias is directly analogous to the situation in ANOVA where the between-group variance (variance in sample means) includes a component owing to true differences between the groups and a component owing to sampling error within groups: even if all groups had the same true mean, the sample means would differ. In ANOVA, an unbiased estimate of the variance between groups is obtained by subtracting the variance expected from the sampling process, which is itself estimated from the variance within groups and the number of observations within each group. The unbiased estimate is given as

$$s_A^2 = \frac{MS_{\mathrm{between}} - MS_{\mathrm{within}}}{n_0}$$

where $MS_{\mathrm{between}}$ is the group mean square, $MS_{\mathrm{within}}$ is the error mean square, and $n_0$ is the weighted number of individuals measured in each pair of species (, p. 214). When just two groups are considered, the variance among groups is exactly equal to half the squared distance. We therefore calculated corrected squared distances as

$$2\,\frac{MS_{\mathrm{between}} - MS_{\mathrm{within}}}{n_0}$$

and replace $(\mu_{i1} - \mu_{i2})^2$ in equation (2.5) with this value. Corrected squared distances with negative values were set to 0 (no divergence).
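The sample-size correction can be illustrated with a short one-way ANOVA calculation for two groups. This sketch assumes the textbook formula for the weighted group size with two groups, n0 = N − (n1² + n2²)/N, and takes one trait value per individual as input; the function name is hypothetical.

```python
def corrected_squared_distance(sample_a, sample_b):
    """Squared distance between two taxon means, corrected for sampling error.

    sample_a, sample_b: trait values (e.g. PC scores), one per individual.
    Assumes at least three individuals in total so the within-group
    degrees of freedom are positive.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    total = n_a + n_b
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    grand = (sum(sample_a) + sum(sample_b)) / total

    # Between-group mean square; with two groups there is 1 degree of freedom.
    ms_between = n_a * (mean_a - grand) ** 2 + n_b * (mean_b - grand) ** 2
    # Within-group (error) mean square with total - 2 degrees of freedom.
    ss_within = (sum((x - mean_a) ** 2 for x in sample_a)
                 + sum((x - mean_b) ** 2 for x in sample_b))
    ms_within = ss_within / (total - 2)

    # Weighted group size for unequal samples (two-group case, assumed form).
    n0 = total - (n_a ** 2 + n_b ** 2) / total

    corrected = 2.0 * (ms_between - ms_within) / n0
    return max(corrected, 0.0)  # negative estimates are set to zero
```

With samples [1, 2, 3] and [4, 5, 6], the raw squared distance between means is 9 and the corrected value is 9 − 2(1)/3 ≈ 8.33, showing how the within-group variance is subtracted out.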
Because we wanted to determine whether diversification in song varies along the latitudinal gradient, both α and β were assumed to be linear functions of latitude, $\alpha_i = b_\alpha \mathrm{Lat}_i + c_\alpha$ and $\beta_i = b_\beta \mathrm{Lat}_i + c_\beta$, where $\mathrm{Lat}_i$ is the absolute midpoint latitude for sister pair i. This model has two parameters under the BM model ($b_\beta$, $c_\beta$) and four under the OU model ($b_\alpha$, $c_\alpha$, $b_\beta$, $c_\beta$). All sets of parameter values within 1.92 log-likelihood units of the maximum log-likelihood are considered to lie within the 95% CI of the best-fit values.
We estimated model parameters and likelihoods for the entire dataset or separately for the following data subsets: (i) allopatric and sympatric sister pairs, (ii) oscines and suboscines, (iii) oscines, tracheophone suboscines and non-tracheophone suboscines, (iv) allopatric and sympatric groups of oscines and suboscines, and (v) allopatric and sympatric groups of oscines, tracheophone suboscines and non-tracheophone suboscines. For each, we estimated rates with and without latitude (all bβ and bα set to 0). For each data subset we tested 1.44 million combinations of parameters for BM (1200 values of cβ ranging between 0.001 and 3, each tested with 1200 different slopes) and approximately 146 million combinations for OU (216 values of cβ ranging between 0.0025 and 3, each with 216 slopes; 56 values of cα ranging between 0 and 3, each with 56 slopes). Maximum likelihood analyses were calculated in R and the code has been submitted to the package GEIGER.
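A grid search of this kind, shown here for the BM model with a latitude-dependent rate, might look like the sketch below. The authors worked in R; this Python version is only an illustration. It assumes V = βT with β = slope × latitude + intercept and a half-normal density for each distance, rejects parameter combinations that give a non-positive rate for any pair (a handling choice not described in the text), and uses hypothetical function names.

```python
import math


def bm_log_likelihood(distances, times, latitudes, slope, intercept):
    """Log-likelihood under BM with the rate a linear function of latitude."""
    total = 0.0
    for E, T, lat in zip(distances, times, latitudes):
        beta = slope * lat + intercept
        if beta <= 0.0:
            return -math.inf          # rates must be positive (assumed rule)
        V = beta * T
        total += math.log(2.0) - 0.5 * math.log(2.0 * math.pi * V) - E * E / (2.0 * V)
    return total


def grid_search(distances, times, latitudes, intercepts, slopes, support=1.92):
    """Evaluate every (intercept, slope) combination on the grid.

    Returns the best (log-likelihood, slope, intercept) and the list of
    (slope, intercept) pairs within `support` log-likelihood units of it.
    """
    best = (-math.inf, None, None)
    surface = []
    for c in intercepts:
        for b in slopes:
            ll = bm_log_likelihood(distances, times, latitudes, b, c)
            surface.append((ll, b, c))
            if ll > best[0]:
                best = (ll, b, c)
    supported = [(b, c) for ll, b, c in surface if ll >= best[0] - support]
    return best, supported
```

Setting `slopes=[0.0]` gives the model without latitude, so the same routine covers both versions of the BM analysis. The OU search would add two further nested loops over the constraint intercept and slope.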
We compared the maximum likelihood values for each analysis using the Akaike information criterion corrected for sample size (AICc hereafter), which provides a measure of goodness of fit that discourages overfitting by penalizing models according to their number of parameters. The model with the lowest AICc value best explains the data.
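For reference, the small-sample corrected criterion is commonly computed as AICc = −2 ln L + 2k + 2k(k + 1)/(n − k − 1), where k is the number of parameters and n the number of observations (here, sister pairs). A minimal sketch, with a hypothetical function name:

```python
def aicc(log_likelihood, n_params, n_obs):
    """Akaike information criterion with the small-sample correction.

    Requires n_obs > n_params + 1 so the correction term is defined.
    """
    aic = -2.0 * log_likelihood + 2.0 * n_params
    correction = (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return aic + correction
```

With equal log-likelihoods, a four-parameter OU model with latitude always scores worse than a two-parameter BM model with latitude, so the extra parameters must improve the fit enough to offset the penalty.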
The entire dataset of sister pairs, age, latitudes and PC values is found in the electronic supplementary materials. The first two PCs together explained 88 per cent of the variance (PC1 55.4%, PC2 32.1%). PC1 had positive loadings for all three measurements, but was influenced primarily by song duration (loading = 0.68) and the number of syllables per song (0.69), while the number of syllable types per song had less influence (0.25). We interpret PC1 as a song length component with positive values representing long songs with many syllables and negative values representing short songs with few syllables. PC2 had negative loadings for song duration (−0.26) and number of syllables per song (−0.10), but was influenced primarily by number of syllable types (0.96). We interpret PC2 as a syllable diversity component that has been corrected for song length.
We obtained similar parameter estimates with support for the same models regardless of whether we used corrected or uncorrected Euclidean distances and report only rates for the corrected dataset. Corrected Euclidean distances for PC1 and PC2 are shown in figure 2, separately for tropical (sister pairs with midpoint latitudes less than 22°) and temperate latitudes (greater than 22°). For both PC1 and PC2, the best-fit models (i.e. with lowest AICc values) were BM models which included latitude (table 1). The more complex OU models did not improve AICc values for either PC, failing to support the presence of bounded evolution in song length and syllable diversity over the time spans covered by our dataset. This is in direct contrast to song frequency, which for the same set of species included in this dataset, strongly supports an OU model.
For PC1, the best-fit model was a BM model that included latitude. However, the effect of latitude on PC1 is much weaker than for PC2 (see below): the 95% CI envelop a negative slope (figure 3c) and the next best model for PC1 (without latitude) received only marginally larger AICc values (table 1). The third best-supported model (BM with latitude estimating separate rates for oscines and suboscines) found that suboscines had slightly faster rates (β at equator = 0.19, at 66° = 0.40) than oscines (β at equator = 0.08, at 66° = 0.32) at all latitudes, opposite to the pattern expected if song learning drives faster song length divergence. Other models for PC1 in table 1 received weak support.
The best-fit model for PC2 was a BM model that included latitude and supported separate evolutionary rates for oscines and suboscines. Between 0° and 14° latitude, estimated rates were similar for oscines and suboscines (figure 2c). However, the evolutionary rate increased significantly with latitude in oscines, while the rate in suboscines did not vary much with latitude (figure 2c and 3a,b). More complex models that estimated separate rates for allopatric and sympatric sister pairs of oscines and suboscines were not supported by AICc. Likewise, models that included separate rates for tracheophone and non-tracheophone suboscines were not supported by AICc, though sample sizes of tracheophones were low.
Comparisons between evolutionary rates in oscines and suboscines are also reflected in the levels of variance between individuals of the same species. Variance in PC1 within species did not differ significantly between oscines and suboscines (p = 0.17, two-tailed t-test). In contrast, variance in syllable diversity (PC2) across individuals within species was more than five times greater in oscines than in suboscines and the difference was highly significant (p < 0.0001; two-tailed t-test).
We find a clear effect of latitude on rates of evolution in oscine syllable diversity (PC2), and a weaker effect on song length (PC1) in both oscines and suboscines. For song length (PC1), our maximum likelihood estimates of evolutionary rates increased twofold with latitude with comparable rates at each latitude in oscines and suboscines (figure 2b,d). For syllable diversity (PC2), rates in oscines and suboscines were similar in the tropics, while evolutionary rates in oscines were more than 20 times higher at temperate latitudes, but showed no latitudinal pattern in suboscines (figure 2a,c).
We suggest that the lack of a latitudinal increase in evolutionary rates of suboscine syllable diversity may reflect an evolutionary constraint on this trait. Song length (this study) and frequency vary considerably between suboscine species, but the number of syllable types per song rarely exceeds two (suboscines with a mean = 1.8 ± 0.08 s.e. cf. oscines with a mean = 3.3 ± 0.21 s.e.). Thus, the evolution of syllable diversity appears to be constrained in suboscines. This constraint might result from the fact that suboscines transmit their songs genetically and not culturally. For example, the introduction of new song syllables might be relatively slow without song learning, but song length evolution might occur with relative ease, even in the suboscines, which could combine or separate existing syllables to alter the length of their songs.
Both natural and sexual selection, and the interaction between the two, could drive faster song divergence at high latitudes. Pleistocene glacial cycles, for example, had a greater impact on diversification at high latitudes, and may have caused selection pressures to fluctuate greatly in time and space. Divergence in habitat mediated by climatic cycles may have caused correlated divergence in song complexity in order to optimize song transmission (with dense habitats favouring simpler songs [8–11,45]). High extinction rates that accompanied glacial advances at high latitudes may have further created vacancies in vocal space which other species might rapidly evolve to fill.
Seasonal changes in climate are also more intense at high latitudes and could intensify selection pressures compared with tropical regions (e.g. [7,13]). For example, the intensity of sexual selection could increase with latitude because the shorter breeding season at high latitudes increases the importance of male song in mate attraction and territorial defence [7,47,48], the burst of food availability during the high latitude breeding season decreases the cost of singing, or the higher population density at high latitudes increases female choosiness or male competition for territories. Intersexual selection generally favours increased exaggeration of song [7,13,47,49]. The specific traits that become exaggerated by sexual selection in each species belonging to a species pair may be somewhat arbitrary: song length, syllable diversity or other aspects of song, plumage coloration or other traits may become targets of selection in different species, which would promote divergence [7,50].
Once members of a species pair become sympatric, selection may drive divergence in song characters in order to reduce maladaptive hybridization (i.e. reinforcement). Secondary sympatry in birds evolves about one and a half million years faster on average at high latitudes than at the equator. The faster attainment of sympatry at high latitudes has previously been invoked to explain the more rapid divergence of colour patterns in high latitude birds. Our analysis did suggest that sympatric pairs show faster latitudinal increases in rates of evolution of syllable diversity (oscines only) and song length than allopatric pairs, but the differences are not significant. When evolutionary rates are calculated for allopatric sister pairs alone, rates of evolution in syllable diversity in oscines still increase greatly with latitude. Strong latitudinal effects on song evolution do not require sympatry.
Our analysis suggests that recent divergence in key species discrimination traits (oscine syllable diversity, and oscine and suboscine song length) has happened more quickly, not in the species-rich tropics, but in depauperate faunas at high latitudes. A similar pattern of faster evolution in birds at high latitudes was recently reported for plumage coloration, another key species discrimination trait, and for song frequency. Together, these studies on song and colour evolution challenge the notion that areas with high species diversity also have the highest evolutionary rates in traits important to speciation, at least over the past several million years. Because species recognition traits like song and colour provide important pre-mating reproductive barriers [4,6,52], the faster rates at high latitudes may result in faster rates of pre-mating reproductive isolation there. Recent estimates of speciation rates support high latitude regions as a hotbed of evolutionary divergence, but suggest that diversity remains low there owing to high extinction rates. Alternatively, rates of speciation, and of divergence in species discrimination traits like song length and syllable diversity, might have at one time been faster than at present in the tropics and slowed as the number of species there increased. Either way, our results for syllable diversity and length suggest that current evolutionary rates in species discrimination traits are not positively correlated with levels of biodiversity.
We thank Trevor Price, Dolph Schluter, Natasha Bloch, Emma Grieg, Elizabeth Scordato, Ben Taft, Thomas Tietze, Nate Lovejoy and three anonymous reviewers for comments. Trevor Price suggested the ANOVA approach to correct for sample size. J.W. was supported by start-up funds from the University of Toronto, a National Geographic Waitt Grant, and the Natural Sciences and Engineering Research Council of Canada. D.W. was supported by a grant from the National Science Foundation to T. Price.
- Received September 22, 2010.
- Accepted October 22, 2010.
- This Journal is © 2010 The Royal Society