Royal Society Publishing

Positive selection at the ASPM gene coincides with brain size enlargements in cetaceans

Shixia Xu, Yuan Chen, Yuefeng Cheng, Dan Yang, Xuming Zhou, Junxiao Xu, Kaiya Zhou, Guang Yang


The enlargement of cetacean brain size represents an enigmatic event in mammalian evolution, yet its genetic basis remains poorly explored. One candidate gene associated with brain size evolution is the abnormal spindle-like microcephaly associated (ASPM), as mutations in this gene cause severe reductions in the cortical size of humans. Here, we investigated the ASPM gene in representative cetacean lineages and previously published sequences from other mammals to test whether the expansion of the cetacean brain matched adaptive ASPM evolution patterns. Our analyses yielded significant evidence of positive selection on the ASPM gene during cetacean evolution, especially for the Odontoceti and Delphinoidea lineages. These molecular patterns were associated with two major events of relative brain size enlargement in odontocetes and delphinoids. It is of particular interest to find that positive selection was restricted to cetaceans and primates, two distant lineages both characterized by a massive expansion of brain size. This result is suggestive of convergent molecular evolution, although no site-specific convergence at the amino acid level was found.

1. Introduction

There has been a dramatic expansion in brain size during mammalian evolution [1,2]. Primates and cetaceans are the most remarkable examples of such massive brain size expansion, especially of the cerebral cortex—the brain area associated with higher cognitive functions [3]. Previous research indicates that the evolution of larger brain sizes in these groups has been driven by selection for life in increasingly complex social environments and the need for highly developed cognitive abilities [4,5]. Accordingly, the primate brain, and especially the human brain, has been recognized as one of the most striking evolutionary adaptations [6,7]. However, it is still unclear whether this is also the case for cetaceans.

Cetaceans diverged from terrestrial artiodactyls approximately 56–53 million years ago (Ma) [8], representing one of the most enigmatic events in evolution. Early cetaceans (called archaeocetes) diversified through amphibious stages to become fully aquatic by 40 Ma [9]. Extant cetaceans, consisting of highly diversified species subdivided into two suborders, Mysticeti (large rorqual and baleen whales) and Odontoceti (dolphins, porpoises and toothed whales), evolved from archaeocetes at about 34 Ma and dispersed into the world's oceans and estuaries and even some rivers [10,11]. It has been proposed that the transition from land to water selected for the development of distinct, and in some cases more complex, physiological, morphological, sensory and cognitive traits, such as underwater vision [12] and hearing ability [13], and comprehension of artificial communication systems [14]. Therefore, relatively larger brain sizes were favoured during this transition. Indeed, paleontological and neuroanatomical data have confirmed that modern cetacean brains are among the largest across all mammals both in absolute and relative size (expressed relative to body size as encephalization level or quotient, EQ [1]). Almost all odontocetes (toothed whales) have above-average levels of encephalization compared with related terrestrial mammals. In particular, some odontocetes possess EQs in the range of 4–5, second only to modern humans (EQ ≈ 7) and significantly higher than any of the modern nonhuman anthropoid primates (highest EQ ≈ 3.3) [15,16]. Recent studies have contributed substantially towards the identification of the putative genetic basis underlying the enlarged brain of primates [6,7,17,18]. The most striking finding in this regard was that positive selection or accelerated evolution was found in a group of genes associated with primary microcephaly (MCPH), a developmental defect characterized by a severe reduction in brain size.
Seven such loci (MCPH 1–MCPH 7) with recessive mutations that lead to MCPH have been identified to date (reviewed in [19]). Of these, MCPH 5 (or abnormal spindle-like microcephaly associated, ASPM) is essential for normal mitotic spindle function in embryonic neuroblasts, with ASPM mutations being the most common cause of MCPH in a clinical sample of humans [17,19,20]. Additionally, studies of the Drosophila and mouse orthologues of ASPM have confirmed that mutations in ASPM reduce brain size (reviewed in [19]). Interestingly, there is strong evidence that some nucleotide changes in ASPM were subject to positive selection in the lineage leading to humans, consistent with a possible role of ASPM in the evolutionary enlargement of human brain size [6,17,20]. Still, the investigation of the genetic basis underlying brain size evolution in cetaceans has begun only very recently, although the adaptive significance and anatomical basis of their brain diversity have long been studied [5,15,16,21,22]. A recent study investigated the MCPH1 gene in cetaceans [23], but did not find compelling evidence of an association between its evolution and brain size enlargement in the group.

In this study, the ASPM gene was investigated in representative species of major cetacean lineages and compared with corresponding sequences from other mammals to determine its contribution to brain size enlargement during cetacean evolution. First, we tested whether positive selection on the ASPM gene across cetacean phylogeny corresponded to brain size expansion events in this group. Second, in comparison with homologous sequences of other mammals, we evaluated whether different cetacean clades or other mammals have experienced different selective regimes, particularly whether cetaceans and primates had similar adaptations (or convergent evolution) at the molecular level.

2. Material and methods

(a) Taxonomic coverage

Fourteen cetacean species from eight families, including nine species of the superfamily Delphinoidea, the group with the highest EQs, were sequenced (see the electronic supplementary material, table S1). In addition, ASPM sequences of nine primates and one artiodactyl (hippopotamus, Hippopotamus amphibius) were downloaded from GenBank (see the electronic supplementary material, table S1). We also retrieved Ensembl-predicted ASPM sequences from published genomes of an additional eight mammal species from six orders. In total, 32 ASPM sequences of high quality and integrity were used (see the electronic supplementary material, table S1 for information on sequence data and accession numbers).

(b) Amplification and sequencing of cetacean ASPM

We designed primers for the two largest ASPM exons (exons 3 and 18; approx. 60% of the transcribed ASPM protein) based on the alignment of the genomic data for the cow, horse and bottlenose dolphin (Tursiops truncatus). These exons were selected as they are characterized by the highest concentration of non-synonymous substitutions, possess most of the mutations that cause human primary microcephaly [17,20] and encompass the main functional sites of the protein [24].

Genomic DNA was extracted from muscle samples by using the DNeasy tissue kit (Qiagen), following the manufacturer's protocol. Polymerase chain reaction (PCR) was carried out in a total volume of 50 μl comprising 2.5 mM MgCl2, 10 mM Tris–HCl (pH 8.4), 50 mM KCl, 0.2 mM of each dNTP, 0.4 μM of each primer, 1.0 unit Ex-Taq DNA polymerase (Takara) and 10–100 ng DNA template. The amplification profile consisted of 5 min at 94°C, followed by 35 cycles of 30 s at 94°C, 30 s at 55°C and 30 s at 72°C, with a final extension of 8 min at 72°C. The amplified PCR products were purified and sequenced in both directions with an ABI 3730 automated genetic analyser.

(c) Cetacean ASPM gene polymorphism

Nucleotide and deduced amino acid sequences of the mammalian ASPM gene were aligned using the Clustal X v. 1.83 program [25]. Average pairwise nucleotide distances (Kimura 2-parameter model, K2P) and Poisson-corrected amino acid distances were computed using MEGA v. 5 [26]. Standard errors of estimates were obtained by running 1000 bootstrap replicates.
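The two distance measures named above have simple closed forms, which the following sketch illustrates. It is not the MEGA implementation: the function names are hypothetical, and skipping gapped or ambiguous positions pair by pair is an assumption made here for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: positions with gaps or ambiguous bases are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        # A change within purines or within pyrimidines is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites      # proportion of transitional differences
    q = transversions / sites    # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def poisson_distance(prot1, prot2):
    """Poisson-corrected distance between two aligned amino acid sequences."""
    if len(prot1) != len(prot2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(prot1.upper(), prot2.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)
    return -math.log(1 - p)


# Toy sequences (not ASPM data): one transition (A -> G) in 10 sites.
d_nt = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
# Toy peptides: one difference in 10 sites.
d_aa = poisson_distance("MKTLLVAGSA", "MKTLLVAGSV")
```

Both corrections grow faster than the raw proportion of differences, reflecting multiple substitutions at the same site.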

(d) Test for selection on the ASPM gene

The non-synonymous to synonymous rate ratio ω (dN/dS) indicates changes in selective pressures, where ω = 1, ω < 1 and ω > 1 correspond to neutral evolution, purifying and positive selection, respectively [27]. The ω ratio was estimated using a codon-based maximum-likelihood method implemented in CODEML program of PAML v. 4.4 package [28]. A well-accepted phylogeny of Laurasiatheria and primates was used as the input tree in all analyses. The topology of Laurasiatherian mammals was based on analyses of approximately 2.1 M base pairs (bp) of 1608 genes from 15 mammalian species [29] and approximately 1.4 M bp of 110 genes from 15 cetacean species [30], whereas the phylogeny of primates was based on approximately 8 M bp of 54 nuclear genes from 186 species [31]. We also used the concatenated two-exon ASPM dataset to estimate the phylogenetic relationships using maximum-likelihood (ML) and Bayesian inference (see the electronic supplementary material, appendix S1). The resulting tree was similar to the well-accepted phylogeny, with only some minor differences within Delphinidae (see the electronic supplementary material, appendix S1 and figure S1). The analyses of selection using the ASPM tree produced results nearly identical to those obtained using the well-accepted phylogeny; hence, only the latter analysis is reported.

A combination of branch, site and clade models was used to analyse the datasets including all mammals or only cetaceans. In the case of the branch model, the ‘free-ratios’ model (M1), which assumes an independent ω ratio for each branch, was compared with the ‘one-ratio’ model (M0), which assumes the same ω ratio for all branches [32]. Subsequently, to identify sites under positive selection in the ASPM gene, site models in which ω can vary among sites were implemented. Specifically, two pairs of site models were tested: M1a (nearly neutral: ω0 < 1, ω1 = 1) versus M2a (positive selection: ω0 < 1, ω1 = 1, ω2 > 1) and M8a (nearly neutral; beta distribution: 0 < ω0 < 1 and ω1 = 1) versus M8 (positive selection; beta distribution: 0 < ω0 < 1 and ω1 > 1) [33] (details on models M1a and M2a in [34]). Finally, to detect divergent selection acting on groups of related key taxa, a model pair (clade model C versus M1a) was tested in each dataset. Clade model C assumes three site classes and two clades, i.e. the focal clade and the background clade. Site classes 0 and 1 represent purifying selection (0 < ω0 < 1) and neutral evolution (ω1 = 1), respectively, whereas in site class 2, branches in the two clades evolve with ω2 and ω3 (ω2 ≠ ω3), respectively [34]. Because clade model C does not perform well without an outgroup [28], hyrax and cow + hippopotamus were used as outgroups for the ‘all mammals’ and ‘all cetacean’ datasets, respectively. In the latter case, clade models were run separately for odontocetes (branch A), delphinoids (branch B), delphinids (branch C) and mysticetes (branch D), whereas, in the former case, clade models were run for all cetaceans (branch I), primates (branch II), Whippomorpha (whale + hippo, branch III), carnivores (branch IV) and eulipotyphlans (branch V).
In addition, the selection pattern on the cetacean clade was estimated, considering all mammals with the exception of primates in order to exclude the effect of primates.
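As a rough illustration of how the model comparisons above map onto CODEML settings, the sketch below writes the main control-file options for each model. The option names (model, NSsites, CodonFreq, fix_omega, omega, cleandata) are standard CODEML control variables, but the file names, dictionary keys and the helper function are hypothetical, and the actual control files used in the study are not given in the text.

```python
# Settings shared by all runs (assumed here): codon data, F3x4 codon
# frequencies, alignment gaps retained, omega estimated from the data.
COMMON = {"seqtype": 1, "CodonFreq": 2, "cleandata": 0,
          "fix_omega": 0, "omega": 1.0}

# Model-specific options. Clade model C additionally needs the focal clade
# to be labelled in the tree file, which is not shown here.
MODELS = {
    "one_ratio_M0": {"model": 0, "NSsites": 0},
    "free_ratios": {"model": 1, "NSsites": 0},
    "M1a": {"model": 0, "NSsites": 1},
    "M2a": {"model": 0, "NSsites": 2},
    "M8": {"model": 0, "NSsites": 8},
    # M8a is M8 with the extra site class fixed at omega = 1.
    "M8a": {"model": 0, "NSsites": 8, "fix_omega": 1, "omega": 1.0},
    "clade_model_C": {"model": 3, "NSsites": 2},
}


def render_ctl(name, seqfile="aspm.phy", treefile="species.tre"):
    """Return control-file text for one model (file names are placeholders)."""
    settings = {"seqfile": seqfile, "treefile": treefile,
                "outfile": f"{name}.out", **COMMON, **MODELS[name]}
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


ctl_m8a = render_ctl("M8a")
ctl_cmc = render_ctl("clade_model_C")
```

Each nested pair (for example M8a versus M8) differs only in the options that free or fix the extra ω parameter, which is what makes the likelihood ratio test applicable.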

The significance of differences between two nested models was evaluated using likelihood ratio tests (LRTs): twice the difference in log-likelihood (2ΔL) between the models was compared with a chi-square distribution, with degrees of freedom equal to the difference in the number of free parameters between the models. In all PAML-based analyses, the models accounted for the transition/transversion rate ratio and codon usage bias (F3 × 4). Different starting ω values were also used to avoid local optima on the likelihood surface [35]. Because selection analyses including alignment gaps (setting clean data = 0) gave essentially the same results as analyses with alignment gaps removed (setting clean data = 1; see the electronic supplementary material, table S2), only the former analysis is presented here.
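The likelihood ratio test described above can be sketched in a few lines. The log-likelihood values below are invented for illustration and are not taken from table 1; the function names are hypothetical, and the chi-square tail probability is computed with the standard library only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (integer df >= 1)."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    z = x / 2.0
    # Closed-form starting values for df = 1 and df = 2.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for two nested models."""
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods for a null and an alternative model
# that differ by two free parameters.
stat, p_value = likelihood_ratio_test(-12345.67, -12342.67, df=2)
```

With these made-up values the statistic is 6.0 and the p-value is just below 0.05, so the alternative model would be preferred at the 5% level.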

To evaluate the probabilities of positively selected sites on ASPM for the cetacean species examined, we first used the Bayes empirical Bayes (BEB) analysis implemented in the CODEML program of the PAML v. 4.4 package [28] to calculate posterior probabilities of positively selected sites. Sites with a posterior probability > 0.8 were considered candidates for selection. We also used the fixed effects likelihood (FEL) method implemented in the DATAMONKEY web server [36] to infer positively selected sites with the default settings and a significance level of 0.2. DATAMONKEY has the advantage that it can improve the estimation of the dN/dS ratio by incorporating variation in the rate of synonymous substitution [36]. Finally, only positively selected sites detected by both the M8 and FEL methods were used to assess conservative or radical changes between ancestral and present ASPM sequences. Ancestral ASPM sequences were inferred using empirical Bayesian methods implemented in the CODEML program of the PAML v. 4.4 package [28]. Non-synonymous substitutions were classified as conservative or radical according to charge, polarity and volume.
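The filtering step described above, keeping only sites supported by both methods, amounts to intersecting two thresholded site lists. The sketch below uses invented site numbers and scores, and assumes the FEL input has already been restricted to sites where dN exceeds dS; the function name is hypothetical.

```python
def candidate_sites(beb_posteriors, fel_pvalues, pp_cutoff=0.8, p_cutoff=0.2):
    """Return codon positions supported by both BEB and FEL.

    beb_posteriors: {site: posterior probability of positive selection}
    fel_pvalues:    {site: FEL p-value}, assumed to cover only dN > dS sites
    """
    beb_sites = {s for s, pp in beb_posteriors.items() if pp > pp_cutoff}
    fel_sites = {s for s, p in fel_pvalues.items() if p < p_cutoff}
    return sorted(beb_sites & fel_sites)


# Made-up example values, not the sites reported in table 2.
beb = {12: 0.95, 45: 0.81, 101: 0.62, 230: 0.99}
fel = {12: 0.04, 45: 0.35, 230: 0.15, 512: 0.01}
shared = candidate_sites(beb, fel)  # [12, 230]
```

Requiring agreement between two methods with different assumptions is a conservative choice: it trades some sensitivity for fewer false positives.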

3. Results

(a) Characterization of the cetacean ASPM gene

All ASPM gene sequences from the cetacean species were intact, without premature stop codons or frame shift mutations, thus suggesting the presence of a functional ASPM protein. The sequences from the two exons of the ASPM gene (exon 3, with 1494 bp and a predicted amino acid sequence of 498 amino acids (aa), and exon 18, with 4710 bp and 1570 aa) were examined in all 14 cetacean species. In total, 6204 bp (2068 aa) were sequenced, representing 59.48 per cent of the coding region according to the human ASPM gene (3477 aa). An alignment of 2068 aa revealed a total of 394 variable sites (19.05%) and two separate amino acid deletions in the minke whale (Balaenoptera acutorostrata) and Omura's whale (B. omurai; see the electronic supplementary material, figure S2). In addition, the alignment of the dataset including the 14 newly sequenced cetaceans and the published sequences of 17 additional mammal species revealed that protein translations ranged from 1896 aa (in the horse) to 2081 aa (in the macaque).
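The lengths and percentages quoted above follow directly from the exon sizes, as this short check shows. The variable names are chosen here for illustration; all numbers come from the paragraph above.

```python
exon3_bp, exon18_bp = 1494, 4710
human_aspm_aa = 3477
variable_sites = 394

total_bp = exon3_bp + exon18_bp                         # 6204 bp
exon3_aa, exon18_aa = exon3_bp // 3, exon18_bp // 3     # 498 and 1570 aa
total_aa = total_bp // 3                                # 2068 aa

# Share of the human coding region covered by the two exons.
coverage_pct = round(100 * total_aa / human_aspm_aa, 2)   # 59.48
# Share of aligned amino acid positions that vary among cetaceans.
variable_pct = round(100 * variable_sites / total_aa, 2)  # 19.05
```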

(b) Signatures of positive selection

The site model analyses of all mammals (dataset I: 31 species) showed that models that incorporate selection (i.e. M2a and M8) fitted significantly better than neutral models (i.e. M1a and M8a; table 1). With model M8, the most stringent model implemented in PAML, a small proportion of codons (4.98% or 103 codons) was estimated to be under selection, with an ω value of 1.906. Of those sites under positive selection, 16 and 14 were identified by the BEB approach with posterior probabilities above 0.8 and 0.9, respectively. When only the cetaceans (dataset II: 14 species) were considered, the associated LRT reached statistical significance (M8a versus M8: p = 0.050) or was very close to statistical significance (M2a versus M1a: p = 0.052) in the site model analysis, with an estimated ω value of 5.780–5.983 at 0.87–2.47% of sites at this locus (table 1). For M8, 10 codons were identified by the BEB approach with posterior probabilities above 0.8, whereas 17 codons were identified as under selection by the FEL method. Of these putative positively selected sites, nine were identified by both methods. These nine codons were therefore further examined to determine whether the changes were radical or conservative (table 2). Seven of these (77.78%) were identified as having undergone radical changes: six in the suborder Odontoceti (including three in the superfamily Delphinoidea) and one in the suborder Mysticeti.

Table 1.

CODEML analyses of selective pattern on the ASPM gene in mammals (including alignment gaps: clean data = 0).

Table 2.

Candidate amino acid sites under positive selection identified in cetaceans using different methods.

To test whether the evidence for positive selection was restricted to some specific lineages, several models were compared (table 1). In dataset I (all mammals), the free-ratio model was significantly better than the one-ratio model (p < 0.001, table 1), suggesting heterogeneous selective pressures on different lineages. Interestingly, infinite ω (dS = 0 and dN > 0) or ω values greater than one were restricted to those branches characterized by enlarged brain size, both for cetaceans and primates. In the former case, there were four branches: the ancestral branch of delphinids (T. truncatus, Stenella attenuata, Stenella coeruleoalba, Delphinus capensis), the ancestral branch of S. coeruleoalba + Delphinus capensis, the ancestral branch of Phocoenidae (Neophocaena asiaeorientalis) + Monodontidae (Delphinapterus leucas) and the branch leading to B. acutorostrata. In the case of primates, there were five branches: the ancestral branch of New World monkeys, Old World monkeys and Great Apes, and the branches leading to Homo sapiens, Pan troglodytes, Macaca mulatta and Macaca fascicularis (figure 1).

Figure 1.

ω values for distinct evolutionary lineages of cetaceans and other mammals, with a phylogenetic tree derived from Zhou et al. [29,30] and Perelman et al. [31]. The ω values for individual branches according to the free-ratio model are shown. Branches with ω > 1 are shown in red. The ω values of branches I–V and A–D (marked with different colours) were estimated according to the two-ratio models and clade models (with detailed results listed in table 1).

Clade models separately undertaken for cetaceans, primates, Whippomorpha, carnivores and eulipotyphlans revealed evidence of significant divergent selection. Of these clades, the ω value of the focal clade was greater than one only for cetaceans and primates, indicating the action of positive selection on both clades (table 1). When all mammal ASPM sequences (31 sequences and one outgroup) were considered, the ω value of the cetacean focal clade was lower than that of the background. However, when primates were excluded (21 sequences and one outgroup), the reverse was observed (table 1). When only the cetacean dataset (14 sequences and two outgroups) was considered, the ω values of the focal clade were higher than those of the background clade for odontocetes (2.018 versus 1.038), delphinoids (10.100 versus 0.478) and delphinids (30.959 versus 0.342; table 1).

4. Discussion

(a) Molecular evolution of the ASPM gene and brain size enlargement in cetaceans

Two events involving the enlargement of relative brain size in cetaceans were identified from anatomical research and computed tomography: one at the early origin of Odontoceti, and the second during the evolution of Delphinoidea [16]. Interestingly, the analyses of molecular evolution of the ASPM gene conducted in the present study showed evidence of positive selection matching these events. Clade models showed evidence of positive selection acting on odontocetes and delphinoids (table 1). Additionally, nine amino acid sites under positive selection were identified by the two ML approaches (M8 and FEL; table 2). Of these, seven were potentially radical changes associated with charge, polarity or volume of the amino acid. Most of these radically changed sites were found in suborder Odontoceti and/or Delphinoidea, whereas only one site occurred in suborder Mysticeti (table 2). The much higher rate of radical (as opposed to conservative) non-synonymous substitution identified in cetaceans (exact binomial test: p = 0.089) may be taken as evidence of positive selection. More importantly, such radical amino acid changes at the ASPM gene occurred in toothed whales and some Delphinoidea species, which is further evidence for the association of adaptive ASPM evolution with cetacean brain size enlargement. Although only two exons were investigated, the detection of selection in the present study relied mainly on PAML, which allows positive selection to be detected at specific codons as opposed to whole genes [37]. In such a case, ascertainment bias may not be a problem because the selection signal is independent of the fragment length examined. In fact, seven other conserved exons (5–11) of the ASPM gene were also obtained, and the additional exons did not have any effect on the pattern of selection (data not shown).
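The exact binomial test mentioned above can be reproduced approximately under stated assumptions. The text does not give the null proportion or the sidedness of the test; the sketch below assumes a one-sided test of seven radical changes out of nine sites against an equal chance of radical and conservative change (null probability 0.5). The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_null=0.5):
    """One-sided exact binomial p-value: P(X >= successes) under p_null."""
    return sum(comb(trials, i) * p_null ** i * (1 - p_null) ** (trials - i)
               for i in range(successes, trials + 1))


# Seven radical changes among the nine sites identified by both methods.
p_radical = binomial_upper_tail(7, 9)  # 46/512 = 0.08984375
```

Under these assumptions the result is about 0.090, close to the reported p = 0.089, which suggests (but does not confirm) that the published test was set up in a similar way.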

Although the large and well-developed cetacean brain has been hypothesized to be a direct product of adaptation to a fully aquatic lifestyle, the present analyses do not support this hypothesis. There was no evidence of positive selection acting on the ASPM gene in the last common ancestor of whales + hippo (Whippomorpha, branch III, two-ratio branch model: ω = 0.292) or of Cetacea (i.e. toothed whales + baleen whales, branch I, two-ratio branch model: ω = 0.548), which represent, respectively, the transition of the ancestral terrestrial cetaceans to a semi-aquatic habitat and the early stage of adaptation to a fully aquatic life. This is congruent with evidence from anatomy and computed tomography that the first enlargement of brain size did not occur immediately after the ancestral terrestrial artiodactyls entered the aquatic environment or during the early adaptation of cetaceans to a fully aquatic habitat [16]. Here, further evidence for this was provided by ω > 1 in the toothed whale clade (clade model C: ω = 2.018 in table 1), which suggested positive selection matching the first brain size expansion event in the evolutionary history of cetaceans.

The hypothesis that the enlarged brain size of cetaceans was primarily a response to social forces and cognitive demands found, conversely, more support in the present analyses. Indeed, the events of relative brain size enlargement that occurred in the evolution of cetaceans coincide with the two rapid radiation periods associated with the origin of Odontoceti and the diversification of Delphinoidea [38]. Relatively larger brain sizes during these periods may have been favoured by enabling a higher degree of behavioural flexibility, hence the ability to cope with novel environmental conditions and larger and more complex social groups [21,39,40]. In addition, the larger brain sizes during the origin of Odontoceti might also be related to the emergence and development of echolocation to process high-frequency acoustic information, which is associated with a greatly enlarged high-frequency auditory system in toothed whales [41]. It is worth noting, however, that echolocation is likely not to be a primary or sole factor affecting brain size evolution, as bats also have the echolocation ability yet do not show exceptionally large brains.

Additionally, the positive selection at the ASPM gene coinciding with brain size enlargements in cetaceans contrasts with a recent study that analysed another primary microcephaly gene (i.e. MCPH1) in cetaceans but did not find a significant association between high relative brain size and the level of positive selection [23]. This suggests that the ASPM gene may have played a more important role than MCPH1 in the evolution of the large brain in cetaceans. However, considering the lack of direct data on the biochemical function of ASPM in cetaceans, further research on the functional properties of this gene in cetaceans is necessary to determine its role in brain size enlargement.

(b) Similar selective pressures on the ASPM gene in cetaceans and primates: convergent molecular evolution?

Although cetaceans and primates are highly diverged and differ in habitat and neuroanatomical organization, they show striking convergence in social behaviour and self-recognition ability [42]. Consequently, both have significantly enlarged brains [3]. Marino [42] proposed that cetaceans and primates have apparently undergone similar pressures for increased brain mass in their evolutionary history. However, this speculation has not yet been experimentally established at the molecular level. On the basis of molecular evolutionary analyses of the ASPM gene from cetaceans and related terrestrial mammals, the present study provided two lines of evidence to support the notion of similar selective pressures acting on the ASPM gene in primates and cetaceans. First, according to the results from the free-ratio model, ω values greater than one were restricted to some branches within cetaceans and primates (figure 1). Second, the clade models also suggested that positive selection was restricted to these two clades (table 1).

Thus, convergent evolution might underlie the observation of similar selective pressures acting on the ASPM gene in cetaceans and primates, two mammalian groups that are distantly related to each other but both show brain size enlargement. However, in contrast to the strong evidence of molecular convergence found in earlier studies, in which the motor protein Prestin of echolocating dolphins formed a well-supported group with that of echolocating bats at the amino acid level [43,44], the phylogenetic reconstruction of the ASPM gene conducted in the present study did not group cetaceans with primates at the nucleotide or amino acid level (see the electronic supplementary material, figure S1). Moreover, site-specific convergent ASPM evolution was not detected in the two groups at the amino acid level. This is not unreasonable, considering that the evolution of brain size is most likely controlled by multiple genes; thus only a relatively weak convergent effect might have occurred in an individual gene such as ASPM. Further studies using more MCPH loci are necessary to determine whether and how brain size evolution in cetaceans and primates converged at the molecular level or whether different mutations could have similar functional effects.


We thank Mr Xinrong Xu for collecting the samples for several years; Tong Shen, Zhuo Chen and Wenhua Yu for helpful discussion; and Qi Wu for valuable comments on an early version of the manuscript. Financial support was provided by the National Natural Science Foundation of China (NSFC) to G.Y. (grant no. 30830016) and S.X.X. (grant no. 31000953), the Priority Academic Programme Development of Jiangsu Higher Education Institutions to G.Y. and S.X.X., Specialized Research Fund for the Doctoral Programme of Higher Education (grant no. 20103207120010), the Ministry of Education of China to S.X.X. and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (grant no. 10KJB180002).

  • Received July 25, 2012.
  • Accepted August 21, 2012.

