Schizophrenia poses an evolutionary-genetic paradox because it exhibits strongly negative fitness effects and high heritability, yet it persists at a prevalence of approximately 1% across all human cultures. Recent theory has proposed a resolution: that genetic liability to schizophrenia has evolved as a secondary consequence of selection for human cognitive traits. This hypothesis predicts that genes increasing the risk of this disorder have been subject to positive selection in the evolutionary history of humans and other primates. We evaluated this prediction using tests for recent selective sweeps in human populations and maximum-likelihood tests for selection during primate evolution. Significant evidence for positive selection was found using one or both methods for 28 of 76 genes demonstrated to mediate liability to schizophrenia, including DISC1, DTNBP1 and NRG1, which exhibit especially strong and well-replicated functional and genetic links to this disorder. Strong evidence of non-neutral, accelerated evolution was found for DISC1, particularly for exon 2, the only coding region within the schizophrenia-associated haplotype. Additionally, genes associated with schizophrenia exhibited a statistically significant enrichment in their signals of positive selection in HapMap and PAML analyses of evolution along the human lineage, when compared with a control set of genes involved in neuronal activities. The selective forces underlying adaptive evolution of these genes remain largely unknown, but these findings provide convergent evidence consistent with the hypothesis that schizophrenia represents, in part, a maladaptive by-product of adaptive changes during human evolution.
1. Introduction
Schizophrenia is characterized by delusions and auditory hallucinations, loss of coherence and logic in thought and language, and emotionality inappropriate to social context (Crow 1997; Horrobin 1998; Tamminga & Holcomb 2005). This disorder has a polygenic basis involving multiple genes of small effect (Riley & Kendler 2006), and its phenotypic effects represent one end of a continuum that grades into schizotypal cognition and then into normality (Rossi & Daneluzzo 2002; Mata et al. 2003).
Schizophrenia has also been associated with creativity throughout recorded history (Claridge et al. 1990; Nettle 2001). Diverse evidence from psychological experiments, biographical survey studies and neurophysiology has been accumulating for many years in support of the hypothesis that schizotypal cognition is associated with creativity and divergent thinking (see reviews in Claridge et al. 1990; Nettle 2001, 2006a; Barrantes-Vidal 2004). An untested prediction of this hypothesis is that schizophrenia has evolved, and is maintained, in part as a maladaptive by-product of recent positive selection and adaptive evolution in humans (Crow 1997; Horrobin 1998).
To assess the hypothesis that genes associated with schizophrenia have been subject to positive selection, we analysed the molecular evolution of 76 genes, which have been inferred to increase schizophrenia risk. We used two complementary approaches. First, we used linkage disequilibrium-based methods to test for signatures of recent selective sweeps, which involve relatively large coding and non-coding genomic regions (extended haplotypes) inferred to have recently and rapidly increased in frequency in human ancestry, such that they have not yet been broken up by recombination (Biswas & Akey 2006; Sabeti et al. 2006; Voight et al. 2006). Second, we employed phylogeny-based maximum-likelihood methods (Yang 1997; Zhang et al. 2005) for detecting positive selection, in humans and related primates, which are based on the ratio of non-synonymous to synonymous substitution rates. We also review the literature from previous studies on the molecular evolution of schizophrenia-related genes along the human lineage.
2. Material and methods
(a) Criteria for inclusion of genes
Schizophrenia risk is mediated by many genes, each of small effect. The unambiguous identification of such genes has been difficult, because detecting small effects requires very large samples, different combinations of alleles can mediate liability, risk alleles often vary between populations and schizophrenia exhibits considerable clinical heterogeneity: it is not a simple, unitary disorder (Tamminga & Holcomb 2005; Riley & Kendler 2006). Our criterion for inclusion of schizophrenia-linked genes was genetic association with the disorder via association studies (comparing allele frequencies or genotype distributions between cases and controls), or via family-based transmission–disequilibrium studies that test for differential inheritance of alleles between affected and non-affected siblings. We excluded all genes that were linked with schizophrenia in a single study and whose association was then subject to failed replication attempts. The list of 76 genes used here (electronic supplementary material) is highly congruent with that of Schmidt-Kastner et al. (2006) and was fully assembled prior to our analyses.
The strength of the evidence for linkage of these genes to the development of schizophrenia varies considerably, and it is very likely that some proportion of these linkages will turn out to be false positives given the history of failed replications in this field and the high degree of clinical, genetic, gender and ethnic heterogeneity in this disorder. We appreciate this fact and, as this is the first analysis of its kind, we have chosen to initially be relatively permissive in the inclusion of genes. However, we are careful to highlight cases where evidence of selection is observed among genes for which especially strong and highly replicated association data exist.
We also note that the presence, in our analyses, of any genes that are eventually shown not to be associated with schizophrenia will make our test of enhanced positive selection on schizophrenia-associated genes (compared with control genes associated with neuronal activities) more, rather than less, conservative. Indeed, with regard to this test, the main issue at hand is whether or not the set of genes that has been statistically linked with schizophrenia in one or more studies is more likely to actually exhibit association with schizophrenia than is a random sample of control genes. We believe that this assumption is justified.
(b) Tests for positive selection using the human HapMap
Recent positive selection can usefully be inferred by the identification of selective sweeps, whereby selection for a specific allele causes a relatively large block of surrounding DNA (an extended haplotype) to increase in frequency. This method is most effective for recent selection, as recombination breaks up such extended haplotypes over time, and it can detect selection on both coding and non-coding regions.
Voight et al. (2006) have developed tests for recent selective sweeps in human populations using data from the human haplotype map. For the three genotyped populations, one European, one African and one Asian, they tested for evidence of positive selection as indicated by the tendency of recently selected alleles to sweep a set of tightly linked sites to relatively high frequency. They also provide a web interface, Haplotter (hg-wen.uchicago.edu/selection/haplotter.htm), which we used for testing the presence and significance of selective sweeps for specific genes and sets of contiguous genes, and for localizing inferences of selection to specific gene regions and SNPs. Our criterion for positive selection in these data was a probability value of 0.05 or lower for one or more of the three populations.
To test for a statistically increased signal of positive selection in schizophrenia-associated genes, we compared the frequency of positive selection in the set of 76 schizophrenia-associated genes with the frequency of positive selection in a set of 300 control genes that were derived from the Panther gene-ontology category ‘neuronal activities’ (Mi et al. 2005).
(c) PAML analyses
Analyses of historical selection on specific lineages, using the ratio of non-synonymous to synonymous substitution rates for coding DNA, were carried out with the Codeml program in the PAML program package (Yang 1997; Zhang et al. 2005). This program implements maximum-likelihood methods to estimate the ratio of non-synonymous to synonymous substitution rates for each codon in a gene, across an aligned set of sequences from that gene in different species, in a phylogenetic framework (electronic supplementary material). For the analysis of each gene, we took the phylogenetic relationships of the specific taxa involved from recently published phylogenetic analyses of mammalian relationships. Only the topological relationship among the species involved was recorded for these trees. Initial branch-length estimates were determined using a preliminary Codeml run with the single ω model (M0).
We obtained orthologous mammalian gene sequences for each candidate human gene in the following manner. Orthology information was taken from the Homologene database (release 50.1) at NCBI (http://www.ncbi.nih.gov/) where available. Additional mammalian orthologues were then obtained using reciprocal best BlastP hits from the GenBank protein database. Corresponding nucleotide sequences were obtained and in-frame ClustalW alignments of the coding regions were conducted. Aligned nucleotide sequences were then manually converted into PAML input files.
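The reciprocal best-hit criterion described above can be illustrated with a minimal sketch. It assumes that the best BlastP hit in each direction has already been parsed into a dictionary; the function name and the sequence identifiers are hypothetical and are not taken from the study.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return pairs (a -> b) in which a's best hit is b and b's best hit is a."""
    return {a: b for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a}


# Hypothetical identifiers, for illustration only.
human_to_mouse = {"human_geneA": "mouse_gene1", "human_geneB": "mouse_gene2"}
mouse_to_human = {"mouse_gene1": "human_geneA", "mouse_gene2": "human_geneC"}

orthologues = reciprocal_best_hits(human_to_mouse, mouse_to_human)
print(orthologues)
```

Only the first pair is retained here, because the best human hit for the second mouse sequence is a different human sequence.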
We used codon and branch-specific models (Zhang et al. 2005) to investigate the action of positive selection on each of three branches in the tree for each gene: the human lineage; the human–chimpanzee lineage; and the basal-primate lineage. In some cases, the primate lineage was the same as the human–chimpanzee lineage (i.e. no other primate sequence was available). These three lineages were chosen a priori because they represent periods when evolutionary changes underlying schizophrenia are likely to have been established: (i) at the origin of distinctive primate traits including increased encephalization and complex social behaviour, (ii) at the origin of large social-group sizes and complex communication in the lineage ancestral to chimpanzees and humans, and (iii) in the evolution of humans, the lineage expressing schizophrenia. The human lineage is most salient for analyses of positive selection in schizophrenia-associated genes, but the human–chimpanzee and primate lineages may also provide useful insights in that these lineages presumably underwent relatively strong selection for aspects of social cognition.
For the analyses of positive selection, we used ‘model A’ as implemented in Codeml, which detects selection on specific codons along specified branches of a phylogenetic tree (Zhang et al. 2005). Model A allows four categories of selection on codon sites in a sequence: two categories (ω1 and ω0) with uniform selection across all branches (ω1=1 and 0<ω0<1; estimated from the data) and two categories for which selection pressure differs in one or a few pre-specified ‘foreground’ branches where selection is assumed to have changed. The method finds the model with the maximum likelihood, given the data. If the maximum-likelihood model includes a category of sites with an ω>1, then this provides evidence that positive selection has acted on those sites along the specific lineage analysed. To test the significance of the results, the log likelihood of the maximum-likelihood model can be compared to similar models lacking selection using a log-likelihood ratio test (LRT). We compared the maximum-likelihood model to itself with ω constrained to equal one (i.e. a category of sites under positive selection was not allowed; MAofix test in tables 2 and 3). The asymptotic null distribution for the test is a 50 : 50 mixture of a point mass at 0 and χ₁² (Zhang et al. 2005). Simulations have shown that this test is ‘slightly too liberal’, so we also note the results for the χ²-test with 1 d.f., which was demonstrated to be ‘overly conservative’ (Zhang et al. 2005). Our statistical tests using PAML are expected to be prone to false-negative (rather than false-positive) results, given the relatively small number of species that were available for analyses (mean of 7, range 5–15; see electronic supplementary material). For genes showing significant evidence for selection using the LRTs, we reran the analyses with varying starting values for the parameters κ (transition: transversion ratio) and ω, to check that we had not converged on a local likelihood peak.
See electronic supplementary material for further explanation of PAML methods.
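As an illustration of the two significance calculations described above, the following sketch computes the LRT statistic from a pair of log likelihoods and evaluates it against both the 50 : 50 mixture null and the χ² distribution with 1 d.f. The function names and the log-likelihood values are hypothetical; they are not output from Codeml or values from this study.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0


def branch_site_lrt(lnl_alt, lnl_null):
    """LRT of model A against model A with the foreground omega fixed at 1."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_chi2 = chi2_sf_1df(stat)
    # Under the 50:50 mixture null, half of the probability mass sits at 0,
    # so the tail probability for any positive statistic is halved.
    p_mixture = 0.5 * p_chi2 if stat > 0 else 1.0
    return {"statistic": stat, "p_mixture": p_mixture, "p_chi2_1df": p_chi2}


# Hypothetical log likelihoods, for illustration only.
result = branch_site_lrt(lnl_alt=-2345.10, lnl_null=-2347.05)
print(result)
```

Because the mixture p-value is always half the χ² p-value for a positive statistic, a gene can be significant under the first test and marginally non-significant under the second.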
To test for a statistically increased signal of positive selection in schizophrenia-associated genes, we compared the frequency of positive selection in these 76 genes with the frequency of positive selection in 120 control genes, randomly chosen from the full set of 300 control genes noted above. For these genes, the taxa analysed included human, chimpanzee, mouse, rat, macaque, orangutan, dog and cow. To test for any influence of gene length on the statistical power of the PAML analysis, we used the non-parametric Kolmogorov–Smirnov test to compare the distribution of gene lengths between the schizophrenia-associated gene and control gene datasets (following the removal of outliers using Tukey's outlier filter) and found no significant difference.
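The gene-length check can be sketched as follows, assuming the conventional Tukey fences at 1.5 times the interquartile range and the asymptotic Kolmogorov distribution for the p-value. The fence multiplier, the function names and the gene lengths are assumptions for illustration; the text does not state the exact settings used.

```python
import math
from bisect import bisect_right


def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def tukey_filter(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    s = sorted(values)
    q1, q3 = quantile(s, 0.25), quantile(s, 0.75)
    iqr = q3 - q1
    return [v for v in values if q1 - k * iqr <= v <= q3 + k * iqr]


def kolmogorov_sf(lam):
    """Asymptotic upper-tail probability of the Kolmogorov distribution."""
    if lam < 0.3:  # the series converges slowly here; the tail is ~1
        return 1.0
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
                for j in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def ks_two_sample(a, b):
    """Two-sample KS statistic D and its asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    d = max(abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
            for x in a + b)
    n_eff = len(a) * len(b) / (len(a) + len(b))
    return d, kolmogorov_sf(math.sqrt(n_eff) * d)


# Hypothetical coding-sequence lengths (bp), for illustration only.
scz_lengths = [1200, 1500, 1800, 2100, 2400, 2700, 3000, 25000]
ctl_lengths = [1100, 1400, 1900, 2000, 2500, 2600, 3100, 3300]
d, p = ks_two_sample(tukey_filter(scz_lengths), tukey_filter(ctl_lengths))
print(d, p)
```

The asymptotic p-value is only approximate for samples this small; it is used here to keep the sketch free of external dependencies.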
3. Results
(a) HapMap analyses of positive selection in recent human evolution
HapMap-based analyses provided evidence of recent positive selection on 14 of the genes associated with schizophrenia (table 1 and electronic supplementary material). For six of these genes, the signal of selection was localized specifically to the focal gene, and for three genes, only one adjacent gene also showed significance (p<0.05). Of particular interest is a strong and localized focus of selection for DTNBP1, which exhibits one of the strongest genetic and functional associations with schizophrenia of any gene analysed to date (Riley & Kendler 2006).
Of the 14 genes inferred as selected from the HapMap data, four displayed significant evidence for recent selection in more than one human population, while the rest were found to be significant in only one population. This is suggestive of extremely recent selection acting on these genes since the time of human demographic expansion out of Africa, and it is also generally consistent with the presence of variation among human populations in the genetic basis of schizophrenia (e.g. Bulayeva et al. 2007).
The proportion of schizophrenia genes inferred as selected in Haplotter (18.4%, 14 of 76) was significantly higher than the proportion of neuronal activities control genes inferred as selected (9.0%, 27 of 300, χ²=5.52, p=0.019). Similar results were obtained when we restricted the analyses to the subset of schizophrenia genes (N=44) for which differences in gene expression levels or gene-associated neuroanatomy have been documented between schizophrenics and controls (electronic supplementary material): 9 (20%) of 44 positively selected compared with 27 (9%) of 300 for neuronal activities controls (χ²=5.37, p=0.02).
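The comparison of proportions reported above can be checked with a short sketch. It assumes a Pearson χ² test on a 2×2 table without continuity correction, which is our assumption rather than a stated setting; under it, the first statistic comes out close to, but not exactly at, the reported value, so slightly different options may have been used.

```python
import math


def chi2_2x2(a_hit, a_total, b_hit, b_total):
    """Pearson chi-square (no continuity correction) comparing two proportions."""
    table = [[a_hit, a_total - a_hit], [b_hit, b_total - b_hit]]
    n = a_total + b_total
    rows = [a_total, b_total]
    cols = [a_hit + b_hit, n - a_hit - b_hit]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 d.f.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


# Counts taken from the text: selected genes / genes tested.
stat_all, p_all = chi2_2x2(14, 76, 27, 300)
stat_sub, p_sub = chi2_2x2(9, 44, 27, 300)
print(round(stat_all, 2), round(p_all, 3))
print(round(stat_sub, 2), round(p_sub, 3))
```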
These statistical comparisons with control genes indicate that schizophrenia-associated genes may show an enriched signal of positive selection in recent human evolution, and that the evidence of selection is often heterogeneous across human populations. Additionally, 12 genes, AHI1, CHL1, GRIA4, GRIN2A, GRIN2B, NOS-1, PDLIM5, PLA2G4A, RELN, SYN2, TNXB and TPH1, exhibited non-significance in gene-wide analyses using Haplotter, but had one or more single nucleotide polymorphisms that were inferred to have been subject to positive selection.
(b) PAML analyses of positive selection in primate evolution
PAML analyses yielded significant evidence of positive selection on four genes on the human lineage (table 2 and electronic supplementary material). Of particular interest is the identification of NRG1 (neuregulin 1), which shows notably strong genetic association with schizophrenia (Hall et al. 2006; Riley & Kendler 2006; Thomson et al. 2007). This gene shows an absence of evidence for selective sweeps in recent human evolution (Gardner et al. 2007), which suggests that the positive selection that we have inferred may have taken place earlier in the human lineage, such that selected haplotypes have been broken up by recombination. An additional seven genes displayed evidence of positive selection either in the human–chimpanzee lineage, in the ancestral primate lineage leading to humans, or in both lineages (table 3).
Overall, 11 (15%) of the 76 schizophrenia-associated genes showed evidence of positive selection in the PAML analyses (tables 2 and 3), compared with 10 (8.3%) of the 120 control genes analysed (χ²=1.83, p=0.176). For the human lineage, a higher proportion of schizophrenia-associated genes, 4 (5.3%) of 76, showed evidence of positive selection than did the control genes (none of 120; Fisher's exact test, p=0.022). However, there was no such difference for the human–chimpanzee lineage (with three positively selected control genes versus three positively selected schizophrenia-associated genes; Fisher's exact test, p>0.50), or for the primate-origin lineage (seven positively selected control genes versus five positively selected schizophrenia-associated genes; Fisher's exact test, p>0.50). If the overly conservative χ²-test with 1 d.f. (Zhang et al. 2005) is used instead, then one of the schizophrenia-associated genes, TP53, shows marginally non-significant evidence of positive selection in the human lineage, leaving three genes that show positive selection in this lineage (Fisher's exact test, p=0.057).
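The lineage-specific comparisons above can be reproduced from the reported counts with a short sketch of a two-sided Fisher's exact test, computed directly from the hypergeometric distribution. The function name is ours, and the two-sided rule used (summing all tables no more probable than the observed one) is an assumption about how the test was run.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: schizophrenia-associated genes (n=76) and control genes (n=120);
# columns: selected, not selected. Counts are taken from the text.
p_human = fisher_exact_two_sided(4, 72, 0, 120)
p_human_conservative = fisher_exact_two_sided(3, 73, 0, 120)
p_human_chimp = fisher_exact_two_sided(3, 73, 3, 117)
p_primate = fisher_exact_two_sided(5, 71, 7, 113)
print(round(p_human, 3), round(p_human_conservative, 3))  # 0.022 0.057
```

The first two values match those reported for the human lineage, and the remaining two exceed 0.50, as stated.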
(c) Positive selection of DISC1
DISC1 (disrupted in schizophrenia 1), a gene with strong, well-replicated genetic ties to this disorder (Mackie et al. 2007), was inferred as positively selected in both the human–chimpanzee and primate-origin lineages. Remarkably, 16 amino acid sites in this gene were inferred to have evolved under positive selection, of which six are within the HEP3 haplotype that segregates with schizophrenia (figure 1; Porteous et al. 2006). To further investigate the DISC1 coding region in the HEP3 haplotype, we conducted a sliding-window analysis and found a localized peak of accelerated evolution in this region (figure 1). DISC1 thus evolved extremely rapidly during primate evolution, with significant evidence for acceleration in the only coding region of the gene showing reproducible evidence of a genetic association with schizophrenia.
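The general idea of a sliding-window analysis can be sketched as below. This is a generic illustration only: the per-site values, window size and step are hypothetical, and the text does not specify the statistic or window settings used for DISC1.

```python
def sliding_window_means(values, window=30, step=10):
    """Return (start_index, mean) for successive windows along a sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [(start, sum(values[start:start + window]) / window)
            for start in range(0, len(values) - window + 1, step)]


# Hypothetical per-site rate estimates with an elevated central region.
per_site = [0.0] * 10 + [1.0] * 10 + [0.0] * 10
windows = sliding_window_means(per_site, window=10, step=5)
peak_start, peak_mean = max(windows, key=lambda w: w[1])
print(windows)
print(peak_start, peak_mean)
```

A localized peak appears as a run of windows whose mean is well above that of the flanking windows.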
4. Discussion
Our results provide evidence for a significantly stronger signal of positive selection specific to the human lineage for schizophrenia-associated genes than for a control set of neuronal activities genes, for both the HapMap-based analyses of recent selective sweeps and the PAML analyses of positive selection. Furthermore, significant evidence of positive selection was found for several genes that exhibit some of the best-replicated and best-understood genetic and functional associations with this disorder, including DISC1, DTNBP1 and NRG1.
The primary, non-exclusive implications of these results are that schizophrenia may represent, in part, a maladaptive by-product of adaptive changes in human and primate evolution, and that some alleles increasing liability to the cognitive impairments associated with schizophrenia may have been selected against. These hypotheses can be evaluated by testing the same sets of polymorphisms for association with schizophrenia and for signatures of positive selection or other evolutionary-genetic processes (Mutsuddi et al. 2006). Such integrated evolutionary–genomic analyses will also provide a novel, powerful perspective into the functions of genes, alleles and neurological substrates that underlie the aetiology of schizophrenia and schizotypal cognition, since positively selected variants are expected to exhibit functional significance. Genetic and phenotypic changes along the human lineage are most directly salient to the evolutionary basis of schizophrenia, but changes earlier in primate evolution, as apparently exemplified by DISC1, may have generated some of the basic social-cognitive substrates that are dysregulated in this disorder.
Additional convergent evidence that selection has impacted the evolutionary underpinnings of schizophrenia comes from two sources. First, neurological studies have shown that brain areas differentially dysregulated in schizophrenia include the regions most notably subject to differential evolutionary change along the human lineage (Randall 1998; Brüne 2004; Burns 2006). Second, Wayland et al. (2006) found that genes exhibiting positive selection for differential expression between humans and chimpanzees are differentially dysregulated in dorsolateral prefrontal and orbitofrontal cortices of individuals with schizophrenia. These findings link recent evolutionary changes in human neuroanatomy and gene expression with alterations of these phenotypes in schizophrenia and demonstrate that the phenotypic substrates of this disorder, as well as its genetic basis, show evidence of effects from positive selection and adaptive evolution.
(a) Evidence for positive selection from previous studies
Our maximum-likelihood protein-coding and haplotype-based analyses probe only for a subset of the possible signatures of selection on genes increasing liability to schizophrenia. Moreover, these methods detect signals over different time scales, and for different types of positively selected variants, such that the general absence of concordance of genes identified as selected between the two types of method (HapMap and PAML) is not unexpected. For example, the great majority of positively selected alleles in humans and other mammals are non-coding (Williamson et al. 2007; Resch et al. 2007), such that recent selective sweeps are expected to involve mainly non-coding regions that will be detected only by analyses based on HapMap data.
Given that selection involving only one or few amino acid sites, selective sweeps older than HapMap data can distinguish, or balancing selection that maintains human polymorphisms, may have influenced the genetic architecture of schizophrenia, we also reviewed the literature for evidence of selection or rapid evolution of genes associated with schizophrenia risk, from previous studies. Sixteen genes that exhibit genomic and functional links with schizophrenia, ADCYAP1, AHI1, APOE, CCR5, DRD4, DTNBP1, FOXP2, GABRB2, GRM3, HOPA, LRRTM1, MAOA, MCPH1, OLIG2, PDLIM5 and SLC6A4, also show evidence of positive or balancing selection in the human lineage from previous published studies, and five genes, CHRM5, GRIK4, ENTH, HLA-DRB and NPAS3, show evidence of rapid evolution (table 4 and electronic supplementary material). Furthermore, Voight et al. (2006) noted an unusual preponderance of phosphatidylinositol 3-kinase signal-transduction pathway genes subject to recent selective sweeps, and evidence from functional analyses, physiology, pharmacology and genetic-association studies indicates that dysregulated activation of this pathway (via effects of genes such as AKT1, DTNBP1, ENTH, ERBB4, NRG1, PIK3C3 and RELN) represents an important risk factor for schizophrenia (Kalkman 2006). Finally, the schizophrenia-related dopamine receptor genes DRD2, DRD3 and DRD4 exhibit heterosis in some of their effects on human physiology and behaviour (Comings & MacMurray 2000). These findings provide additional evidence for selection and rapid evolution of genes underlying schizophrenia, which is generally concordant with the results from PAML and HapMap analyses and localizes signatures of selection to the human lineage.
(b) Potential mechanisms mediating the evolution of genes that underlie schizophrenia
Positive selection of genes implicated in schizophrenia may be mediated by an array of neurodevelopmental, neurophysiological and psychological mechanisms. Our analyses provide complementary types of evidence for positive selection of genes underlying this disorder, but do not provide any direct information concerning how selective pressures at the phenotypic level may have impacted the evolution of these genes.
Evidence from previous work salient to the hypothesis that the adaptive evolution of genes underlying schizophrenia has involved phenotypic selection comes from studies of genetic linkages between schizophrenia-risk genes and aspects of creativity. Allelic variants of three genes associated with schizophrenia, SLC6A4, TPH1 and DRD2, have recently been associated with measures of creativity and imagination in normal populations (Bachner-Melman et al. 2005; Reuter et al. 2006). Notably, the STin2.12 allele of an intronic polymorphism in the SLC6A4 gene and the A allele of the A779C polymorphism in the TPH1 gene are each associated with both increased creativity and increased risk of schizophrenia (Bachner-Melman et al. 2005; Reuter et al. 2006). Alleles of several genes associated with schizophrenia also influence measures of schizotypal cognition and ‘openness to experience’ in non-clinical populations, including COMT and HTR2A (Ott et al. 2005), MAOA (Samochowiec et al. 2004) and SLC6A4 (Golimbet et al. 2003). Descriptions of the linkages of creativity and cognition with schizotypy and schizophrenia are provided for the four genes, SLC6A4, TPH1, COMT and DRD2, with sufficient integrative information available for detailed analysis (electronic supplementary material). We also describe here the neurological and psychological evidence that links schizotypy with aspects of creativity and divergent thinking, and epidemiological data for variation in measures of fitness between schizophrenics, first-order relatives and controls.
The mechanisms connecting schizotypal cognition and creativity with postulated fitness benefits are unclear, but may include sexual selection, creative and artistic skills, or general benefits from insight problem solving (Shaner et al. 2004; Nettle 2006a; Karimi et al. 2007). Such processes could potentially help to explain the paradoxical high heritability and persistence of schizophrenia, owing to some combination of multilocus balancing selection, antagonistic pleiotropy affecting fitness and mutation–selection balance.
We stress that enhanced creativity is not expected to be a function of schizophrenia itself (which involves profound cognitive impairments), but rather is inferred to be a function of schizotypal cognition (Claridge et al. 1990; Nettle 2001, 2006a,b; Barrantes-Vidal 2004). Indeed, alleles underlying the deleterious aspects of schizophrenia or schizotypal cognition are expected to be selected against (Keller & Miller 2006; Nettle 2006b). We also note that most studies of the cognitive correlates of genetic liability to schizophrenia have focused on documenting deficits in clinical populations, rather than testing for potentially positive aspects of cognition in healthy relatives or other individuals who carry schizophrenia-associated alleles or genotypes.
(c) Limitations and implications
Interpretation of our results is limited by the observation that few of the genes associated with increased schizophrenia risk are specific to this disorder: some such genes are also known to affect liability to bipolar disorder, and most of them influence a diverse range of neurodevelopmental and physiological processes, and exert only a small effect on schizophrenia risk that may vary among populations (Kendler 2005; Riley & Kendler 2006). Bipolar disorder and schizophrenia in particular exhibit substantial overlap in cognitive symptoms as well as their genetic basis, and links of bipolar disorder with creativity and imagination have also been repeatedly described (Claridge et al. 1990; Nettle 2001).
A second limitation of our study is that there remains considerable controversy about what constitutes adequate statistical demonstration of association between allelic variants of a specific gene and schizophrenia. Taking the most conservative approach on this question would result in too few genes for any meaningful analyses, whereas including all genes that have ever been linked with schizophrenia in a single study or population would be relatively liberal, although it would not be expected to bias analytic results towards a finding of an enhanced signal of positive selection, relative to control genes. We have taken a middle ground and used genes that have not been subject to failed replication attempts. However, it is important to note that significant evidence for selection was found in multiple genes showing the strongest, best-replicated associations with schizophrenia to date, including DISC1, DTNBP1 and NRG1. We have also repeated the HapMap analyses using only the 44 genes for which expression or gene-associated neuroanatomy differences have been recorded between schizophrenics and controls, and found the same results.
An important implication of our findings is that evolutionary–genomic analyses can provide novel insights into the nature and functions of the genes that underlie the aetiologies of schizophrenia. More generally, analyses that probe for signatures of selection on genes implicated in schizophrenia may localize functional sites affecting gene expression, enzyme activity, neurocognitive phenotypes and disease risk, and thus highlight potential allelic variants salient to the cognitive changes that have driven the origin of modern humans (Crow 1997; Horrobin 1998).
We thank F. Breden and B. Voight for advice and comments and four anonymous reviewers for their insightful suggestions. This work was funded by grants from NSERC and the Canada Council for the Arts to B.C., an ECU College Research Award to K.S. and an NIH Ruth L. Kirschstein National Research Service Award to S.D.