A number of biophysical and population-genetic processes influence amino acid substitution rates. It is commonly recognized that proteins must fold into a native structure with preference over an unfolded state, and must bind to functional interacting partners favourably to function properly. What is less clear is how important folding and binding specificity are to amino acid substitution rates. A hypothesis of the importance of binding specificity in constraining sequence and functional evolution is presented. Examples include an evolutionary simulation of a population of SH2 sequences evolved by threading through the structure and binding to a native ligand, as well as SH3 domain signalling in yeast and selection for specificity in enzymatic reactions. An example in vampire bats where negative pleiotropy appears to have been adaptive is presented. Finally, considerations of compartmentalization and macromolecular crowding on negative pleiotropy are discussed.
Selective pressures on protein-encoding genes include selection on factors that lead to proper function of the protein. This involves selection of a protein to fold into a structure that will enable its function, binding and, in some cases, catalysis. It therefore also involves selection to enable inter-molecular interaction with partners to maintain fitness. Recent discussion in the molecular evolution literature has focused on robustness to translation errors as an important constraint on amino acid substitution rates . What may also be an important part of selective pressure on sequences is selective pressure on what not to bind to. This involves both selective pressures on enzymes to catalyse a reaction only on substrates where catalysis is not deleterious and on signalling proteins to interact only with other proteins where interaction is not deleterious. One line of supporting evidence for translational robustness as a hypothesis is the correlation between expression level (concentration) of the gene/protein and evolutionary rate, but this correlation is also expected when specificity of interaction is a selective constraint . As the concentration of a protein increases, so does the potential for non-specific interactions.
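The link between concentration and non-specific interaction can be illustrated with a simple 1:1 binding isotherm. The sketch below is a minimal illustration, not an analysis from this paper. It assumes the expressed protein is in large excess over each partner, so that free and total protein concentration are approximately equal. The function name and both dissociation constants are hypothetical values chosen for illustration.

```python
def fraction_bound(protein_conc_uM: float, kd_uM: float) -> float:
    """Fraction of a partner occupied by the protein under a 1:1 isotherm.

    Assumes the protein is in large excess, so free protein ~ total protein.
    """
    if protein_conc_uM < 0 or kd_uM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_conc_uM / (kd_uM + protein_conc_uM)


# Hypothetical dissociation constants (illustrative only, not measured values).
KD_NATIVE_UM = 0.1        # tight, specific partner
KD_OFF_TARGET_UM = 100.0  # weak, non-specific partner

for conc in (0.01, 0.1, 1.0, 10.0, 100.0):
    native = fraction_bound(conc, KD_NATIVE_UM)
    off_target = fraction_bound(conc, KD_OFF_TARGET_UM)
    print(f"{conc:7.2f} uM  native={native:.3f}  off-target={off_target:.3f}")
```

Under these assumptions, the native partner saturates at low concentration, while occupancy of the weak off-target partner keeps rising as expression increases. This is the sense in which highly expressed proteins are more exposed to selection against non-specific binding.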
At a biophysical level, high-affinity binding is easy to achieve. It can be accomplished through hydrophobic interactions, where the non-specific exclusion of solvent drives binding. However, protein–protein interfaces are not simple hydrophobic patches: they also include charged residues and residues capable of other specific interactions, and they evolve much more slowly than would be expected if they merely maintained unspecified hydrophobic residues.
Thus, it may be that NOT statements are important determinants of evolutionary rates, and correspondingly of selective constraints on proteins, driven by which proteins they should not bind. The consequences of this in molecular evolution and comparative genomics remain to be explored.
2. Pleiotropy and NOT statements
Pleiotropy has long been recognized as an important evolutionary constraint on genes and proteins. At a biophysical level, pleiotropy can reflect the necessity for a protein to bind to multiple interacting partners. This is easy to accomplish if there is no NOT statement, as a hydrophobic patch will work. This first type of pleiotropy might be termed positive pleiotropy; it becomes harder to satisfy in the presence of NOT statements, which are subsequently termed negative pleiotropy. Negative pleiotropy also places a constraint upon sequences. If one considers sequence space in the context of Venn diagrams (figure 1), then positive pleiotropy would reflect the intersection of sequences that favourably interact with all partners that confer fitness. As suggested for protein–protein interactions, hydrophobic-rich sequences will form a large intersecting space. However, the introduction of negative pleiotropy will severely restrict this space and force it to regions of the spaces of each individual binding sequence that are less likely to intersect, making positive pleiotropy more constraining. In figure 1b, positive pleiotropy involving A + B with C becomes almost impossible because of the negative pleiotropy against interacting with C′.
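The Venn-diagram argument can be made concrete by enumerating a very small sequence space. The sketch below is a toy model only: the four-letter reduced alphabet, the contact scores, the binding threshold and the partner patterns `PARTNER_A`, `PARTNER_B` and `PARTNER_C_PRIME` are all hypothetical, and only two wanted partners are modelled rather than the three in figure 1b. Hydrophobic residues score favourably against any partner position (non-specific solvent exclusion), whereas charged residues score only against a charged partner position.

```python
from itertools import product

ALPHABET = "H+-P"  # hydrophobic, positive, negative, polar (reduced alphabet)
THRESHOLD = 4      # minimum toy score that counts as binding (assumption)


def contact(residue: str, partner: str) -> int:
    """Toy contact score between an interface residue and a partner position."""
    if residue == "H":
        return 1  # non-specific: favourable against any partner position
    if residue in "+-" and partner in "+-":
        return 2 if residue != partner else -2  # charge match or clash
    return 0


def score(seq: str, partner: str) -> int:
    return sum(contact(r, p) for r, p in zip(seq, partner))


def binders(partner: str) -> set:
    """All interface sequences whose toy score reaches the threshold."""
    seqs = ("".join(s) for s in product(ALPHABET, repeat=len(partner)))
    return {s for s in seqs if score(s, partner) >= THRESHOLD}


# Hypothetical partner interface patterns.
PARTNER_A = "-HHP"        # wanted partner
PARTNER_B = "-PH-"        # wanted partner
PARTNER_C_PRIME = "-H+H"  # partner that must NOT be bound

positive = binders(PARTNER_A) & binders(PARTNER_B)  # positive pleiotropy
with_not = positive - binders(PARTNER_C_PRIME)      # add the NOT statement

print("bind A and B:", len(positive))
print("bind A and B, NOT C':", len(with_not), sorted(with_not))
```

In this toy model the all-hydrophobic interface `HHHH` satisfies the positive constraints but also binds the unwanted partner, so it is removed once the NOT statement is applied. The sequences that remain rely on a charged position to discriminate, which mirrors the argument that specificity is carried by residues other than a generic hydrophobic patch.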
3. The expected interplay between negative pleiotropy, protein fold and system-level constraints
Both positive and negative thresholds for interactions are determined (Boltzmann distribution) by constraints on physical parameters. Two factors play into such constraints. One constraint is the protein fold, which dictates the orientation and size of the binding interface, further constrained by binding interface interactions with the shell residues of the hydrophobic core. The composition of the binding interface under this constraint dictates the potential for interaction with different partners with different affinities based upon accessible amino acid composition. This is reflected in the Venn diagrams in figure 1. Folds differ in the number of sequences that will fold into a structure, and this is related to the size and the thermostability of the protein [4,5], where more stable proteins can explore larger parts of sequence space while maintaining a properly folded structure. Similarly, folds with larger surface areas can more readily evolve new binding interactions and are also more likely to be under selective constraint to restrict these. Larger neutral walks through sequence space have been linked to the evolvability of proteins , and these are likely to be restricted by the actions of negative pleiotropy.
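The positive and negative thresholds can be expressed through the Boltzmann relation between binding free energy and dissociation constant. The sketch below is a minimal illustration under the usual two-state, 1 M standard-state convention. The function names, the free energies and the threshold values are hypothetical and are not taken from any system discussed here.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_bind_kcal_per_mol: float, temp_k: float = 298.15) -> float:
    """Dissociation constant (M, 1 M standard state) from a binding free energy.

    Uses dG_bind = RT ln(K_d), so a more negative dG means tighter binding.
    """
    return math.exp(dg_bind_kcal_per_mol / (R_KCAL_PER_MOL_K * temp_k))


def specificity_ratio(dg_native: float, dg_off_target: float,
                      temp_k: float = 298.15) -> float:
    """K_d(off-target) / K_d(native); values above 1 favour the native partner."""
    return kd_from_dg(dg_off_target, temp_k) / kd_from_dg(dg_native, temp_k)


def satisfies_thresholds(dg_native: float, dg_off_target: float,
                         kd_max_native: float, kd_min_off_target: float,
                         temp_k: float = 298.15) -> bool:
    """True if the native partner is bound tightly enough (positive threshold)
    and the off-target partner is bound weakly enough (negative threshold)."""
    binds_native = kd_from_dg(dg_native, temp_k) <= kd_max_native
    avoids_off_target = kd_from_dg(dg_off_target, temp_k) >= kd_min_off_target
    return binds_native and avoids_off_target


# Hypothetical free energies (kcal/mol) and thresholds (M), illustration only.
print(specificity_ratio(-9.5, -6.0))
print(satisfies_thresholds(-9.5, -6.0, kd_max_native=1e-6, kd_min_off_target=1e-5))
print(satisfies_thresholds(-9.5, -8.0, kd_max_native=1e-6, kd_min_off_target=1e-5))
```

Because the ratio depends exponentially on the free-energy difference, a gap of roughly 1.4 kcal/mol near room temperature corresponds to about a tenfold difference in dissociation constant. A NOT statement therefore constrains how close the off-target free energy may come to the native one.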
The second constraint is the link between the actual physical constants (binding constants for signalling networks, binding and enzyme constants for metabolic pathways) and the overarching system-level constraints. Each link in a pathway will have a range of physical constants that give optimal flow through the pathway, where mutations that lead to deviation from this range will generally be deleterious. The size of the range of values will be very different depending upon the position in the pathway and the physical constants of other members.
Negative pleiotropy will reflect the opposite. It reflects a selective pressure to not interact in a pathway. Therefore, at the system level, this will be reflected in a threshold of physical constants that are deleterious to cross. Activating the wrong signalling cascade or metabolizing the wrong substrate can have clear deleterious effects on a cell and organism.
The evolution of pathway structure in Escherichia coli has been found to have been dominated by a process where duplicate enzymes change their specificity, while retaining a catalytic profile . This is consistent with the process by which bacteria evolve the capabilities of metabolizing anthropogenic compounds [9,10]. While the pathways are not initially particularly efficient, it is clear that this occurs with a relative evolutionary ease when there has not been active selection against binding a particular compound, as moonlighting reactions in enzymes appear to be common [11,12]. Given this, there is an evolutionary potential for evolving new pathways at high rates, but this contrasts with the relative conservation of pathways over very long evolutionary periods, especially in multicellular eukaryotes .
4. The expected interplay between negative pleiotropy, mutation rate, population size and ease of neofunctionalization
It is thought that neofunctionalization is hard to achieve, especially for orthologous proteins involving a build-up of (positive) pleiotropic constraint. This is evidenced by the small fraction of orthologous gene tree lineages showing positive selection [14,15]. Even for duplicates, the neofunctionalization process is dependent upon the waiting time for acceptable beneficial changes, and most duplicates are non-functionalized [16–19]. It is expected that negative pleiotropy is at least partially responsible for the difficulty in neofunctionalizing, given the restrictions to sequence space placed by its constraints.
Negative pleiotropy as an active selective pressure is distinct from neutral loss of a binding interaction, as in the subfunctionalization model of duplicate gene retention . In this case, even before the binding interaction is lost, the duplicate is no longer under selective pressure to bind to an interacting partner so long as the other copy still does. When the interaction between two partners is neutral, there is no restriction on the available sequence space to prevent the re-emergence of the interaction, unlike in the negative pleiotropy case.
When negative pleiotropy is considered, the population genetic underpinnings of neofunctionalization become important. Finding pockets of sequence space that yield neofunctionalized proteins may be dependent upon sampling of variants that contain multiple co-segregating mutations or that find their way through bottlenecks in sequence space. It is expected that organisms with larger mutation rates and higher population sizes would be better able to evolve rapidly in this context. Metazoans with generally low mutation rates and small population sizes would seem to have the most constrained networks with the strongest selective pressures on NOT statements. It might be expected, then, that these regulatory cascades emerged earlier in metazoan evolution and have then been relatively conserved as population sizes decreased in the evolution of chordates. In fact, many regulatory and metabolic pathways are indeed highly conserved and slow-evolving within the chordates. This is evidenced by the conserved domain structures through metazoans of many signalling proteins (for example ; or, more generally, SH2 and SH3 domain trees in Pfam ).
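The dependence on mutation rate and population size can be sketched with a back-of-envelope mutation-supply calculation. This is a crude haploid approximation and an assumption of this sketch, not a model from the paper: it considers only the de novo appearance of one specific multi-site variant in a single step, and ignores drift, recombination and the sequential fixation of intermediates. The population sizes and mutation rates are purely illustrative and are not estimates for any organism.

```python
def expected_new_mutants(pop_size: float, mu_per_site: float, n_sites: int) -> float:
    """Expected carriers of one specific n-site variant arising per generation.

    Crude haploid approximation: each required site mutates independently
    with probability mu_per_site, so the chance per genome is mu ** n_sites.
    """
    return pop_size * mu_per_site ** n_sites


def generations_to_first_appearance(pop_size: float, mu_per_site: float,
                                    n_sites: int) -> float:
    """Approximate waiting time (generations) until the variant first arises.

    Uses 1 / (expected arrivals per generation), floored at one generation.
    """
    return max(1.0, 1.0 / expected_new_mutants(pop_size, mu_per_site, n_sites))


# Hypothetical scenarios (illustrative numbers only).
scenarios = {
    "large population, higher mutation rate": (1e9, 1e-8),
    "small population, lower mutation rate": (1e5, 1e-9),
}

for label, (n, mu) in scenarios.items():
    single = generations_to_first_appearance(n, mu, 1)
    double = generations_to_first_appearance(n, mu, 2)
    print(f"{label}: single-site ~{single:.3g} gen, double-site ~{double:.3g} gen")
```

Under these assumptions the waiting time for a variant that needs two simultaneous changes grows with the inverse square of the mutation rate. The gap between the two scenarios therefore widens sharply when crossing a bottleneck in sequence space requires co-segregating mutations.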
SH2 and SH3 domains will be used as examples in this paper. These are important proteins mediating signalling through specific protein–protein interactions in eukaryotic systems. SH2 domains bind to phosphorylated tyrosines, dependent upon the amino acids surrounding the tyrosine to generate specificity . SH3 domains, which are also found in prokaryotes, also play an important role in signalling specificity, binding to proline-rich sequences in a PPII helical structure with specificity driven by interactions with non-proline residues .
5. Negative pleiotropy in simulated evolution
The hypothesis above assigns negative pleiotropy an important role as an evolutionary constraint on sequence evolution. A sequence simulation framework has previously been developed that evolves sequences in a population with a designated mutation rate, constrained to fold into a given structure and bind to a given ligand. In an evaluation of SH2 sequences that were selected to bind to an original ligand (figure 2), the sequences were evolved under this constraint and the mutations in the next generation of random sampling were evaluated for their ability to also bind a second ligand. Relatively few (but some) sequences would have been specific for this second ligand, but evolution of binding to both ligands is easy in the absence of negative pleiotropy. In this system, it is too easy for neofunctionalization to occur. This ease decreases with the difference in binding energy between the original and new ligands, but, biologically, changes in specificity frequently involve changes between chemically related binding partners (see  for a phylogenetic analysis of SH2 domain-binding specificities from the human proteome).
To evaluate the effect of negative pleiotropy on sequence diversity, SH2 domains were simulated as above with and without negative pleiotropies on binding, and the ultimate sequence diversity that was sampled is compared (figure 3). It is clear that negative pleiotropy for binding inhibits progression through sequence space and ultimately substitution rates. An example of the structural underpinning of this effect is shown in figure 4. This constraint may in fact enforce more compensatory co-variation than would occur in more neutral scenarios. One general rule is that charge seems to play a role in granting specificity, where matched charge–charge interactions complement affinities generated through hydrophobic interactions. An aspect of this simulation that should be noted is that only one binding partner evolved, whereas coevolution occurs in naturally evolving systems.
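The restriction of accessible sequence space can be illustrated with a much simpler calculation than the threading-based population simulation described above. The sketch below is a different technique, and every detail is hypothetical: it exhaustively maps the neutral network reachable by single substitutions from a starting sequence, under a toy contact score, first requiring only binding to a native ligand and then additionally requiring that an off-target ligand is not bound. The alphabet, the scores, the threshold, `NATIVE`, `OFF_TARGET` and `START` are all assumptions made for illustration.

```python
from collections import deque

ALPHABET = "H+-P"  # hydrophobic, positive, negative, polar (reduced alphabet)
THRESHOLD = 5      # minimum toy score that counts as binding (assumption)

NATIVE = "-HHH+"      # hypothetical native ligand pattern
OFF_TARGET = "-HH+H"  # hypothetical ligand that must not be bound
START = "+HH+-"       # starting sequence: binds NATIVE, not OFF_TARGET


def contact(residue: str, partner: str) -> int:
    if residue == "H":
        return 1  # non-specific hydrophobic contribution
    if residue in "+-" and partner in "+-":
        return 2 if residue != partner else -2  # charge match or clash
    return 0


def binds(seq: str, ligand: str) -> bool:
    return sum(contact(r, p) for r, p in zip(seq, ligand)) >= THRESHOLD


def neutral_network(start: str, allowed) -> set:
    """Breadth-first search over single substitutions that keep `allowed` true."""
    if not allowed(start):
        raise ValueError("start sequence violates the constraint")
    seen = {start}
    queue = deque([start])
    while queue:
        seq = queue.popleft()
        for i in range(len(seq)):
            for aa in ALPHABET:
                if aa == seq[i]:
                    continue
                neighbour = seq[:i] + aa + seq[i + 1:]
                if neighbour not in seen and allowed(neighbour):
                    seen.add(neighbour)
                    queue.append(neighbour)
    return seen


positive_only = neutral_network(START, lambda s: binds(s, NATIVE))
with_negative = neutral_network(
    START, lambda s: binds(s, NATIVE) and not binds(s, OFF_TARGET)
)

print("reachable with positive constraint only:", len(positive_only))
print("reachable with added NOT constraint:", len(with_negative))
```

Every path allowed under the stricter constraint is also allowed under the weaker one, so the network reachable under negative pleiotropy is necessarily a subset of the network reachable without it. Unlike the simulations reported here, this sketch has no population, no folding constraint and no coevolving partner.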
The protein design community has also noticed the importance of negative design (negative pleiotropy) in optimizing sequences. It is suggested that negative pleiotropy restricts sequence space, but owing to the metastable nature of energetic distributions of protein folding it does so without a major deviation from optimal energetics [27,28]. In fact, sequences with lower overall affinities are less likely to enable non-specific interactions. Correspondingly, the importance of negative design (in protein design, corresponding with negative pleiotropy in evolution) increases with the contact density, as these are the interactions that are most likely to lead to high-affinity non-specific interactions .
6. Enzymes and negative pleiotropy
Enzymes have tight control over substrate specificity and there are many classic cases of negative pleiotropy among enzymes. Alcohol dehydrogenases in primates include five classes with different physiological roles, and corresponding differences in substrate specificities, expression profiles and kinetic constants . Differences in specificity involve the length of the chain on the alcohol, with preferences for short-chain versus long-chain alcohols. Ethanol is a short-chain alcohol, while retinol is a long-chain alcohol. Retinol is oxidized ultimately to retinoic acid, a transcription factor, when bound by the retinoic acid receptors. These transcriptional activators regulate a number of developmentally important pathways. Specific enzymes have evolved to independently oxidize ethanol and retinol as cross-talk between the pathways is expected to be deleterious, although the physiological extent of cross-talk in various tissues is unclear ([31,32]; Matthew Carrigan & Steven A. Benner 2011, unpublished data and personal communication).
More generally, optimization of binding affinities or kinetic constants can be seen as a quantitative version of a NOT statement (although not formally negative pleiotropy), whereby an enzyme that is too active or binds a necessary partner too tightly will be deleterious and restrict sequence diversity in a similar manner to selective pressures not to bind something. This would be the case when there is a NOT statement against binding and the concentration of a binding partner is very low. However, owing to the metastable nature of evolutionary selection, this effect may be smaller.
While preventing cross-talk is important, negative pleiotropy can drive adaptation as well. In vampire bats, the most basal species, Diphylla ecaudata, has only a single plasminogen activator (inhibiting blood clotting) that can bind mammalian-specific plasminogen activator inhibitor I (PAI-1), enabling it to feed only on bird blood. The common vampire bat, Desmodus rotundus, has evolved plasminogen activator paralogues that have lost PAI-1 inhibition , and this negative pleiotropy for not binding to PAI-1 enables using mammalian blood as a food source. While adaptation is achieved through domain loss in this particular case, it reflects the mechanism by which addition of negative pleiotropy can drive evolution. Processes like this (through whatever mechanism is most easily accessible evolutionarily) may be common in organismal coevolutionary scenarios.
7. SH3 domains and negative pleiotropy
While there are many examples of negative pleiotropy acting on enzymes, Zarrinpar et al. present an equally compelling case for signalling via protein–protein interactions, using SH3 domains. The Saccharomyces cerevisiae proteome contains 27 different SH3 domains, each involved in specific signalling interactions. In this case, the SH3 domain from Sho1 is highly specific for its interaction with Pbs2. Replacing the Sho1 SH3 domain with less specific SH3 domains that introduced cross-talk generated a fitness cost, which was not due to a defect in the native pathway. The SH3 domains that could reconstitute pathway activity were also shown to bind Pbs2 in vitro. Thus, there is biological evidence of a major role for negative pleiotropy in signalling as well as metabolism.
8. Compartmentalization, the intracellular milieu and solution thermodynamics
Eukaryotic organisms have other mechanisms of dealing with negative pleiotropy as well. Compartmentalization keeps proteins that would otherwise interact from encountering each other. When proteins are expressed in different compartments (either in different cell types or in different intracellular compartments within a cell type), they will not experience selection for negative pleiotropy. Proteins expressed at different times will also not be subject to selection for negative pleiotropy.
Another consideration is that binding interactions in vivo are not the same as binding interactions in vitro. The intracellular milieu is extremely dense with proteins that are restricted in their diffusion. This results in excluded volume effects, where effective local concentrations of molecules around one another will be higher, potentially leading to stronger non-specific interactions. In addition to altering the affinities of molecules for each other, however, regional differences in effective local concentrations can also serve as a type of compartmentalization, even in bacterial cells. Ultimately, molecules behave very differently in intracellular environments, but the effects of this on selection for negative pleiotropy are not all unidirectional.
With these caveats, it is still clear that negative pleiotropy, or NOT statements, play a role in governing selective pressures on sequences. It is evident in patterns of substitution and in selective pressures on protein functions. As molecular evolution increasingly integrates principles from systems biology and biophysical chemistry into models, negative pleiotropy will be an important hypothesis to consider.
We thank Matthew Carrigan, Ashley Teufel, Jan Kubelka and Jessica Siltberg-Liberles for helpful discussions. D.A.L. receives funding from NSF DBI-0743374 and from NIH-INBRE award P20 RR016474, which also funds J.A.G. and M.D.M.T.
- Received December 7, 2010.
- Accepted March 21, 2011.
- This Journal is © 2011 The Royal Society