Ecological selection pressures for C4 photosynthesis in the grasses

Grasses using the C4 photosynthetic pathway dominate grasslands and savannahs of warm regions, and account for half of the species in this ecologically and economically important plant family. The C4 pathway increases the potential for high rates of photosynthesis, particularly at high irradiance, and raises water-use efficiency compared with the C3 type. It is therefore classically viewed as an adaptation to open, arid conditions. Here, we test this adaptive hypothesis using the comparative method, analysing habitat data for 117 genera of grasses, representing 15 C4 lineages. The evidence from our three complementary analyses is consistent with the hypothesis that evolutionary selection for C4 photosynthesis requires open environments, but we find an equal likelihood of C4 evolutionary origins in mesic, arid and saline habitats. However, once the pathway has arisen, evolutionary transitions into arid habitats occur at higher rates in C4 than C3 clades. Extant C4 genera therefore occupy a wider range of drier habitats than their C3 counterparts because the C4 pathway represents a pre-adaptation to arid conditions. Our analyses warn against evolutionary inferences based solely upon the high occurrence of extant C4 species in dry habitats, and provide a novel interpretation of this classic ecological association.


INTRODUCTION
The majority of terrestrial plant species use the C3 photosynthetic pathway. However, the efficiency of this process is compromised by photorespiration, and its rate is strongly limited by CO2 diffusion from the atmosphere. Photorespiration increases at low CO2 concentrations and high temperatures, and CO2 limitation is accentuated by the reduction of stomatal aperture under arid conditions (Björkman 1971; Osmond et al. 1982). The evolution of C4 photosynthesis has solved each of these problems via a suite of physiological and anatomical adaptations that concentrate CO2 at the site of carbon fixation, minimize photorespiration and raise the affinity of photosynthesis for CO2 at low mesophyll concentrations (Björkman 1971; Osmond et al. 1982). As a consequence, C4 plants have the potential to achieve higher rates of photosynthesis than their C3 counterparts, particularly at high irradiance (Black et al. 1969). Since C4 photosynthesis draws mesophyll CO2 down to lower concentrations than the C3 type, it also allows stomatal conductance to be reduced, leading to greater water-use efficiency than the C3 pathway under the same environmental conditions (Downes 1969). The C4 pathway is therefore classically viewed as an adaptation to declining levels of atmospheric CO2 (Ehleringer et al. 1991), and to hot, open, arid environments (Björkman 1971; Loomis et al. 1971).
Approximately half of the world's grass species use C4 photosynthesis (Sage et al. 1999a), and these plants dominate grassland and savannah ecosystems in warm climate regions (Sage et al. 1999b). They also include economically important food crops such as maize and sugarcane, and biofuel crops such as switchgrass and Miscanthus. Recent phylogenetic data suggest that the C4 pathway evolved in 9-18 independent clades of grasses during the past 32 million years (Myr) (Christin et al. 2008; Vicentini et al. 2008). However, only the earliest of these evolutionary origins coincided with the major decline in CO2 that occurred during the Oligocene (32-25 Myr ago; Pagani et al. 2005; Christin et al. 2008; Roalson 2008; Vicentini et al. 2008). One phylogenetic analysis suggests that the evolution of the C4 pathway became more likely after the CO2 decrease (Christin et al. 2008), and complementary studies suggest that the C4 origination events were clustered in time (Vicentini et al. 2008), and occurred in grass clades that were already adapted to warm climates (Edwards & Still 2008). However, adaptive hypotheses about the suite of local ecological factors that select for C4 photosynthesis remain largely untested (Roalson 2008). Chief among these are the hypothesized roles of water deficits caused by aridity or salinity, and the formation of open habitats via disturbance (Sage 2001).
The C4 photosynthetic pathway offers grasses the potential to achieve higher rates of leaf carbon fixation with a similar or lower expenditure of water than C3 species (Loomis et al. 1971; Gifford 1974). It also maximizes dry matter production when water is available in limited pulses (Williams et al. 1998), and allows the conservation of water in a drying soil (Kalapos et al. 1996). These physiological benefits are moderated by a trade-off between the photosynthetic rate and the intrinsic water-use efficiency of C4 leaves (Meinzer 2003). However, they are consistent with the common occurrence of C4 grass species in seasonally arid ecosystems, deserts and on saline soils (Sage et al. 1999b). Compelling evidence for the ecological sorting of C4 species into drier habitats than C3 species was provided by a recent comparative study of the largely exotic Hawaiian grass flora (Edwards & Still 2008).
Despite their prevalence in dry habitats, C4 grasses also occupy a diverse range of mesic, shaded and flooded ecological niches, and the primary importance of aridity for the ecological success of these species has therefore been challenged (Ehleringer et al. 1997; Sage et al. 1999b; Keeley & Rundel 2003). Large-scale spatial patterns also highlight a more complex relationship with climate than predicted by water-use efficiency alone, with the biomass of C4 grasses relative to other plant functional types increasing, rather than decreasing, with rainfall across the Great Plains of North America (Paruelo & Lauenroth 1996). In fact, the potential for C4 photosynthesis to drive high rates of productivity means that there are sound theoretical reasons to expect a selective advantage for the pathway in moist soil environments, whenever high temperatures are coupled with moderate-to-high light availability (Long 1999; Sage et al. 1999b; Keeley & Rundel 2003; Sage 2004).
Spatial correlations with environmental variables suggest that some of the observed variation in the ecological niche of C4 grasses may be explained by the contrasts in the tolerance of aridity between different phylogenetic groups (Hartley 1950; Taub 2000). Unravelling the confounding effects of physiology and phylogeny will therefore be crucial if we are to make realistic predictions about the future impacts of increasing aridity on community composition in subtropical grasslands (Christensen et al. 2007), and move towards a greater understanding of the role of palaeoclimate change in driving the expansion of C4 grassland ecosystems in the geological past (Osborne 2008).
The aim of this study is to investigate the ecological selection pressures for C4 photosynthesis in the grasses, using the comparative method to test the alternative hypotheses of adaptation (Harvey & Pagel 1991). Drawing upon a recently published phylogeny (Christin et al. 2008), we have compiled a global habitat dataset for 117 genera of grasses, sampling each of the major clades and 15 independent C4 lineages. Analyses of these data address two key questions. First, which ecological factors have selected for the C4 pathway and, in particular, is it an adaptation to aridity? Second, to what extent is variation in the ecological niches of different C4 plant groups explained by phylogenetic history? Our results are consistent with the hypothesis that selection for C4 photosynthesis occurred in open habitats but was independent of water availability, whereas subsequent evolutionary transitions into arid habitats were faster in C4 than C3 clades.

MATERIAL AND METHODS
(a) Phylogenetic framework
Phylogenetic relationships were based on the calibrated consensus tree of Christin et al. (2008). Species sampling for this tree was designed to include all postulated origins of the C4 photosynthetic pathway within the grasses, and to minimize the distance between the stem group and crown group nodes. The topology was obtained by Bayesian inference using the chloroplast DNA markers rbcL and ndhF, and calibrated using Bayesian molecular dating, with minimum ages for six nodes based on fossil evidence (Christin et al. 2008). Branch lengths are therefore proportional to time elapsed. The grass phylogeny was kindly provided by Pascal-Antoine Christin (University of Lausanne).
Since the complete phylogenetic analysis spanned the entire order Poales, we first extracted the 187 species belonging to the grass family Poaceae. The tree indicated that a number of genera were polyphyletic (e.g. Panicum, Merxmuellera), and these were removed as it was not possible to generate unequivocal trait data for these. One genus that appeared to be paraphyletic (Brachiaria) was combined together with its sister (Urochloa) to form a monophyletic clade. This procedure resulted in a phylogeny of 129 grass genera.
(b) Ecological data
The photosynthetic type (C3 or C4) within each genus was assigned following Sage et al. (1999a). However, a number of genera could not be categorically assigned a photosynthetic type, since they contained C3, C4 and C3-C4 intermediate species (Neurachne, Alloteropsis and Steinchisma). Rather than excluding these genera from the analysis, we assigned photosynthetic type based on the majority of species (Neurachne and Alloteropsis = C4; Steinchisma = C3), and tested the sensitivity of our analyses to this assumption by examining the effects of a reversal in the photosynthetic type for these genera.
Habitat and diversity data were then derived from the information compiled by Watson & Dallwitz (1992 onwards). For each genus, we recorded the number of species and type of habitats occupied, including information on water requirements (e.g. hydrophyte, xerophyte), tolerance of saline conditions (halophyte and glycophyte) and the occupation of shaded habitats (shaded and open). Water requirements were assigned a numerical score, giving equal weighting to the extremes (hydrophyte = 5, helophyte = 4, mesophyte = 3 and xerophyte = 1), and resulting in a continuous sequence of values for each genus. The habitat types occupied by each genus were then characterized using the mean and range of these values. Two further binary traits recorded the presence or absence of shade species, and the presence or absence of xerophytes. Since halophytes tolerate physiological drought imposed via high osmotic pressure, we also included genera containing halophytes in the 'xerophyte' category. However, all of the halophytic genera except one (Spartina) contained xerophytes. Habitat data were not available for all clades, and our final dataset included a total of 117 genera, sampling 15 out of the 17 hypothesized origins of C4 photosynthesis in the grasses (Christin et al. 2008). The full dataset is provided in table S1 in the electronic supplementary material.
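The scoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the example genus are hypothetical, and it assumes one habitat category per record, which is a simplification of however the records in table S1 are actually organized.

```python
# Scores follow the text: hydrophyte = 5, helophyte = 4, mesophyte = 3, xerophyte = 1.
WATER_SCORES = {"hydrophyte": 5, "helophyte": 4, "mesophyte": 3, "xerophyte": 1}


def summarise_genus(habitat_records):
    """Return (mean, range) of water scores for the records of one genus.

    habitat_records is a list of category names, one per record
    (an assumption made for this sketch).
    """
    scores = [WATER_SCORES[name] for name in habitat_records]
    mean_score = sum(scores) / len(scores)
    score_range = max(scores) - min(scores)
    return mean_score, score_range


# Hypothetical genus, not taken from table S1.
water_mean, water_range = summarise_genus(["mesophyte", "xerophyte"])
```

A genus recorded only as mesophytic would have a range of zero, whereas one spanning hydrophytes to xerophytes would have the maximum range of four.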
(c) Phylogenetic comparative analysis
In the first set of analyses we aimed to determine whether photosynthetic pathway is associated with several continuous ecological traits. Photosynthetic pathway was coded as a binary categorical variable (C3 versus C4). The number of species within a genus, and the mean and range of genus water requirements were coded as continuous variables. To test whether these were correlated with photosynthetic pathway, we used a generalized linear model in which the continuous variable was the dependent variable and the photosynthetic pathway a categorical predictor. In order to control for phylogenetic dependence we simultaneously estimated Pagel's λ (Pagel 1999) using the approach described in Freckleton et al. (2002). This parameter measures, and controls for, the degree to which the residual variation shows phylogenetic non-independence according to the predictions of a simple Brownian model of trait evolution. According to this, a value of λ = 0 indicates that there is no phylogenetic dependence in the data, while λ = 1 indicates that the residuals show strong phylogenetic dependence.
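To make the role of λ concrete, the sketch below shows the usual way the parameter is applied: it rescales the off-diagonal entries (shared evolutionary history) of the phylogenetic variance-covariance matrix while leaving the diagonal untouched. The three-taxon matrix and the function name are hypothetical, and the sketch covers only the transformation, not the model fitting or the estimation of λ used in the analysis.

```python
def lambda_transform(vcv, lam):
    """Scale off-diagonal covariances by lam, keeping diagonal variances."""
    n = len(vcv)
    return [
        [vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical tree of depth 10: taxa A and B share 6 time units of history,
# taxon C shares none with either.
vcv = [
    [10.0, 6.0, 0.0],
    [6.0, 10.0, 0.0],
    [0.0, 0.0, 10.0],
]

independent = lambda_transform(vcv, 0.0)  # lambda = 0: no phylogenetic dependence
brownian = lambda_transform(vcv, 1.0)     # lambda = 1: Brownian expectation unchanged
```

Intermediate values of λ give residual covariances between these two extremes.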
(d) Modelling evolutionary pathways
In the second set of analyses, our objective was to model the transitions between C3 and C4 photosynthetic pathways and to determine whether these are associated with transitions between habitat types, specifically shaded versus open habitats, and xeric versus mesic ones. We modelled the evolutionary transitions using approaches described in Pagel (1994, 1999) and Pagel & Meade (2006). In brief, this method is based on a continuous-time Markov model, which models the transitions of discrete characters between states. For a pair of binary traits, there are four possible states (state 1 = 00, state 2 = 01, state 3 = 10, state 4 = 11) and eight parameters, which are the instantaneous rates of change between the states (denoted by q_ij, measuring the rate of change from state i to j), assuming that instantaneously only a single change in one character may occur. The model was fitted using the reversible jump Markov chain Monte Carlo methods described in Pagel & Meade (2006) using the package BAYESTRAITS (http://www.evolution.rdg.ac.uk/BayesTraits.html), and parameters were sampled from their posterior distributions.
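The structure of this four-state model can be illustrated with a short Python sketch that builds the instantaneous rate matrix from the eight permitted rates. It is not the BAYESTRAITS implementation; the function name and the numerical rates are hypothetical. The reading of the state codes (first digit for photosynthetic type, second digit for habitat) is inferred from the rate contrasts used in this study.

```python
# States: 1 = 00, 2 = 01, 3 = 10, 4 = 11.
# Only one character may change at a time, so 1<->4 and 2<->3 are excluded.
ALLOWED = [(1, 2), (2, 1), (1, 3), (3, 1), (2, 4), (4, 2), (3, 4), (4, 3)]


def build_rate_matrix(rates):
    """Build the 4x4 rate matrix Q from a dict mapping (i, j) to q_ij."""
    q = [[0.0] * 4 for _ in range(4)]
    for i, j in ALLOWED:
        q[i - 1][j - 1] = rates[(i, j)]
    # Each diagonal entry is minus the sum of the other entries in its row,
    # so that every row of Q sums to zero.
    for i in range(4):
        q[i][i] = -sum(q[i][j] for j in range(4) if j != i)
    return q


# Hypothetical rates for illustration only (not estimates from this study).
rates = {pair: 0.1 for pair in ALLOWED}
rates[(1, 3)] = 0.5
Q = build_rate_matrix(rates)
```

Fixing the rates for (3, 1) and (4, 2) at zero in this sketch leaves six free parameters, which corresponds to a model without reversals in the first character.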
In the first analysis, we wished to test whether transitions between C3 and C4 pathways were dependent on habitat openness. Thus, each genus was coded as either exclusively confined to open habitats (0) or sometimes/always occupying shaded habitats (1), and as being C3 (0) or C4 (1). We fitted the full model allowing for all single-step transitions between the states. In order to test the hypotheses concerning the rates of evolution between the states, we conducted three comparisons: firstly, we asked whether the rate of transition between C3 and C4 differed between open and shaded habitats (by contrasting rates q_13 and q_24). Secondly, we asked whether the rate of transition from open to shaded habitats differed between C3 and C4 lineages (by contrasting q_12 and q_34). And finally, we asked whether the transition from shaded to open habitats differed between C3 and C4 lineages (by contrasting q_21 and q_43).
In the second analysis, we tested whether the transitions between C3 and C4 pathways were accompanied by changes in the aridity of occupied habitat. Each genus was coded as being either exclusively confined to waterlogged/mesic habitats (0) or sometimes/always occupying xeric/saline habitats (1), and again we fitted a full model including eight parameters. From the posterior distribution of parameter estimates, we compared the distributions of the estimates of rates of transition from C3 to C4 in xeric and mesic habitats. Again, we used the fitted parameters to test three hypotheses: firstly, we asked whether the rate of transition between C3 and C4 pathways differed in mesic and xeric habitats (by contrasting q_13 and q_24). Secondly, we asked whether the rate of transition from mesic to xeric habitats differed between C3 and C4 lineages (by contrasting q_12 and q_34). And finally, we asked whether the transition from xeric to mesic habitats differed between C3 and C4 lineages (by contrasting q_21 and q_43).
To contrast q_ij and q_kl, for each model in the posterior distribution we calculated the difference q_ij − q_kl. For the whole set of models in the posterior distribution, we then examined the distribution of values of these differences to determine whether there were systematic deviations from zero. These differences are presented in the supplementary information together with the estimated parameters for all models (see table S2 in the electronic supplementary material).
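The contrast between two rates can be sketched as follows. The paired samples are hypothetical, and summarizing the differences by their mean and by the fraction that are positive is an assumption made for illustration; the summaries actually reported are those in table S2.

```python
def contrast_rates(samples_a, samples_b):
    """Summarize paired differences between two rates across posterior samples.

    Returns the mean difference and the fraction of samples in which the
    first rate exceeds the second.
    """
    diffs = [a - b for a, b in zip(samples_a, samples_b)]
    mean_diff = sum(diffs) / len(diffs)
    frac_positive = sum(d > 0 for d in diffs) / len(diffs)
    return mean_diff, frac_positive


# Hypothetical posterior samples of two rates (illustrative values only).
q13_samples = [0.9, 1.1, 0.8, 1.2]
q24_samples = [0.2, 0.3, 1.0, 0.1]
mean_diff, frac_positive = contrast_rates(q13_samples, q24_samples)
```

A fraction close to 0.5 would indicate no systematic difference between the two rates, whereas values near 0 or 1 would indicate that one rate is consistently larger than the other.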
The possibility of evolutionary reversals from the C4 pathway to the C3 type remains a key area of uncertainty in phylogenetic models. Phylogenetic analyses of the numerous C3 and C4 clades in the subfamily Panicoideae suggest that the hypotheses of multiple evolutionary origins and/or reversions are equally parsimonious (Giussani et al. 2001) and, in the genus Alloteropsis, a C4 to C3 reversal is the single most parsimonious interpretation (Ibrahim et al. 2009). Although the convergent evolution of amino acid sequences in a C4-specific enzyme does provide compelling evidence for multiple C4 origins in this grass subfamily (Christin et al. 2007), phylogenetic analyses still indicate a high likelihood of reversion events in the Panicoideae (Vicentini et al. 2008).
However, one issue of concern in such analyses is that, when analysing the evolution of a binary trait, if one of the trait states has a higher speciation rate, reconstructions can appear to support enhanced rates of reversals from rare to common states (Maddison 2006), and this problem affects the method used here. We note below that we find evidence consistent with higher rates of diversification in C4 grass clades, raising the possibility of a non-random distribution of extinction probabilities across C3 and C4 lineages.
Clearly, the issue of reversible transitions between photosynthetic pathways is contentious and must be considered in ecological models of C4 grass evolution. We therefore conducted two sets of analysis to consider the sensitivity of our results to this. In the first instance, we conducted the analysis as described above, including the possibility of reversions. We then re-analysed the data, prohibiting reversals from C4 to C3. This constrained model included six rather than eight parameters. We also asked two further questions using the full, eight-parameter models: if C4 to C3 reversals are possible, do they depend on shading, and do they depend on aridity (q_31 versus q_42 in each case)?
RESULTS
(a) Comparative analysis
Species number is significantly higher within C4 than C3 genera (table 1; figure 1a), and the range of habitat water requirements within each genus is significantly greater for the C4 than the C3 type (table 1; figure 1b). Species number is 33 per cent greater in C4 compared with C3 genera (figure 1a), while the range of habitat water requirements almost doubles (increasing by 85%; figure 1b). Neither shows significant phylogenetic dependence (λ = 0; table 1). However, there is a significant linear association between species number and the range of habitat water requirements (F(1,90) = 26.32, p = 1.7 × 10^-6). The range of habitats occupied within each genus explains about a quarter of its species number (R^2 = 0.22). Critically, the introduction of photosynthetic type as a categorical predictor does not significantly improve the fit of this statistical model to the data (F(2,90) = 1.88, p = 0.17). This means that the observed association between species number and photosynthetic type may be entirely due to habitat diversity rather than a direct effect of C4 photosynthesis per se. In other words, C4 genera occupy a wider range of habitats and this, in turn, is associated with a larger number of species per genus.
The mean habitat water requirement is significantly lower in C4 than C3 genera (table 1; figure 1c), and shows a strong, statistically significant phylogenetic dependence (λ → 1; table 1). Therefore, C4 genera occupy a wider range of drier habitats than their C3 counterparts, but different clades of grasses differ markedly in their habitat water requirements. These results are insensitive to the assumptions made about photosynthetic pathway in the genera Neurachne, Alloteropsis and Steinchisma. Figure 2 summarizes the rates of evolutionary transitions between states, considering the phylogenetic tree as a whole, and all of the postulated origins of C4 photosynthesis. The rate estimates are summarized in table S2 in the electronic supplementary material, together with the credible intervals based on the distribution of rate estimates in the posterior. All of these results are insensitive to the assumptions made about photosynthetic pathway in the genera Neurachne, Alloteropsis and Steinchisma.

(b) Evolutionary transitions
Evolutionary transitions from C3 to C4 photosynthesis are significantly faster in grass clades confined to open habitats (i.e. q_13 > q_24; figure 2a,c), and this result is robust to assumptions about the possibility of reversions from C4 to C3 photosynthesis (figure 2a versus figure 2c). The same analysis shows that grass clades occupying shaded habitats are significantly more likely to become confined to open habitats if they are C4 than C3 (i.e. q_43 > q_21; figure 2a,c). However, the rate of evolutionary transitions from open to shaded habitats is independent of photosynthetic type, and C3 and C4 species are therefore equally likely to adapt to shade (i.e. q_12 = q_34; figure 2a,c). Again, these results are robust to the assumptions made about C4 to C3 reversions (figure 2a versus figure 2c). If C4 to C3 reversals are possible, they occur at the same rate (are equally likely) in open and shaded habitats (i.e. q_31 = q_42; figure 2a).
The likelihoods of ancestral character states at each node in the phylogeny are shown in figure 3, with a key to genera provided in figure S1 in the electronic supplementary material. The model indicates with a high posterior probability that the last common ancestor of the Poaceae was a C3 shade species (figure 3, node A). It also illustrates the most likely evolutionary pathway to C4 photosynthesis, whereby a transition into open habitats was a necessary pre-condition for the origin of the C4 pathway. For example, the model shows with high likelihood that the last common ancestors of the C4 clades Chloridoideae (figure 4, node B) and x = 10 Paniceae (figure 4, node C) were confined to open habitats. However, the open habitat reconstructions for last common ancestors of the C4 clades Andropogoneae (figure 4, node D) and the 'main clade' of x = 9 Paniceae (figure 4, node E) have lower associated probabilities.
Unexpectedly, evolutionary transitions from C3 to C4 photosynthesis occur at the same rate (are equally likely) in grass clades that contain xerophytic or halophytic species, and those confined to mesic or waterlogged habitats (i.e. q_13 = q_24; figure 3b,d). However, the rate/likelihood of evolutionary transitions from mesic to xeric habitats is significantly higher in C4 than in C3 grass clades (i.e. q_34 > q_12; figure 3b,d). By contrast, species are equally likely to become confined to mesic or waterlogged habitats if they are C3 or C4 (i.e. the rate of evolutionary transition from xeric to mesic habitats is independent of photosynthetic type, q_21 = q_43; figure 3b,d). As in the previous analysis, these results are robust to the assumptions made about the possibility of C4 to C3 reversions (figure 3b versus figure 3d). If C4 to C3 reversals are possible, they depend significantly on habitat water availability, and evolutionary reversion is significantly faster/more likely in mesic or waterlogged habitats than xeric ones (i.e. q_31 > q_42; figure 2b).
Table 1. Results of generalized linear models testing for an association between photosynthetic pathway (C3 or C4) and species number or habitat characteristics. ('Species number' indicates the total number of species within each genus. 'Water range' and 'water mean' refer to the range and mean of habitat water categories, taken across all of the species within each genus. The results show the F-ratio, degrees of freedom (d.f.) and significance level (p-value) for photosynthetic pathway as a categorical predictor in each model. Pagel's λ estimates the degree of phylogenetic dependence in the data, ranging from 0 (no dependence) to 1 (strong dependence).)
The second model of ancestral character states is shown in figure 4 (key to genera in figure S1 in the electronic supplementary material), and indicates that the most likely common ancestor of the Poaceae was a C3 species confined to mesic habitats (node A). It also illustrates important contrasts between clades in the habitat where the C4 pathway evolved. For example, the model shows with a high probability (greater than 80%) that the last common ancestors of the C4 clades Chloridoideae (figure 4, node B), 'Arundinelleae' (figure 4, node F) and the main clade of x = 9 Paniceae (figure 4, node E) occupied xeric habitats, whereas ancestors of the Andropogoneae (figure 4, node D), the x = 9 Paniceae clade containing Echinochloa and Alloteropsis (figure 4, node G) and x = 10 Paniceae (figure 4, node C) were more likely confined to mesic habitats (probability greater than 80%). This contrast in the ancestral state of independent C4 clades illustrates how the phylogenetic correlation in mean habitat water requirements may arise (table 1).

DISCUSSION
(a) Ecological selection
Our three complementary analyses provide robust statistical support for the following adaptive hypothesis of C4 pathway evolution in the grasses. Selection for C4 photosynthesis occurs in open habitats, but may take place in mesic, arid or saline conditions. Once the pathway has evolved, C4 lineages adapt to arid and saline habitats at a faster rate than C3 lineages, and are more likely to become confined to open environments; C4 photosynthesis in the grasses therefore represents a pre-adaptation (exaptation) to xeric conditions. However, evolutionary transitions into shaded and mesic habitats are independent of photosynthetic type. If reversals from the C4 to C3 type occur, they do so in mesic or waterlogged habitats, irrespective of the habitat light regime. The net result of these evolutionary processes is that extant C4 genera occupy a drier range of habitats than their C3 counterparts. This association of photosynthetic pathway with aridity in extant genera may interact with temperature, but we were unable to test this with our dataset. Seasonal aridity, fire, the activity of large mammalian herbivores and edaphic factors increase the availability of open habitats through the reduction of woody plant cover (Sankaran et al. 2008). Our data are therefore consistent with the hypothesis that these factors raise the likelihood of C4 pathway evolution in the grasses (Sage 2001). The strong statistical dependence of C4 pathway evolution on habitat openness is also consistent with the environmental responses of photosynthesis in extant C3 and C4 grasses: temperature and irradiance are greater in open than shaded environments, especially in the period after a disturbance event (Knapp 1984), which enhances the advantage of C4 photosynthesis for CO2 fixation over the C3 type (Black et al. 1969; Björkman 1971).
Our finding that shade adaptation is independent of photosynthetic type is therefore surprising, especially since C4 grasses are virtually absent from the deep shade of forest floor environments (Sage 2001). However, the shade beneath trees in tropical woodlands and savannahs is associated with high soil moisture and nutrient contents, and the tolerance of low irradiance gives grasses the opportunity to exploit these soil resource patches (Ludwig et al. 2001).
The analysis of evolutionary transitions across the whole grass phylogeny provides no statistical evidence for an overall dependence of C4 pathway evolution on aridity. However, it does not exclude the possibilities that (i) arid or saline conditions may select for C4 photosynthesis in some grass clades (e.g. Chloridoideae) and not others (e.g. Andropogoneae) or (ii) high evaporative demand and soil drying between episodic rainfall events (Williams et al. 1998) or after fire (Knapp 1984) may be important selection pressures for C4 photosynthesis in mesic habitats. A previous comparative analysis suggested that the C4 pathway has evolved in grass clades of warm climate regions (Edwards & Still 2008), where high rates of evaporation and shallow rooting systems may lead to leaf water deficits of −1.5 MPa, even when the soil is wet (Le Roux & Bariac 1998). Although these adaptive interpretations are possible, they are not necessary, because our finding that C4 photosynthesis is a pre-adaptation to arid conditions is strongly supported across the whole phylogenetic tree. It is consistent with the well-known association between photosynthetic pathway and leaf water consumption (e.g. Black et al. 1969; Downes 1969). However, it warns against adaptive inferences based solely upon correlations in extant species between photosynthetic pathway and habitat aridity, such as those observed in our data (table 1) and by previous authors (Edwards & Still 2008).
Figure 4. Likelihood of alternative ancestral states for nodes in the phylogenetic tree, showing (a) photosynthetic pathway (yellow circles, C4; blue circles, C3) and (b) preference for habitat aridity (yellow circles, xeric; blue circles, mesic). See figure S1 in the electronic supplementary material for key to genera. Ancestral values were computed for individual traits using the likelihood method of Pagel (1994) and phylogenies drawn using the ace and plot.phylo functions in APE (Paradis et al. 2004).
(b) Diversity and data quality
The association between species number and the range of habitats occupied by each genus could arise for a number of reasons. First, the origin of C4 photosynthesis may represent a 'key innovation' (Hunter 1998) that stimulates evolutionary diversification by increasing the rate of transition into xeric niches compared with the C3 type. In this case, ecological selection is implicated in both the origins of C4 photosynthesis and subsequent diversification within C4 grass clades. However, it is important to note that, while the number of species and range of habitats may on average be larger within each C4 than C3 genus, this does not mean that C4 grasses occupy a wider range of habitats overall. A second possible explanation for the observed correlation is sampling bias. If the sample of C4 grasses is biased towards large genera, then the wider habitat range could be a statistical artefact arising from the greater probability of encountering species from different habitats in large samples. Testing these alternative explanations will require phylogenetic measures of diversification rates, rather than the genus-based approach used here. This is because different genera may have begun to diverge at different times, and genus size depends crucially on the attention paid to each group by taxonomists.
The habitat data used in our analysis are simple, qualitative characterizations of the ecology of each genus. However, despite the basic nature of this information, we still found strong associations between photosynthetic pathway and habitat, with highly significant statistical support. The qualitative agreement between the three different analyses lends further confidence to our findings.
While it is possible that the phylogeny may have biased sampling via the selection of species whose phylogenetic position is important, but whose ecology is atypical, this should have been counteracted by the explicit consideration of branch lengths in our analysis. A final sampling issue arises from our use of binary habitat traits, which potentially underestimate habitat diversity in large genera. However, the strong positive correlation between the range of water requirements and species number in each genus suggests that this did not bias our findings.
Our analysis suggested that the distribution of traits is consistent with the possibility of reversions from C4 to C3 types. This echoes the findings in other analyses (Ibrahim et al. 2009; Vicentini et al. 2008); however, we should be cautious about this conclusion at this stage. As noted previously, if we analyse traits that shape the phylogeny via speciation (or extinction) rates, then the outcome of the analyses can be misleading. The problem described by Maddison (2006) would arise in our dataset if the rate of speciation were greater in species with one photosynthetic pathway than the other, and the result in figure 1a indicates that this may have been the case, subject to the caveats above.

CONCLUSIONS
We have sought statistical evidence for an adaptive hypothesis of C4 pathway evolution in the grasses. Our analyses are consistent with the hypothesis that selection for C4 photosynthesis requires open environments, but indicate that the high occurrence of C4 clades in dry habitats arises because the pathway is a pre-adaptation to xeric conditions. These results provide a novel interpretation of the classic association of C4 plants with arid environments.