Many of the macroevolutionary processes that have shaped present-day phylogenetic patterns were caused by geological events such as plate tectonics and temporary land-bridges. The study of spatial patterns of phylogenetic diversity can provide insights into these past events. Here we focus on a western Mediterranean biodiversity hotspot located in the southern Iberian Peninsula and northwest Africa, two regions that are separated by the Strait of Gibraltar. We explore the spatial structure of the phylogenetic relationships within and across large-scale plant assemblages. Significant turnover in terminal lineages tends to occur between landmasses, whereas turnover in deep lineages tends to occur within landmasses. Plant assemblages in the western ecoregions of this hotspot tend to be phylogenetically overdispersed but are phylogenetically clustered on its eastern margins. We discuss our results in the light of potential scenarios of niche evolution (or conservatism) and lineage diversification. The significant turnover between landmasses suggests a common scenario of allopatric speciation that could have been facilitated by the intermittent joining of the two continents. This may have constituted an important stimulus for diversification and the emergence of this western Mediterranean biodiversity hotspot.
The field of ecophylogenetics, which can be viewed as the merging of community ecology, biogeography and macroevolution, has been rapidly expanding over the past decade . However, most work has been conducted on local-scale communities  and less attention has been paid to understanding phylogenetic structures within or across regional species pools (but see [3–6]). At local scales, non-random phylogenetic structure patterns have been mostly interpreted to be the result of ecological sorting processes such as competition [7–9], habitat filtering [10,11] and facilitation [12,13]. At larger geographical scales, macroevolutionary and historical processes (in addition to ecological sorting) are also thought to have shaped the phylogenetic structure of regional assemblages. In particular, past biogeographical events such as orogenies and massive climate changes have affected many of the macroevolutionary processes, and hence have shaped the phylogenetic relationships between extant species [14,15]. One type of biogeographical event that has long fascinated biogeographers and evolutionary biologists is the intermittent connections that have occurred between different regions or continents (the so-called land-bridges [16–18]), which have probably triggered species diversification in numerous plant and animal lineages [19–22].
The southern part of the Iberian Peninsula (Andalusia) and northwest Africa (north Morocco; figure 1) are home to exceptional levels of plant diversity  and account for 18% of the Mediterranean Basin's plant richness . This region is divided between the Iberian and African tectonic plates, whose intermittent connection over geological time (e.g. the Mediterranean desiccation during the Messinian salinity crisis ) has allowed biotic exchanges between both landmasses to take place [26,27]. Thus, the Neogene geological history of this region could have had a profound impact on species distributions and species diversification in certain lineages, and probably contributed to the emergence of this western Mediterranean biodiversity hotspot .
Botanists have long been aware of the floristic links between the southern Iberian Peninsula (Andalusia) and northwest Africa (north Morocco) [23,28]. More recently, Molina-Venegas et al.  found that plant assemblages in the western ecoregions of northern Morocco are more similar to those of Andalusia than to other nearby northern Morocco ecoregions, thereby suggesting that the Strait of Gibraltar acted as a route for plant migration. On the other hand, Lavergne et al.  found a significant relationship between the migration probability across the Strait and certain species’ life-history and dispersal traits, which suggests that the Strait also acted as an effective biotic filter, only allowing migration and gene flows in certain lineages, and triggering allopatric speciation in others. However, this analysis was restricted to only 566 species and subspecies occurring in two narrow ecoregions lying on both sides of the Strait of Gibraltar. This limitation strongly limits the extent to which their conclusions can be extrapolated to the whole hotspot and the over 4000 species and subspecies it harbours.
Our first goal was to explore the large-scale phylogenetic structure across plant assemblages (phylogenetic beta diversity, PBD) of southern Iberian Peninsula (Andalusia) and northwest Africa (north Morocco), to unravel past biogeographical and macroevolutionary processes that shaped plant biodiversity in this western Mediterranean hotspot. To do so, PBD and taxonomic beta diversity (TBD) were considered in tandem. TBD is the change in species composition across any two areas, while PBD is defined as the phylogenetic distance (branch lengths) between species assemblages across the same areas . For example, a scenario of repeated splits in recently coexisting ancestors between two areas would result in a high TBD between assemblages across the areas and a lower PBD (figure 2).
However, widely used indices for PBD incorporate different aspects of assemblages’ evolutionary history that are often confounded. PBD can be broken down into two components: ‘true’ phylogenetic turnover and phylogenetic nestedness [30,31]. The nestedness component of PBD is related to differences in phylogenetic diversity (PD) between species assemblages (sensu ), whereas the ‘true’ turnover component is the fraction of PBD explained without increasing the PD between assemblages (figure 3). Thus, a low value of ‘true’ phylogenetic turnover will indicate a trend towards the segregation of close relatives between assemblages (figure 3, top-left panel); a high ‘true’ phylogenetic turnover will indicate a trend towards the segregation of deep lineages between assemblages (figure 3, bottom-left panel). By contrast, a high nestedness component of PBD will also indicate a trend towards the segregation of deep lineages, but also an increase in the PD between plant assemblages (figure 3, bottom-right panel). Thus, estimating the PBD without taking into account both components of beta diversity may lead to an overestimation of the ‘true’ spatial turnover of lineages if the two compared assemblages have contrasting levels of PD .
In addition, we considered phylogenetic structure within assemblages (phylogenetic alpha diversity) jointly with PBD, in order to capture additional processes shaping spatial patterns of species diversity. At the local scale, a non-random pattern in phylogenetic alpha diversity may arise through a process known as habitat filtering  if locally coexisting species share adaptations to particular environmental conditions and are phylogenetically clustered compared with a regional species pool, or if distantly related coexisting species have converging key adaptations (causing phylogenetic overdispersion). At large geographical scales, ‘habitat filtering’ can be thought of in terms of differential expansion of species’ geographical ranges, according to species’ suitability to the overall environmental conditions of a study region . This large-scale ‘habitat filtering’ is expected to be more determinant for differential expansion of species’ geographical ranges in regions of high environmental heterogeneity and more stable climatic conditions  such as the southern Iberian Peninsula and northwest Africa, which are thought to have acted jointly as refugia during the Neogene and the Pleistocene [35,36]. Finally, differential diversification rates among lineages could also produce phylogenetic clustering if diversifying clades tend towards occupying restricted geographical regions . Such a diversification pattern is expected under a scenario of environment-linked diversification processes through particular combinations of adaptive traits in diversifying lineages .
2. Material and methods
(a) Study area
Andalusia (south Iberian Peninsula) and northern Morocco (northwest Africa) are geomorphologically heterogeneous areas characterized by high mountain ranges (i.e. the Baetic–Rifan complex, up to 3482 m.a.s.l.) surrounded by extensive lowlands of alluvial origin that have been shaped by the rivers Guadalquivir (Andalusia) and Sebou (north Morocco; figure 1). These two landmasses are separated by the Mediterranean Sea, which is around 14 km at its narrowest point (Strait of Gibraltar) and over 150 km at the easternmost end of the study area. The region's current Mediterranean-type seasonal precipitation rhythm has prevailed since around 3.6 Ma , but is clearly affected by the proximity of the Atlantic Ocean .
(b) Floristic dataset
We used the dataset of all the angiosperm plants occurring in Andalusia and northern Morocco in Molina-Venegas et al. , which gave a site-by-species matrix of 48 sites (ecoregions) with 4332 species and subspecies (see Molina-Venegas et al.  for full details on the floristic dataset compilation; see electronic supplementary material, appendix S1 for details on the lineages occurring in the study area and the species numbers they comprise). An ecoregion can be defined generally as a territory characterized by the existence of homogeneous ecological systems involving interrelationships among organisms and their environment . A territorial subdivision of this type, based on natural physiographic boundaries, has been used in a number of recent studies and is appropriate for different analytical methods (e.g. [6,40]).
(c) Phylogenetic trees
In order to account for the phylogenetic relationships between the species and subspecies in the dataset, we used the genus-level time-calibrated phylogeny described by Molina-Venegas & Roquet . This phylogeny was inferred using a maximum-likehood approach with a multilocus supermatrix of chloroplastic and nuclear DNA sequences (rbcL, matK, ndhF, trnl-F, ITS1 and ITS2) available in GenBank (accessed in September 2011) following the pipeline of Roquet et al. . Node support was estimated using bootstrap values (70% of nodes showed a bootstrap support greater than 70), and nodes with values less than 50% were collapsed into soft polytomies (see Molina-Venegas and Roquet  for further details on the phylogenetic procedure).
In order to account for the possible influence of intra-genus phylogenetic uncertainty on phylogenetic structure metrics, we randomly resolved genus-level polytomies by applying a Yule-Harding branching process with constant birth rates (n = 100). The algorithm assigns to each node the same probability of splitting in two lineages, resulting in a balanced topology . Subspecies were constrained to split within their respective species.
(d) Statistical analysis
(i) Phylogenetic beta diversity
In order to test for non-random patterns in PBD among plant assemblages, we decomposed the observed PBD between the turnover and the nestedness components of beta diversity using the betadecompo R function and the PhyloSor index . This procedure allows us to distinguish between differences in PBD arising from (i) differences in PD (sensu , nestedness component) or from (ii) those that are due to the ‘true’ turnover of lineages. This function uses the PhyloSor index as a distance metric as it computes the fraction of non-shared branch length between assemblages (ranging from 0 to 1). Previous empirical studies emphasized that TBD and PBD may be highly correlated [44,45]. Thus, we determined whether the turnover and nestedness components of PBD among plant assemblages were lower or higher than expected given the observed TBD by calculating the standardized effect size (SES) scores of both components (SESturn and SESness) independently as 2.1where Xobs is the observed score between two ecoregions, and mean (Xnull) and s.d. (Xnull) are the mean and standard deviation of a null distribution of scores generated by shuffling taxa labels of the site-by-species matrix 999 times. It is important to note that this shuffling scheme randomizes phylogenetic relationships among the regional species pool but holds species richness and TBD constant between assemblages, thereby allowing us to discard the effect of the different number of species in each plant assemblage . We assessed the statistical significance of the SES scores between ecoregions by calculating two-tailed p-values (quantiles) as 2.2where rankobs is the rank of the observed scores compared with those of their null distributions, and ‘runs’ is the number of randomizations . SES scores with p < 0.05 and p > 0.95 were considered to be significantly lower and higher than expected for the given TBD, respectively. We conducted the calculations across 100 phylogenetic trees after randomly resolving intra-genus phylogenetic uncertainty (see ‘Phylogenetic trees’ section), then considered an SES score as significant when the fraction of p-values (n = 100) below and above the threshold value (0.05 and 0.95, respectively) was > 95%.
We considered three groups of pairwise comparisons in terms of the geographical location of the ecoregions (i) across ecoregions of Andalusia, (ii) across ecoregions of northern Morocco and (iii) across ecoregions between the two landmasses. We tested for non-random distributions of the significant pairwise comparisons within each group by conducting χ2-tests (significance threshold value at 0.05).
(ii) Phylogenetic alpha diversity
In order to quantify phylogenetic structure within plant assemblages, we estimated PD (sensu ) and the mean phylogenetic distance (MPD)  of the species inhabiting each of the ecoregions. We tested for non-random patterns in the PD and MPD by estimating their SES scores (SESpd and SESmpd, respectively) as in equation (2.1), where Xobs is the observed score within an ecoregion, and mean (Xnull) and s.d. (Xnull) are the mean and standard deviation of a null distribution of scores generated by shuffling taxa labels of the site-by-species matrix 999 times. We conducted the calculations across 100 phylogenetic trees after randomly resolving intra-genus phylogenetic uncertainty (see ‘Phylogenetic trees’ section), then extracted the arithmetic mean of the distribution of SESpd and SESmpd scores.
There is some evidence to suggest that estimates of the phylogenetic structure of plant assemblages containing a deep and unbalanced initial bifurcation in the phylogeny (e.g. angiosperms–gymnosperms, eudicots–monocots) are very sensitive to the ratio of the species sustained by each bifurcation . Thus, in order to test for the consistency of our phylogenetic structure estimates, we repeated all the analyses after removing all non-eudicot taxa from the dataset (n = 796, 18% of the species and subspecies in the dataset).
(iii) Climatic and spatial variables
We tested whether climatic conditions are associated with variation in the phylogenetic structure of the plant assemblages in the study area. For the climatic characterization of the ecoregions, we used the same 18 climatic variables as Molina-Venegas et al. . We considered precipitation- and temperature-related variables as two independent subsets of climatic variables. First, we explored the correlation coefficients among the climatic variables and arbitrarily selected those with Spearman's correlation coefficient r < 0.90 for precipitation- and temperature-related variables, respectively. We finally selected (i) total monthly precipitation, (ii) standard deviation of monthly precipitation, (iii) total precipitation of the driest month, (iv) mean monthly maximum temperature and (v) standard deviation of monthly maximum temperature. These subsets of precipitation- and temperature-related variables are representations of the climatic regime of the study area, which is characterized by a marked seasonality in the distribution of precipitation and extreme high temperature values during the dry season [28,48]. We standardized all variables using Gower's standardization of quantitative variables (division by the range). Given that climatic variables had a high degree of collinearity, we conducted separate principal components analyses for the temperature- and precipitation-related variables to reduce their dimensionality, and only used the first principal component of each analysis (PCA1). Finally, we used the geographical coordinates of the centroids of ecoregion polygons as an estimate of their spatial location.
Here, we use current climatic variation across the region rather than past climatic conditions. Nevertheless, former palaeoclimatic studies show that despite strong historical variation in absolute values in precipitation and temperature, the climatic gradient in the study region has been quite similar to that found nowadays at least since the Pliocene .
(iv) Regression analyses
First, we quantified whether temperature- and precipitation-related variables are associated with the phylogenetic structure within plant assemblages. First, we fitted ordinary least-squares models (OLS) between the SESpd and SESmpd scores and both PCA1 subsets of climatic variables, and sought spatial autocorrelation in the phylogentic structure metrics by visually exploring Moran's I spatial correlograms of the residuals of the OLS models (see electronic supplementary material, appendix S2 for details on the procedure). Because the data showed strong spatial autocorrelation, we fitted spatial error simultaneous autoregressive models (SARerr) to incorporate the autocorrelation structure of both phylogenetic structure metrics in the OLS models . Although changes in coefficients between spatial and non-spatial methods may be largely idiosyncratic of the method used , SAR models has been empirically demonstrated to be among the best-performing methods for spatial data analysis in terms of Type I and Type II error rates . Specifically, the SARerr model is preferable to other SAR models when dealing with spatially autocorrelated species distribution data as it generates more accurate estimates of regression model coefficients and performs well for all SAR model assumptions . We arbitrarily defined the neighbourhood structure of each ecoregion (two-dimensional coordinates) using the minimum Euclidean distance required to connect each location to at least one neighbouring location, and weighted neighbouring locations using the row-standardization method. We used the Nagelkerke pseudo R2 as an estimate of the goodness-of-fit of the SARerr models.
All analyses were carried out in R v. 3.1.2  with the packages PICANTE , APE , VEGAN , SPDEP , NCF  and the betadecompo R function . The extraction of the geographical coordinates was carried out in ESRI ArcMap v. 9.3.
(a) Phylogenetic beta diversity
The significant pairwise comparisons (i.e. those differing from expectation due to the TBD) of both components of PBD were non-randomly distributed across the study area. Lower-than-expected standardized scores of the turnover component (SESturn) were found mainly across ecoregions between landmasses (χ2 = 22.54, n = 26), while higher-than-expected SESturn scores were more numerous within landmasses (χ2 = 50.19, n = 170; figure 4). By contrast, lower-than-expected standardized scores of the nestedness component (SESness) were randomly distributed (χ2 = 1.64, n = 26), while higher-than-expected scores occurred mainly across ecoregions between landmasses (χ2 = 172.70, n = 393). Interestingly, the removal of non-eudicot plants from the analyses resulted in a positive reinforcement of the SESturn pattern between landmasses, but a negative reinforcement within landmasses (figure 4).
(b) Climatic variables
The first axis of the PCAs for temperature- and precipitation-related variables explained 65% and 69% of the variance, respectively. Positive values were related to high maximum temperatures values and to high total monthly precipitation and standard deviation of monthly precipitation, whereas negative values were related to high intra-annual variability in maximum temperature values and relatively high precipitations during the driest month. Overall, the PCA1 scores for temperature-related variables, and particularly those for the precipitation-related variables, tend to decrease eastward (see electronic supplementary material, appendix S3). On the other hand, the PCA1 scores for temperature-related variables tend to increase from north to south.
(c) Phylogenetic alpha diversity
When considering all angiosperms as a whole, the relationship between SESmpd and the PCA1 of the precipitation-related variables was strong and positive (R2 = 0.50, p < 0.001; table 1 and figure 5), whereas the effect of the temperature-related variables was non-significant. On the other hand, the relationship between SESpd and the climatic-related variables was also quite strong, but non-significant. When considering only eudicots in the analysis, the relationship between either SESmpd or SESpd and the PCA1 of the precipitation-related variables was non-significant, whereas the relationship between either SESmpd or SESpd and the PCA1 of the temperature-related variables was strong and negative (R2 = 0.53, p < 0.01 and R2 = 0.55, p < 0.001 respectively).
The incorporation of phylogenetic information in the study of species distributions and community assembly can provide valuable insights into the biogeographical and ecological processes that generate biodiversity patterns [3,5], as these processes leave tractable imprints on present-day phylogenetic patterns [4,29,58]. Here, we found that phylogenetic structure patterns of plant assemblages across two continental landmasses (i.e. southern Iberian Peninsula and northern Morocco) and across ecoregions within each land mass are non-randomly structured. This suggests that the particular geographical and historical settings of this region in the western Mediterranean may have constituted an important stimulus for the evolutionary diversification and the emergence of a major biodiversity hotspot .
(a) Phylogenetic structure between landmasses
A primary finding of our study is that both components of PBD showed a significant spatial structure. Lower-than-expected phylogenetic ‘true’ turnover mainly occurred across ecoregions between landmasses, which implies a segregation of closely related taxa between the two continents (figure 4). We believe that this pattern could have emerged because of repeated splits between the two landmasses in recently coexisting ancestors of extant taxa, probably to intermittent connections and spatial isolation (i.e. land-bridges) between the two continents (figure 2). The floristic similarity between the two landmasses is quite high (47% of taxa are shared; data not shown). This confirms that the Mediterranean Sea has biased species migration between northern Morocco and the southern Iberian Peninsula, in such a way that better dispersers ancestors have been less likely to experience vicariance through continuous gene flows, thus favouring speciation only in certain lineages with low long-distance dispersal ability . Molecular evidence further supports varying levels of the Mediterranean Sea and intermittent land-bridges as the most plausible explanation for species differentiation between southern Iberian Peninsula and northern Morocco in disparate angiosperm clades [59–62], and between other disjunct areas within the Mediterranean Basin [63–65]. These results illustrate how land-bridges between Europe and Africa have contributed to the emergence of this western Mediterranean biodiversity hotspot. Other land-bridges (e.g. the Sicilian–Tunisian bridge) may have acted in a similar way .
The predominance of the nestedness component of PBD across ecoregions between landmasses can be explained by their marked differences in PD (figure 5). Thus, the pattern of segregation of closely related taxa between the two continents would be masked by differences in PD if an undecomposed metric of PBD had been considered. This issue highlights the importance of disentangling both components of beta diversity in order to avoid overestimations of the ‘true’ spatial phylogenetic turnover .
(b) Phylogenetic structure within landmasses
Higher-than-expected phylogenetic ‘true’ turnover mainly occurred within landmasses, which implies that the overall floristic differences observed between the western and eastern ecoregions of the study area  are mainly driven by changes in the distribution of deep lineages within continents (figure 4). We also found a strong relationship between the mean phylogenetic distance within plant assemblages (SESmpd) and the precipitation-related variables that match the general decreasing precipitation gradient eastward in the study area (figure 5). This pattern can be explained largely by the differential distribution of deep angiosperm lineages along the precipitation gradient as the observed pattern weakens when considering only eudicots in the analysis. Prinzing et al.  found that a high amount of variance in niche-related traits in vascular plants can be explained at higher taxonomic levels, thereby further supporting the hypothesis of an ecological sorting of deep angiosperm lineages along the precipitation gradient in the study region. This was probably because of a certain tendency of phylogenetic niche conservatism of higher plants; that is, the tendency of taxa to retain their ancestral traits adapted to particular climates . Thus, the ecological restriction and long-term local persistence (thanks to micro-climatic refugia) of some relatively old, water-dependent lineages of non-eudicot plants (e.g. Laurus, Nuphar, Nymphaea, Triglochin, Butomus, Hydrocharis, Wolffia, Spirodela, Ceratophyllum) could have caused the observed phylogenetic overdispersion of plant assemblages in the western ecoregions. By contrast, plant assemblages in the eastern ecoregions of the study area tend to be phylogenetically clustered because of the widespread occurrence of more xeric-adapted lineages that have experienced in situ diversification, such as Helianthemum, Thymus, Sideritis, Teucrium, Arenaria and Armeria, and the absence of the non-eudicot hygrophilous taxa.
The geological materials that shape most of the Baetic and Rifan Cordilleras accreted at the southeast Iberian Peninsula and northwest Africa around 10 Ma . The accretion of such geological materials is roughly coincident with the oldest records of the Mediterranean climate in the region , and was followed by the rapid uplift of the Baetic and Rifan mountains that began around 8 Ma . These new combinations of climatic and geological conditions probably generated unoccupied environmental conditions such as contrasting substrates in eastern areas of the territory  that provided an ecological opportunity for the diversification of certain lineages, thereby contributing to phylogenetic clustering . Further, Verdú & Pausas  found evidence of in situ diversification of certain lineages in the eastern Iberian Peninsula due to the onset of the Mediterranean climate (e.g. Helianthemum, Thymus, Sideritis, Teucrium, Arenaria and Armeria). Nevertheless, future comparative studies of diversification focusing on lineages that are well represented in the Baetic–Rifan complex and beyond will help to improve our understanding of the different ecological drivers of species diversification, and unravel general mechanisms that have contributed to the emergence of Mediterranean biodiversity hotspots [73,74].
The datasets and the phylogenetic tree supporting this article are deposited in Dryad repository: http://dx.doi.org/10.5061/dryad.54137.
We declare we have no competing interests.
R.M.-V. conceived the study, designed the study, collected the data, carried out the statistical analyses and drafted the manuscript. A.A. collected the data, helped in drafting the manuscript and participated in conceiving the study. S.L. participated in data analyses and helped in drafting the manuscript. J.A. helped in drafting the manuscript and participated in conceiving the study. All authors gave final approval for publication.
This work was supported primarily by the Spanish Organismo Autónomo de Parques Nacionales through grant no. 296/2011, and secondarily by the Spanish Ministerio de Ciencia e Innovación and the Regional Government of Andalucía through grant nos. HF-2008-0040 and P09-RNM-5280.
We would like to thank the Centre for Supercomputing of Galicia (CESGA) and the Scientific Computation Centre of Andalusia (CICA) for the computing services they provided.
- Received May 15, 2015.
- Accepted June 30, 2015.
- © 2015 The Author(s)
Published by the Royal Society. All rights reserved.