The megadiverse haplochromine cichlid radiations of the East African lakes, famous examples of explosive speciation and adaptive radiation, are, according to recent studies, introgressed by different riverine lineages. This study is based on the first comprehensive mitochondrial and nuclear DNA dataset from extensive sampling of riverine haplochromine cichlids. It includes species from the lower River Congo and Angolan (River Kwanza) drainages. Reconstruction of phylogenetic hypotheses revealed the paradox of clearly discordant phylogenetic signals. Closely related mtDNA haplotypes are distributed thousands of kilometres apart and across major African watersheds, whereas some neighbouring species carry drastically divergent mtDNA haplotypes. At shallow and deep phylogenetic layers, strong signals of hybridization are attributed to the complex Late Miocene/Early Pliocene palaeohistory of African rivers. Hybridization of multiple lineages across changing watersheds shaped each of the major haplochromine radiations in lakes Tanganyika, Victoria, Malawi and the Kalahari Palaeolakes, as well as a miniature species flock in the Congo basin (River Fwa). On the basis of our results, introgression occurred not only on a spatially restricted scale, but massively over almost the whole range of the haplochromine distribution. This provides an alternative view on the origin and exceptionally high diversity of this enigmatic vertebrate group.
Cichlid fishes of the haplochromine lineage gave rise to one of the most spectacular vertebrate radiations on our planet, the megadiverse cichlid species flocks endemic to the East African Great Lakes and southern Africa [1,2]. Reconstructing robust phylogenetic relationships of haplochromine cichlids has proved to be difficult owing to limited taxon sampling and lack of phylogenetically informative characters. Until recently, phylogenetic analyses have delivered support for the monophyly of each of the major haplochromine species flocks, including the Lake Victoria superflock, southern African serranochromines and Tropheini of Lake Tanganyika [1,3–5]. The assumed monophyly of Lake Malawi haplochromines was recently falsified after inclusion of riverine haplochromines [6,7]. In general, available comprehensive analyses of haplochromines including several important riverine lineages have relied exclusively on mitochondrial DNA (mtDNA) [8,9]. Multilocus nuclear datasets are available only for a subset of haplochromine taxa, focusing on particular subgroups from the Lake Victoria region, Lake Malawi [6,7] or Lake Tanganyika. In these analyses, riverine haplochromines inhabiting different regions of the Congo basin and Angola are represented by very few Congo basin taxa, whereas those in the Upper Kwanza basin (Angola) have yet to be represented at all. This is unfortunate because it has been assumed that Lake Tanganyika, on the edge of the vast Congo basin, is the origin of haplochromine diversification. Furthermore, several studies have identified selected Congolian haplochromines as sister taxon to ‘modern haplochromine’ sublineages, or as sistergroup to members of the southern African serranochromine species flock, which is assumed to have originated in the Kalahari Palaeolakes, previously referred to as the Makgadikgadi flock [1,8,9,11].
Other studies have provided phylogenetic evidence for massive introgression and hybridization among ancient lineages in evolving species flocks [12–15], but phylogenetic tests for a potential contribution of hybridization to the evolution of haplochromine lineages remain scarce (but see [10,16]). Here, our extensive taxon sampling focuses on riverine haplochromines. It includes key Angolan species and most haplochromine species from the Congo basin. Phylogenetic hypotheses are based on more than 2000 amplified fragment length polymorphism (AFLP) loci as well as two mitochondrial genes. Critically, following Seehausen, an experimental approach enabled us to recover otherwise obscured phylogenetic signals in the reticulate dataset. It allowed us not only to decipher the phylogeographic origin of novel riverine hybrid lineages, which had formed after secondary contact of previously isolated drainages, but also to estimate their contribution to the origin and diversification of megadiverse lacustrine cichlid species flocks.
2. Material and methods
Sampling focused on an extensive and representative coverage of all major haplochromine lineages (see the electronic supplementary material, table S1) and biogeographic regions (figure 1). Tilapia bilineata and Lamprologus sp. were chosen as outgroups based on results in Schwarzer et al. The mitochondrial ND2 gene and part of the cytochrome b gene (Cytb) were amplified and sequenced for 67 taxa (48 species), using primers MET and TRP, and L14724 and H15149. AFLP genotypes were obtained for the same species (n = 68, see the electronic supplementary material, table S1). Peaks between 100 and 499 bases were scored unambiguously for presence/absence. The analysis was conducted automatically using GeneMapper v. 4.0. Eight individuals were genotyped twice to test for reproducibility. The error rate per individual was calculated as the ratio between the observed number of differences and the total number of scored fragments, resulting in a mean error rate of 0.03. Sequence data of mitochondrial Cytb and ND2 genes have been deposited in GenBank under the accession numbers JX156995–JX157126, and the AFLP matrix in the Dryad data repository (doi:10.5061/dryad.72h4m).
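The per-individual error rate described above is a simple ratio and can be sketched as follows. This is a minimal illustration rather than the script used in the study: the function names `aflp_error_rate` and `mean_error_rate`, and the encoding of replicate profiles as lists of 1 (peak present) and 0 (peak absent), are assumptions made here.

```python
def aflp_error_rate(replicate_a, replicate_b):
    """Mismatch rate between two presence/absence (1/0) profiles of one individual.

    Assumption: both profiles list the same scored fragments in the same order.
    """
    if len(replicate_a) != len(replicate_b):
        raise ValueError("replicate profiles must have the same length")
    if not replicate_a:
        raise ValueError("at least one scored fragment is required")
    # Observed number of differences between the two genotyping runs.
    differences = sum(a != b for a, b in zip(replicate_a, replicate_b))
    # Ratio of differences to the total number of scored fragments.
    return differences / len(replicate_a)


def mean_error_rate(replicate_pairs):
    """Average the per-individual error rates over all replicated individuals."""
    rates = [aflp_error_rate(a, b) for a, b in replicate_pairs]
    return sum(rates) / len(rates)
```

With eight replicated individuals, `mean_error_rate` would be called on a list of eight profile pairs.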
(a) Phylogenetic inference
Sequence alignment was conducted using the ClustalW algorithm implemented in BioEdit v. 22.214.171.124. Identification of ambiguous alignment positions was carried out using Aliscore v. 0.2 under default settings, and identified positions were manually deleted. We used a partition separating first and second codon positions from the third. The GTR + Γ model best fitted variability in codon positions one and two, whereas the HKY + Γ model was assigned to third codon positions based on results from the Bayes factor test. Bayesian analyses were performed using MrBayes v. 3.1.2, including two parallel runs of 10^6 generations each, starting with random trees and sampling trees every 1000 generations. To ensure convergence, the first 10^5 generations of each run were treated as burn-in and excluded. The remaining trees from all Bayesian analyses were used to build a 50 per cent majority rule consensus tree. A maximum-likelihood (ML) phylogenetic analysis was conducted in RAxML v. 7.0.3 using the GTR + Γ model and the rapid bootstrap algorithm, followed by a search for the best-scoring ML tree. Branch support was evaluated based on non-parametric bootstrapping (BS) consisting of 1000 pseudoreplicates. For the AFLP data, a neighbour-joining tree was calculated using TreeCon v. 1.3 based on the algorithm of Link et al., which takes shared fragments into account and ignores shared absent bands. BS values were calculated based on 1000 pseudoreplicates.
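The distance used for the AFLP tree is described above only verbally: shared fragments count, shared absences do not. A minimal sketch of a coefficient with that property is given below. It is a Jaccard-type distance written from the verbal description; whether TreeCon computes the coefficient in exactly this form is an assumption, and the names `shared_band_distance` and `distance_matrix` are illustrative.

```python
def shared_band_distance(profile_x, profile_y):
    """Distance between two 1/0 band profiles that ignores shared absences.

    Assumed form: (bands only in x + bands only in y) divided by
    (bands only in x + bands only in y + bands shared by both).
    """
    if len(profile_x) != len(profile_y):
        raise ValueError("profiles must have the same length")
    only_x = sum(1 for x, y in zip(profile_x, profile_y) if x == 1 and y == 0)
    only_y = sum(1 for x, y in zip(profile_x, profile_y) if x == 0 and y == 1)
    shared = sum(1 for x, y in zip(profile_x, profile_y) if x == 1 and y == 1)
    informative = only_x + only_y + shared
    if informative == 0:
        raise ValueError("no band is present in either profile")
    return (only_x + only_y) / informative


def distance_matrix(profiles):
    """Pairwise distances for a dict mapping taxon name -> 1/0 profile."""
    names = list(profiles)
    return {
        a: {b: 0.0 if a == b else shared_band_distance(profiles[a], profiles[b])
            for b in names}
        for a in names
    }
```

The resulting matrix is the kind of input a neighbour-joining routine expects; tree building itself is left to dedicated software.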
(b) Inferring hybrid signal
Following Seehausen, we applied a tree-based method to test for homoplasy excess in our dataset. The expectation is that the inclusion of hybrid taxa increases the conflict in the dataset and reduces support values for affected nodes in a phylogenetic tree more strongly than the inclusion of non-hybrid taxa. The exclusion of a hybrid taxon from the dataset should therefore increase support values only for affected nodes. This detection of potential hybrid signal focuses on the AFLP dataset, as hybridization cannot be reliably detected in maternally inherited mitochondrial markers. All clades showing discordant signal between the nuclear (nc) and mitochondrial (mt) trees as well as all monophyletic clades (in the ncDNA tree) were successively removed from the dataset, resulting in 86 removal experiments (see the electronic supplementary material, figure S1). Subsequently, distance trees were built for each reduced dataset with 500 bootstrap replicates using TreeCon v. 1.3. The resulting trees and BS support values were checked manually for all remaining clades. Results of the homoplasy excess test were visualized in boxplots for major phylogenetic nodes with initially low BS support values. To test for random effects on BS support, additional removal experiments were conducted with randomly chosen taxa. The number of excluded taxa matched the number of individuals that had caused an effect on node support and ranged from n = 1 to 6. For each n, the random removals were repeated 100 times. Tree generation and BS support evaluation were conducted as described earlier. A heatmap based on BS outliers was generated, representing the change in BS support values for all removal experiments over the whole dataset. Outliers were defined following Tukey as data points lying beyond 1.5 times the interquartile distance, which is displayed in boxplots as whiskers.
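Two steps of this procedure lend themselves to a short sketch: drawing the random removal sets and flagging outlying support values with Tukey's rule. The code below is an illustration under stated assumptions, not the workflow used in the study: tree building and bootstrapping are not reproduced, the function names are hypothetical, and the quartile convention (Python's `inclusive` method) is a choice made here that the text does not specify.

```python
import random
from statistics import quantiles


def random_removal_sets(taxa, n_removed, repeats=100, seed=None):
    """Draw `repeats` random sets of `n_removed` taxa to exclude from the dataset."""
    rng = random.Random(seed)  # seeded generator so that draws can be reproduced
    return [sorted(rng.sample(taxa, n_removed)) for _ in range(repeats)]


def tukey_outliers(support_values, k=1.5):
    """Return (experiment index, value) pairs outside the Tukey fences.

    A value is flagged when it lies more than k times the interquartile
    distance below the first quartile or above the third quartile.
    """
    q1, _, q3 = quantiles(support_values, n=4, method="inclusive")
    spread = k * (q3 - q1)
    lower, upper = q1 - spread, q3 + spread
    return [(i, v) for i, v in enumerate(support_values) if v < lower or v > upper]
```

For one node, `support_values` would hold the bootstrap support obtained in each removal experiment; the flagged indices identify the removals that changed support unusually strongly.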
(c) Dating and diversification rates
On the basis of the mtDNA dataset, a relaxed-clock Bayesian approach implemented in BEAST v. 1.6.2 was applied. Uniform priors were set for the split of lamprologines from haplochromines based on the 95% confidence intervals obtained by Schwarzer et al. Constraints were set from 10.6 to 20.4 Ma in run A (including the fossil prior Oreochromis lorenzoi†), and from 12.8 to 28.4 Ma in run B (without the fossil prior). The Bayesian tree was used as the starting tree, the Yule model was selected as tree prior, and an uncorrelated lognormal model was applied to estimate rate variation along branches. The analysis was run for 10 million generations, and the effective sample size was checked using Tracer v. 1.4.
3. Results
The final AFLP dataset comprised 68 taxa with 2106 AFLP loci. Of these, 1984 (1889 without outgroups) fragments were polymorphic. The ND2 dataset consisted of 1022 bp and the Cytb dataset of 405 bp (total = 1427 bp) with 399 (ND2) and 130 (Cytb) variable sites and empirical base frequencies of A = 0.26, C = 0.35, G = 0.12, T = 0.27 and A = 0.24, C = 0.30, G = 0.17, T = 0.29, respectively. The mean sequence divergence of the mitochondrial dataset was 0.086 ± 0.029. ‘Haplochromis’ snoeksi failed to amplify for one AFLP primer combination (ACT–CAT*); the missing data were coded as undefined character states. To validate this approach, the AFLP analysis was also conducted with a reduced dataset of 11 primer combinations, showing no topological differences (data not shown).
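The counts reported above can be turned into proportions of variable characters with a few lines of arithmetic. Only the counts are taken from the text; the dictionary layout and the names `REPORTED_COUNTS` and `variable_fraction` are illustrative.

```python
# Counts as reported in the text: (variable characters, total characters).
REPORTED_COUNTS = {
    "AFLP loci": (1984, 2106),
    "ND2 sites": (399, 1022),
    "Cytb sites": (130, 405),
}


def variable_fraction(variable, total):
    """Proportion of characters that are variable (polymorphic)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return variable / total


fractions = {name: variable_fraction(v, t) for name, (v, t) in REPORTED_COUNTS.items()}

# Combined mitochondrial alignment: ND2 and Cytb pooled (529 of 1427 bp).
mt_variable = REPORTED_COUNTS["ND2 sites"][0] + REPORTED_COUNTS["Cytb sites"][0]
mt_total = REPORTED_COUNTS["ND2 sites"][1] + REPORTED_COUNTS["Cytb sites"][1]
fractions["mtDNA combined"] = variable_fraction(mt_variable, mt_total)

for name, value in fractions.items():
    print(f"{name}: {value:.3f}")
```

These proportions describe character variability only; they are not the mean sequence divergence quoted in the text, whose calculation the text does not detail.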
(a) Phylogenetic hypotheses based on AFLP data and mtDNA
Analyses of AFLPs and mitochondrial genes yielded statistically highly supported phylogenetic hypotheses. In the AFLP dataset, 11 major clades are discernible (figure 2), reflecting a mostly congruent biogeographic division into an eastern, a Congolian and a southern group. ‘Haplochromis’ cf. bakongo and ‘H.’ snoeksi from lower River Congo tributaries, however, appear closer to the southern clade (figure 2, node E), rendering the Congolian clade paraphyletic. On the basis of the AFLP dataset, the single included Pseudocrenilabrus occupies a position as sistergroup to all remaining haplochromines (figure 2, BS = 99). Several, but not all, of the rheophilic haplochromines currently assigned to Orthochromis, i.e. O. cf. stormsi ‘Kinshasa’, O. stormsi ‘Kisangani’, O. polyacanthus and O. sp. aff. kalungwishiensis, form the sistergroup to the Tropheini from Lake Tanganyika, the East African clades, the Congolian clade, ‘H.’ snoeksi and ‘H.’ cf. bakongo and the southern clades (figure 2, node H, BS = 78). Members of the East African clades (figure 2, BS = 92) appear as sistergroup to the Tropheini (figure 2, BS = 99 and node F, BS = 62). On the basis of the ncDNA dataset, Astatoreochromis alluaudi and ‘H.’ burtoni occupy a sistergroup position to the remaining East African clades (BS = 92, BS = 100). On the basis of the mtDNA tree, however, A. alluaudi appears together with Pseudocrenilabrus multicolor and O. sp. aff. kalungwishiensis unresolved at the base of the East African clades and Tropheini (BS < 50), and ‘H.’ burtoni appears within the Lake Victoria superflock (BS/Bayesian posterior probability (BPP) = 51/0.64). Lake Malawi haplochromines form a monophyletic group in both trees (BS = 100, BS/BPP = 81/0.93). Interestingly, the recently discovered ‘H.’ sp. ‘Yaekama’ falls into the Lake Victoria superflock clade based on both trees (figure 2, BS = 100 and 100/1.0) even though it is distributed in the northeastern River Congo drainage (near Kisangani).
The East African clades and Tropheini (figure 2, node F, BS = 62) appear as sistergroup to the Congolian clade (BS = 100), ‘H.’ snoeksi and ‘H.’ cf. bakongo (BS = 76) and the southern clades (BS = 60) based on the ncDNA dataset. The integrity of the southern clades is only weakly supported, as is the sistergroup relationship of ‘H.’ cf. bakongo/‘H.’ snoeksi and the southern African clades (figure 2, node E, BS = 60). Within the Congolian clade, the Pool Malebo and central Congo basin ‘H.’ polli and ‘H.’ oligacanthus and the three lower River Congo rapids species ‘H.’ fasciatus, ‘H.’ demeusii and ‘H.’ sp. ‘Sanzikwa’ appear monophyletic (BS = 100). Within the southern clades, all included species from rivers Fwa, Kasai and Kwango (all Kasai drainage, Congo basin) form a monophyletic group (BS = 79) that is sistergroup (BS = 60) to species from the Angolan River Kwanza system and another rheophilic species, O. torrenticola (figure 2, BS = 83). Except for the smaller O. torrenticola and the Serranochromis sp. ‘red scales’ clades, none of the major clades supported by the ncDNA phylogeny is recovered in the tree based on mtDNA (figure 2).
Two well-supported, but geographically heterogeneous clades are recovered in the mtDNA phylogeny. One is composed of members of the East African clades and ‘H.’ fasciatus/‘H.’ demeusii and ‘H.’ sp. ‘Sanzikwa’ from the lower River Congo (BS/BPP = 77/0.51) with the Tropheini as sistergroup (BS/BPP = 79/0.96) and the other one of ‘H.’ snoeksi/‘H.’ cf. bakongo and species from River Fwa (BS/BPP = 95/1.0) forming the sistergroup to the Angolan Serranochromis sp. ‘red scales’, O. stormsi/O. cf. stormsi ‘Kisangani’ and O. polyacanthus, members of the southern clades and the Congolian clade composed of ‘H.’ polli and ‘H.’ oligacanthus (BS/BPP = 98/1.0). ‘Haplochromis’ polli and ‘H.’ oligacanthus appear as sistergroup to species from southern African rivers Kwango, Kasai, upper River Kwanza, and the serranochromine radiation in the Kalahari Palaeolakes (BS/BPP = 68/0.99, figure 2).
(b) Dating and diversification rates
Owing to extensive reticulate signal in the dataset, age estimates based on mtDNA do not serve as an estimator of the actual age of haplochromine radiations or species flocks, but rather represent the age of mitochondrial diversity and the timing of mitochondrial introgression events. Age estimates for the five major introgression events within the haplochromines (indicated by dotted lines in figure 2) ranged between 2.4 and 11.33 Ma, or between 2.8 and 15.2 Ma, depending on calibration priors (with or without fossil prior; electronic supplementary material, table S2). The obtained time frames largely matched those from previously published studies, except for the age of the Lake Victoria superflock, which appears older in our dataset (see the electronic supplementary material, table S2). The lack of calibration points for terminal nodes and the limited taxon sampling for the Lake Victoria superflock probably caused this uncertainty.
(c) Cytonuclear discordance and homoplasy excess test
Cytonuclear discordances indicating hybridization events are present throughout the whole haplochromine phylogeny. Major discrepancies between the mtDNA and AFLP (ncDNA) phylogenetic hypotheses are as follows. (i) The Congolian species (without ‘H.’ snoeksi and ‘H.’ cf. bakongo) appear monophyletic in the tree based on ncDNA (figure 2), but based on the mtDNA data, the two Congolian subclades (‘H.’ polli/‘H.’ oligacanthus and ‘H.’ fasciatus/‘H.’ demeusii/‘H.’ sp. ‘Sanzikwa’) are deeply nested within southern clade species or East African clades, as sistergroup to members of the Lake Victoria superflock. ‘Haplochromis’ snoeksi and ‘H.’ cf. bakongo form a sistergroup to the southern clades based on ncDNA data, but in the tree based on mtDNA, they are sistergroup to species from the River Fwa (Schwetzochromis neodon, ‘H.’ callichromus, ‘H.’ brauschi and Cyclopharynx schwetzi; figure 2). Within the River Fwa species, ‘H.’ brauschi appears as sister to C. schwetzi and ‘H.’ callichromus based on the ncDNA data (BS = 86), but to C. schwetzi based on mtDNA data (BS/BPP = 100/1.0). (ii) The rheophilic haplochromines O. stormsi ‘Kinshasa’/O. stormsi ‘Kisangani’, O. polyacanthus and O. sp. aff. kalungwishiensis are monophyletic based on the ncDNA dataset (BS = 94). Based on the mtDNA data, however, O. stormsi ‘Kinshasa’/O. stormsi ‘Kisangani’ and O. polyacanthus are nested within a clade of southern species and the ‘H.’ polli/‘H.’ oligacanthus clade (BS/BPP = 81/0.95), and O. sp. aff. kalungwishiensis appears as sistergroup to P. multicolor (BS/BPP = 95/1.0). (iii) Astatoreochromis alluaudi forms the sistergroup to the remaining species from the East African clades based on ncDNA data (BS = 92), but remains unresolved at a basal position based on the mtDNA dataset. Further cytonuclear discordances within the East African clades are present concerning the position of ‘Haplochromis’ flaviijosephi and ‘H.’ burtoni.
‘Haplochromis’ flaviijosephi is sister to members of the Lake Victoria superflock based on ncDNA (BS = 95), but sister to the Lake Malawi clade based on the mtDNA dataset (BS/BPP = 66/0.57). ‘Haplochromis’ burtoni appears as sister to the remaining East African clades (without A. alluaudi) based on the ncDNA (BS = 92), but is nested in a clade composed of the Lake Victoria superflock and ‘H.’ desfontainii based on the mtDNA dataset (BS/BPP = 64/0.95). (iv) Within the southern clade, several discordances are obvious concerning within-group relationships (figure 2). Serranochromis robustus clusters within the upper River Kwanza/Kalahari Palaeolakes clade based on ncDNA data (BS = 100), but not so based on mtDNA data (figure 2), where it appears in a clade with rivers Kwango/Kasai and River Kwanza/Kalahari Palaeolakes species (BS/BPP = < 50/0.76).
In 86 removal experiments, effects on different BS values are evident across the whole AFLP haplochromine phylogeny (figures 3 and 4). Strong effects with more than 50 per cent increase or decrease based on the mean BS value exist concerning the support values of five nodes (9, C, D, 70 and F, figure 3; see the electronic supplementary material, figure S1) and medium effects (25% increase or decrease of BS) are present on six additional nodes (A, 8, B, E, 42 and G, figure 3; see the electronic supplementary material, figure S1). Removals causing an increase in BS values (indicating a decrease of homoplastic signal in the dataset) are caused mainly by members of the following subclades: upper River Kwanza/Kalahari Palaeolakes, S. sp. ‘red scales’, O. torrenticola, ‘H.’ snoeksi/‘H.’ cf. bakongo and River Congo clades, Tropheini, Lake Victoria superflock, riverine haplochromines and P. multicolor (figures 2 and 3). Boxplots were generated for BS values of nodes with an initially low support in the AFLP tree (indicated by letters A–H in figure 2). The node support for the Kalahari Palaeolakes species flock increases when two species from the flock (Serranochromis altus and S. angusticeps) are removed from the dataset (node A, BS = 58–86, figure 4). Support for node B, comprising species from the Angolan River Kwanza and Kalahari Palaeolakes, increases when O. torrenticola or S. sp. ‘red scales’ are removed (node B, BS = 74–96 or 97, respectively, figure 4). Effects on node support values of species from River Fwa are present when ‘H.’ brauschi or the sistergroup to the River Fwa clade from rivers Kasai and Kwango is removed (node C, BS = 60–90 or 95, figure 4). An exclusion of ‘H.’ cf. bakongo and ‘H.’ snoeksi entails an increase in BS for node D comprising the southern African clades (BS = 60–89). Exclusion of ‘H.’ polli and ‘H.’ oligacanthus increases the BS of node E, comprising the southern African clades and ‘H.’ cf. bakongo/‘H.’ snoeksi (BS = 60–81), and the removal of the Tropheini member T. moorii (but not ‘H.’ horei) and of the rheophilic clade leads to an increase in BS for the East African clades and Tropheini (node F, BS = 62–80 or 90, figure 4). The BS for the node comprising all haplochromines excluding the rheophilic haplochromines and Pseudocrenilabrus increases when all rheophilic haplochromines, O. sp. aff. kalungwishiensis or P. multicolor are removed (node G, BS = 54–82, 96 or 99, respectively, figure 4).
4. Discussion
Recent molecular studies indicate significant introgression, through hybridization, of riverine haplochromines into the Lake Malawi species flock [6,7]. The inclusion of additional riverine haplochromines covering almost the whole range of their distribution in this study, however, highlights for the first time, to our knowledge, the large extent to which hybridization has shaped the evolution of haplochromines. Nuclear data, based on more than 2000 AFLP markers, reflect close relationships of geographically adjacent haplochromine species and clades (figures 1 and 2), whereas the mtDNA data yield clearly conflicting phylogenetic signal. Combining phylogenetic information of both datasets, each reflecting different parts of evolutionary history, with analytical approaches targeted to unravel hybridization has proved necessary to decipher the complex evolution of haplochromine cichlids [5–7]. Theoretically, cytonuclear discordance can also be explained by ancient shared polymorphisms as a result of incomplete lineage sorting. Two observations argue strongly against incomplete lineage sorting in the present case: the unequal spatial distribution of well-separated mtDNA haplotypes, and evidence from removal experiments based on AFLP data, which show that homoplasy excess is induced by single species or clades and not randomly (figures 3 and 4).
(a) Gene flow patterns can be explained through landscape evolution
The age of haplochromine cichlid diversification is a highly debated topic [8,35,36]. The lack of reliable fossil and geological calibration points and the high amount of reticulate signal within the haplochromines make it difficult to calculate robust age estimates. Here, we have used constraints lying outside the haplochromine cichlid radiations to estimate the timing of mitochondrial introgression events within the radiation. All major introgression events were constrained in the Late Miocene/Early Pliocene (2.4–11.3 Myr ago or 2.8–15.2 Myr ago, depending on calibration priors; see the electronic supplementary material, table S2). They coincided with pronounced Neogene tectonics that reshaped drainage systems across Central and East Africa, with focal impacts across the northern Kalahari Plateau, especially the Congo–Zambezi watershed and the Eastern African Rift System. This episode of landscape evolution across the core area of ‘explosive’ haplochromine diversification, including the Albertine Rift [37–39], may be of focal relevance to understand the evolution of the lacustrine species flocks. New links between ancient watersheds connected previously isolated cichlid lineages, as tectonic activity caused river captures across the region, e.g. the Chambeshi/upper Kafue disruption. Our results identify several rivers as foci of drainage evolution, specifically river captures, previously not recorded for the Kalahari Plateau [37–39]. Phylogeographic signatures of ancient hybridization testify to these geomorphic events, and shed new light on the role of landscape evolution as one determinant of haplochromine megadiversity: (i) phylogenetically closely related mtDNA haplotypes are shared across present-day separated drainage systems, i.e. across upper Kwanza–Okavango and Congo–Zambezi watersheds (figure 2); (ii) close phylogenetic relationships based on mtDNA are evident among haplotypes from ‘Haplochromis’ species of rivers Fwa, Inkisi and Kwilu; and (iii) putative hybrid taxa linking the southern clade with the Congolian lineage formed at geographically intermediate locations in the Lufira (O. torrenticola) and the upper Lucalla River (Serranochromis sp. ‘red scales’).
Our results point to landscape dynamics as the prominent control of cichlid evolution. The events that reshaped drainage topology provided multiple opportunities for accumulation of lineage-specific alleles leading to phenotypic divergence, and for repeated reconnection of discrete lineages in a dynamically changing landscape. It appears that these events have led to substantially increased levels of novel phenotypes in haplochromine species flocks inhabiting the southern Congo basin and the northern Kalahari Plateau. The molecular dates on nodes in this expanded haplochromine phylogeny have the potential to complement recent geobiological studies [37,39,40]. Generally, node age estimates of the evolutionary events in aquatic biota set constraints on when formative events reshaped river topologies. Specifically, the geobiological evidence for cichlid evolution presented here sets hitherto unavailable chronological constraints, in the Late Neogene, on when tectonism reshaped the Southern Equatorial divide.
(b) Long-distance dispersal of East African haplotypes
A sistergroup relationship of mtDNA haplotypes is evident between East African haplochromines and a small group of narrow endemics, today confined within the lower River Congo rapids (figure 2). Interestingly, ‘Haplochromis’ species of the lower River Congo rapids are parapatrically distributed, corresponding to a trisection of the river stretch in upstream (‘H.’ polli), central (‘H.’ demeusii) and downstream (‘H.’ fasciatus) reaches of the lower River Congo. The distribution of ‘H.’ polli expands further upstream into the central Congo basin (figure 1). Surprisingly, ‘H.’ fasciatus/‘H.’ demeusii carry haplotypes closely related to those of the East African haplochromines, whereas mt-haplotypes of their directly neighbouring upstream relatives (according to ncDNA data) ‘H.’ polli and ‘H.’ oligacanthus are rather distinct (figure 2). There is no trace of introgression (indicated by homoplasy excess) detectable in the ncDNA dataset (figure 3). A complete replacement of mtDNA haplotypes can occur without any evidence of nuclear introgression, as shown for salmonids, mountain hare, green pond frogs as well as for cichlids [41–45]. The mtDNA haplotype can become fixed by chance (drift) or by positive selection. As there are no obvious indications for a selective advantage, potential drift through spatial isolation of the downstream lower River Congo species, as indicated by Schwarzer et al., can explain the retention of the ancient, eastern mtDNA haplotypes. A connection of the River Congo with eastern African drainage systems (including the Lake Victoria basin and other rift lakes) is indicated by the recently detected occurrence of ‘modern’ haplochromines in the eastern River Congo drainage (J. Schwarzer & U. K. Schliewen 2009, personal observation): ‘H.’ sp. ‘Yaekama’, from the upper River Congo near Kisangani, for example, carries ocellated egg-spots typical of East African ‘modern’ haplochromines and clusters with members of the Lake Victoria superflock based on our mtDNA and ncDNA datasets (figure 2). Interestingly, Verheyen et al. identified the mitochondrial sistergroup to the remaining Lake Victoria region modern haplochromines in Lake Kivu, a lake that is geographically intermediate between the Lake Victoria region and the central River Congo area around Yaekama. This ichthyogeographical connection is supported by closely related non-cichlid fish species occurring both on the east and on the west side of the northern Albertine Rift, i.e. poeciliids of the genus Hypsopanchax, and points to an area of connection between the Nile and the Congo basin drainages around the upper Ituri drainage (a northeastern River Congo affluent) and the Nilotic Lake Kivu/Lake Edward region.
(c) Lacustrine origin of haplochromine species diversity?
Hybridization between distantly related lineages can increase genetic and phenotypic potential, and thus favour the onset of rapid adaptive radiations. Empirical evidence comes from different studies in animals and plants [49–51], and initial hybridization is hypothesized to have shaped the cichlid radiations of the Kalahari Palaeolakes and Lake Malawi. A close relationship of the geographically proximate upper River Kwanza species and the Kalahari Palaeolakes haplochromines is supported by our data (figure 2). The occurrence of closely related mtDNA haplotypes of upper River Kwanza and Kalahari Palaeolakes haplochromines (figure 2), as well as excess homoplasy in the Palaeolakes species flock (figure 4), supports the previously hypothesized hybrid origin of this young lake radiation. Our data further suggest that lineages of the upper River Kwanza, upper River Zambezi and River Congo contributed to the hybrid swarm origin of the proposed Kalahari Palaeolakes radiation.
River Fwa, represented by four of five existing species in our dataset, appears to harbour a monophyletic species flock (though with low BS support). On the basis of ncDNA, the species of River Fwa are closely related to neighbouring species in rivers Kwango and Kasai (figure 2), but based on mtDNA they appear closely related to ‘H.’ cf. bakongo and ‘H.’ snoeksi from the lower River Congo system. Controlling for reticulation by removing species from rivers Kasai, Kwango and Fwa unequivocally increased BS for River Fwa monophyly, especially when removing the River Fwa ‘H.’ brauschi (figure 4, node C). Such evidence for reticulate (‘homoplastic’) genetic signals reveals how the hybrid origin of the radiation was seeded by distantly related Congolian lineages that thereafter became geographically isolated.
Our results clearly show that hybridization among both ancient (Orthochromis, Pseudocrenilabrus and Astatoreochromis) and recent (lower River Congo ‘Haplochromis’) riverine lineages qualifies as a fundamental event in the evolution of haplochromine radiations. This does not refute the assumed lacustrine origin of the East African clades, but underscores a much more complex history for the megadiverse haplochromine radiations. It is important to acknowledge how alternating wet and dry climates over the Late Cenozoic affected the entire East African region, as shown from sediment cores of its lakes [52–55]. These events probably complemented more widespread impacts of the Neogene tectonism across southern and east Africa, where uplift of the Kalahari Plateau [37,38,56] transfigured the drainage, exemplified in the incision of the lower River Congo rapids into Africa's western margin [40,57]. The critical consequence of these impacts on upper River Congo affluents was to reshape haplochromine distributions, thereby facilitating extensive hybridization between previously separated lineages.
Hybridization among even geographically distant haplochromine lineages has had a major influence on the evolution of haplochromines. Our results demonstrate trans-watershed dispersal of mtDNA haplotypes within haplochromines as well as strong, multiple signals of introgression and potential hybrid speciation, thereby questioning simplistic assumptions about the evolution of some of the major haplochromine lineages. Our results call for more thorough phylogenetic analyses, including tests of reticulate evolution on the basis of fully representative taxon sampling of both riverine and lacustrine haplochromine lineages.
This work was supported by grants of the Deutsche Forschungsgemeinschaft to B.M. (DFG MI649/8-1) and U.K.S. (SCHL567/4-1) and a graduate student grant of the University of Bonn as well as a travel grant from the supporters of the ZFMK to J.Sch. We gratefully thank E. Schraml, O. Seehausen and L. Rüber for providing tissues. We thank R. Schelly, D. Neumann, J. G. Frommen, P. Alibert, A. Dunz, M. Levy, V. Mamonekene, D. Tweddle, A. Ibala-Zamba, J. Punga, U. Ali-Patho, P. Mongindo, C. Danadu, F. Bapeamoni and T. Kadangé Ngongo for indispensable assistance in the field. Special thanks go to K. Langen, F. Eppler and B. Müller who helped with laboratory routines. The Angolan samples came from surveys that were coordinated by D. Neto, conducted by staff from SAIAB and INIP, and funded by the National Research Foundation (South Africa). We thank two anonymous referees and R. Peters for valuable comments on an earlier version of the manuscript.
- Received July 18, 2012.
- Accepted August 16, 2012.
- This journal is © 2012 The Royal Society