The Palenque, a black community in rural Colombia, have an oral history of fugitive African slaves founding a free village near Cartagena in the seventeenth century. Recently, linguists have identified some 200 words in regular use that originate in a Kikongo language, with Yombe, mainly spoken in the Congo region, being the most likely source. The non-recombining portion of the Y chromosome (NRY) and mitochondrial DNA were analysed to establish whether there was greater similarity between present-day members of the Palenque and Yombe than between the Palenque and 42 other African groups (for all individuals, n = 2799) from which forced slaves might have been taken. NRY data are consistent with the linguistic evidence that Yombe is the most likely group from which the original male settlers of Palenque came. Mitochondrial DNA data suggested substantial maternal sub-Saharan African ancestry and a strong founder effect but did not associate Palenque with any particular African group. In addition, based on cultural data including inhabitants' claims of linguistic differences, it has been hypothesized that the two districts of the village (Abajo and Arriba) have different origins, with Arriba founded by men originating in Congo and Abajo by those born in Colombia. Although significant genetic structuring distinguished the two from each other, no supporting evidence for this hypothesis was found.
In many locations throughout the Caribbean and Latin America during the Atlantic slave trade, runaway slaves in pursuit of freedom established fortified villages. In Colombia, these walled towns (known as palenques) were famed for their resistance to the Spanish military conquest. This reputation is evident from colonial records, which tell how inhabitants successfully repulsed attacks by the authorities. Despite this resistance, Palenque de San Basilio is the only palenque to have survived to the present day.
Palenque de San Basilio (Palenque for short) is located some 70 km south-east of Cartagena, the regional capital of Bolivar, in north-west Colombia (10.1° N, 75.2° W) (figure 1). The residents comprise a community of about 3500 individuals divided between two major districts, Arriba and Abajo, although the reason for the division is not established [2,3]. They have remained largely isolated from the prevailing Colombian culture, living by subsistence farming together with cattle husbandry. Their oral history is one of descent from a group of male slaves who escaped captivity early in the seventeenth century from nearby Cartagena (then a major centre of the Latin American slave trade).
Interestingly, Palenque is the only Colombian black community that speaks a creole Spanish known as Palenquero. Linguistic analysis of this creole led to the suggestion that the language of the founding group originated in the area of present-day Congo and/or northern Angola [4,7]. More recently, detailed lexical research has established that Palenquero contains more than 200 words of African origin [8,9] and that Kikongo is the only demonstrable donor of the vocabulary. The Kikongo group of languages encompasses several extant tongues which are spoken by approximately 1 million people in the Republic of the Congo. Although the recorded vocabulary in Palenquero does not suggest a particular origin for it, the ritual vocabulary and oral history suggest that Yombe is the most probable source of this language. Today Yombe is spoken by the Yombe people, an ethnic group living mainly in Pointe-Noire (Republic of the Congo). Furthermore, many members of the Palenque community have claimed that (i) Arriba residents are more traditional and have better conserved Palenquero than their Abajo counterparts; and (ii) the founding men of Arriba were born in Congo while Abajo was populated by Maroons born in Colombia (Y. Moñino, field work in Palenque, 2012 and ).
To clarify these questions concerning Palenque history, we undertook a genetic analysis of individuals from both Arriba and Abajo. DNA analysis has proved useful in revealing origins of ethnic communities (e.g. [13–15]). Sex-specific genetic systems (the non-recombining portion of the Y chromosome (NRY) and mitochondrial DNA (mtDNA)) have been analysed to reveal connections between geographically separated diaspora communities sharing a common identity [15–17] and to evaluate support for alternative oral histories. Recently, the geographic distribution of NRY haplotypes and time to the most recent common ancestor of paternal haplogroups were interpreted as suggesting a late, exclusively eastern, expansion of the Bantu-speaking people (EBSP).
The geographic origins of African-derived populations, in particular those created by the Atlantic slave trade, have been investigated using NRY and mtDNA. In studies of the populations of Cape Verde Islands [20,21] and Sao Tome Island [22,23], sex-specific genetic systems were used to elucidate both maternal and paternal origins. In the case of the Palenque, analyses of HLA autosomal markers and antigens [24,25] and, recently, of NRY variation have suggested a greater proportion of recent African descent (RAD) than in other Colombian groups.
Although culturally and geographically isolated for most of its existence, during the past few decades, the Palenque people have experienced more contact with those from outside their group. Therefore, in recent times, an increased level of gene flow may have occurred. Given the substantial geographic structuring of NRY and mtDNA haplotypes at the continental level, and assuming that the founding group was of RAD, genetic analysis can provide evidence of geographic ancestry and potential gene flow from non-RAD groups. If the male founders were of RAD and there has been little gene flow from Europeans and Amerindians, it can then be expected that NRY haplotypes will match those common in sub-Saharan Africa and will have low diversity.
Palenque oral tradition provides a testable hypothesis for NRY but not mtDNA variation. However, from colonial records, it appears that in the second half of the eighteenth century 178 black families occupied Palenque. Therefore, it can be hypothesized that the majority of the females at that time had RAD. If there has been little female gene flow since that time, then the expectation is that mtDNA haplotypes will match those commonly seen in sub-Saharan Africa.
Sub-Saharan Africa is known for its relatively high human genetic diversity [27,28], and geographic structuring of mtDNA haplotypes has been recognized. Furthermore, the considerable increase in NRY polymorphic sites identified in recent years [30,31] has revealed geographic structuring of NRY haplotypes [19,32]. These findings have made it possible, in some cases, to reveal recent shared paternal descent of men with a RAD born outside sub-Saharan Africa with men still living there [21,23,26].
To explore these questions about Palenque history based on anthropological and linguistic studies, we analysed NRY and mtDNA in the Palenque and 42 sub-Saharan groups. We address the following three questions: (a) Is there greater genetic similarity between the inhabitants of Palenque and Yombe speakers than between Palenque and non-Yombe African groups? (b) Is there a significant difference between the sex-specific genetic system profiles of present-day residents of Abajo and Arriba? (c) Are the NRY and mtDNA of the Arriba inhabitants more similar to those of the Republic of the Congo than are the NRY and mtDNA of Abajo residents? Genetic data analysed in this paper support the prior hypotheses that (a) the Palenque have a paternal line founding origin in the Yombe and (b) there is a significant difference between Abajo and Arriba in NRY distribution, but not in mtDNA. There is limited NRY but not mtDNA support for an affirmative answer to (c).
2. Material and methods
(a) Sample collection
In Palenque, buccal swabs were collected from males over 18 years old currently living in, or born in, the community. Donors were initially selected randomly but after questioning, only one of each set of donors having a common paternal grandfather was included in the study. Samples were collected from a very substantial proportion of individuals satisfying the above criterion (estimated at greater than 90%, n = 153: Abajo area n = 88, Arriba area n = 52, others n = 13). Samples from eight groups in the Republic of the Congo (n = 591) were collected at local gatherings in different areas of Brazzaville, Pointe-Noire and in the villages of Kakamoeka and Lovoulou, 90 km and 70 km inland from Pointe-Noire, respectively.
Ethnographic data were gathered from each Palenque individual, adopting the procedure reported in Ansari-Pour et al. Buccal swabs previously collected from 34 sub-Saharan groups in West, Central West and South East Africa, representing other potential source populations for the Atlantic slave trade, were also analysed in this study (see electronic supplementary material, table S1; samples from all population groups other than Palenque were included in Ansari-Pour et al.). DNA from all Congolese and Palenque samples was extracted using the Gentra protein precipitation method (Gentra Systems, Minneapolis) while the standard phenol–chloroform method was used for all other samples.
(b) DNA typing
The battery of Y-chromosome presumed unique event polymorphisms (UEPs), consisting of single nucleotide polymorphisms and insertion/deletion polymorphisms, as well as a set of short tandem repeats (STRs), were typed in the Palenque samples as described by Ansari-Pour et al. Briefly, (i) 16 UEPs (see figure 2) were used to classify NRY into haplogroups, applying the nomenclature of the Y Chromosome Consortium with the ‘capital letter–mutation’ system, and within each haplogroup, (ii) six STRs (DYS19, DYS388, DYS390, DYS391, DYS392 and DYS393) were used to define haplotypes. Equivalent Y chromosome data for all 42 sub-Saharan African population samples, including the eight Congolese groups (see electronic supplementary material, table S1), were taken from Ansari-Pour et al.
The mtDNA HVR-1 region of all Congolese groups and Palenque was sequenced as described by Veeramah et al. For all samples, HVR-1 variable-site-only haplotypes were determined by comparing sequences of nucleotide range 16 020–16 400 with the revised Cambridge Reference Sequence. Haplotypes were defined by substitutions, insertions and deletions, and their corresponding nucleotide positions. Tentative mtDNA haplogroup assignments, based on HVR-1 sequences, were inferred according to the scheme of Salas et al., although it should be noted that inferred haplogroup frequencies were not used in our statistical analyses and are only presented for reference. To extend the mtDNA dataset, HVR-1 haplotypes were also determined for 30 out of 34 non-Congolese sub-Saharan population samples considered in the NRY analyses (i.e. all groups except Sena, Tumbuka, Bantu speakers from Pretoria and Yao; unpublished data, MG Thomas 2010 except for the Nigerian groups). To facilitate comparison of all population samples, the range of the HVR-1 region considered was reduced to 16 023–16 380.
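As a rough illustration of how a variable-site-only haplotype can be derived, the following Python sketch compares a sample with a reference and then restricts the result to a narrower range. It is not the authors' pipeline: the function names are hypothetical, the 10-base reference is a toy string rather than the real revised Cambridge Reference Sequence, and only substitutions in pre-aligned sequences are handled (insertions and deletions would need a proper alignment step).

```python
def variable_sites(sequence, reference, start=16020):
    """Return substitutions relative to the reference as labels like '16294T'.

    Assumes both strings are already aligned and of equal length, with the
    first character sitting at nucleotide position `start`.
    """
    if len(sequence) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{start + offset}{base}"
        for offset, (base, ref_base) in enumerate(zip(sequence, reference))
        if base != ref_base
    ]


def restrict_range(sites, low=16023, high=16380):
    """Keep only variable sites whose position lies within [low, high]."""
    return [site for site in sites if low <= int(site[:-1]) <= high]


# Toy data: a made-up 10-base "reference" starting at position 16020.
toy_reference = "ACGTACGTAC"
toy_sample = "ACTTACGTAT"

sites = variable_sites(toy_sample, toy_reference)
trimmed = restrict_range(sites)
```

In this toy case the sample differs at positions 16022 and 16029, and only the second survives the restriction to 16 023–16 380, which mirrors how reducing the range can merge haplotypes that differ only outside it.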
(c) Statistical analysis
Pairwise genetic differences between population samples were assessed using the exact test of population differentiation (ETPD), which is analogous to Fisher's exact test extended to an m × n matrix, where m is the number of groups and n is the number of distinct haplotypes. Gene diversity and its standard error were estimated using the unbiased formula of Nei. Genetic distances calculated were: FST based on UEP haplogroups, STR haplotypes (respecting their classification within haplogroups, i.e. UEP + STR haplotypes), and mtDNA HVR-1 haplotypes and imputed haplogroups; RST based on six STRs on the NRY; and Kimura's two-parameter (K2P) model with a γ-value of 0.47 for mtDNA HVR-1 sequences. It should be noted that FST is used here as in Thomas et al. as a convenient statistic summarizing multidimensional differences in allele frequencies. No further assumptions regarding the underlying population genetic model were applied in its interpretation, other than a monotonic relationship between FST and genetic differences.
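For readers unfamiliar with the statistic, Nei's unbiased gene diversity for a sample of n chromosomes with haplotype frequencies p_i is n/(n − 1) × (1 − Σ p_i²). The Python sketch below computes that point estimate only; the standard error reported in the paper comes from Arlequin and is not reproduced here. The function name and the toy sample are illustrative assumptions, not data from the study.

```python
from collections import Counter


def gene_diversity(haplotypes):
    """Nei's unbiased gene diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    counts = Counter(haplotypes)
    sum_squared = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_squared)


# Hypothetical sample of 10 chromosomes (labels borrowed from the text,
# counts invented for illustration).
toy_sample = ["E-U175"] * 4 + ["E-U290"] * 3 + ["P-92R7"] * 3
toy_diversity = gene_diversity(toy_sample)  # 10/9 * (1 - 0.34), about 0.733
```

The n/(n − 1) factor corrects the downward bias of the plain 1 − Σ p_i² estimate in small samples, so a sample in which every chromosome is different returns exactly 1.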
Analyses based on a selection of UEPs may suffer from biases in their ascertainment. However, given the geographic structuring of the NRY variation, the choice of UEPs is appropriate for the comparisons of sub-Saharan and RAD individuals. To test if genetic distances differed significantly from zero, haplogroups/haplotypes were permuted among samples; 1000 permutations were performed to generate a null distribution of pairwise genetic distances.
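The permutation procedure can be sketched as follows. This Python example is a simplified stand-in, not Arlequin's implementation: it uses a basic haplotype-frequency FST, (H_T − H_S)/H_T, rather than the AMOVA-based estimator, and the function names, seed and toy samples are assumptions made for illustration.

```python
import random
from collections import Counter


def heterozygosity(haplotypes):
    """Plain (uncorrected) diversity: 1 - sum of squared haplotype frequencies."""
    n = len(haplotypes)
    return 1.0 - sum((c / n) ** 2 for c in Counter(haplotypes).values())


def simple_fst(sample_a, sample_b):
    """Simplified pairwise FST = (H_T - H_S) / H_T from haplotype frequencies."""
    pooled = list(sample_a) + list(sample_b)
    h_total = heterozygosity(pooled)
    if h_total == 0:
        return 0.0
    n_a, n_b = len(sample_a), len(sample_b)
    h_within = (n_a * heterozygosity(sample_a) + n_b * heterozygosity(sample_b)) / (n_a + n_b)
    return (h_total - h_within) / h_total


def permutation_p_value(sample_a, sample_b, n_permutations=1000, seed=0):
    """Shuffle haplotypes between the two samples to build a null distribution."""
    rng = random.Random(seed)
    observed = simple_fst(sample_a, sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if simple_fst(pooled[:n_a], pooled[n_a:]) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_permutations


# Toy samples: two groups sharing no haplotypes, and two identical groups.
fst_distinct, p_distinct = permutation_p_value(["A"] * 20, ["B"] * 20)
fst_same, p_same = permutation_p_value(["A"] * 10 + ["B"] * 10, ["A"] * 10 + ["B"] * 10)
```

The p-value is the fraction of permuted datasets whose distance is at least as large as the observed one, so its resolution is limited by the number of permutations performed.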
All the above analyses were performed using Arlequin software v. 3.0. Principal component analysis (PCA) and the non-parametric ‘Sign test’ were performed using the ‘R’ statistical programming language (www.R-project.org), using the ‘princomp’ and ‘binom.test’ functions, respectively. PCA plots were used to visualize relationships among population samples based on NRY haplogroup frequencies.
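To show what a PCA of haplogroup frequencies involves, here is a dependency-free Python sketch. It is only an approximation of what R's ‘princomp’ does: it centres the columns, forms the covariance matrix with divisor n, and extracts leading components by power iteration instead of a full eigendecomposition. The function names and the frequency table are hypothetical, and the method may converge poorly when two components have nearly equal variance.

```python
def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def _project_out(vector, basis):
    """Remove from `vector` its components along each unit vector in `basis`."""
    for b in basis:
        weight = _dot(vector, b)
        vector = [x - weight * y for x, y in zip(vector, b)]
    return vector


def _unit(vector):
    norm = _dot(vector, vector) ** 0.5
    return [x / norm for x in vector] if norm > 0 else vector


def pca(rows, n_components=2, iterations=500):
    """Return (scores, variances) for the leading principal components."""
    n, dim = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centred = [[x - m for x, m in zip(row, means)] for row in rows]
    cov = [[sum(r[i] * r[j] for r in centred) / n for j in range(dim)]
           for i in range(dim)]
    axes, variances = [], []
    for _ in range(n_components):
        # Arbitrary non-zero start, made orthogonal to the axes already found.
        vector = _unit(_project_out([1.0 / (k + 1) for k in range(dim)], axes))
        for _ in range(iterations):
            product = _project_out([_dot(row, vector) for row in cov], axes)
            if _dot(product, product) ** 0.5 < 1e-12:
                break  # no variance left in the remaining directions
            vector = _unit(product)
        variances.append(_dot(vector, [_dot(row, vector) for row in cov]))
        axes.append(vector)
    scores = [[_dot(row, axis) for axis in axes] for row in centred]
    return scores, variances


# Hypothetical haplogroup frequencies (rows: populations, columns: haplogroups).
toy_frequencies = {
    "PopA": [0.50, 0.30, 0.20],
    "PopB": [0.45, 0.35, 0.20],
    "PopC": [0.10, 0.20, 0.70],
    "PopD": [0.15, 0.15, 0.70],
}
scores, variances = pca(list(toy_frequencies.values()))
```

Plotting the first two columns of `scores` gives the kind of two-dimensional view in which populations with similar haplogroup profiles fall close together.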
3. Results
(a) Frequencies of NRY haplogroups and NRY-based genetic distances
The frequencies of all observed NRY haplogroups in the Palenque and the 42 sub-Saharan groups analysed in this study are included in table 1. The phylogenetic relationships of the haplogroups can be seen in figure 2. Thirteen NRY haplogroups were observed, of which 10 were present in the Palenque. The modal haplogroup in the Palenque was E-U175 (27%), but the two districts of the village had different modal haplogroups (details below). Notably, there were only three haplogroups present in sub-Saharan African groups not observed in the Palenque dataset: DE-YAP, which is found at very low frequency in Nigeria; A-M13, which forms a very basal clade in the NRY phylogeny and has a wide distribution at low frequency in Africa [44–46]; and E-U181, which has been proposed as a signature of an exclusively eastern EBSP. P-92R7 and R1a1, both widely considered to be ‘non-African origin haplogroups’, were observed in Palenque at 18% and 2.7%, respectively. In the sub-Saharan African groups, P-92R7 was observed only as a singleton or at low frequencies, and R1a1 was completely absent. STR haplotypes within each haplogroup were then analysed (see electronic supplementary material, table S2). Of note, the two most common STR haplotypes within P-92R7 in Palenque were haplotype 14-12-24-11-13-13 (N = 5) and its one-step neighbour (14-12-24-12-13-13) (N = 5). Both were absent from the sub-Saharan African dataset. The former has been designated the Atlantic modal haplotype (AMH) due to its high frequency in Western European populations [48,49].
Gene diversity in the Palenque based on all haplogroups and E-sY81 component haplogroups was 0.830 ± 0.013 and 0.638 ± 0.035, respectively, while the equivalent statistics in the sub-Saharan African dataset were 0.753 ± 0.007 and 0.679 ± 0.008, respectively (for gene diversity in each individual group, see electronic supplementary material, table S3).
The genetic distinctiveness of Palenque compared with each of the sub-Saharan African groups was apparent as assessed by ETPD (p < 0.001). Also, all FST values between Palenque and the sub-Saharan African groups were significant as assessed by random permutation (see Material and methods) (p < 0.00001) with only two below 0.05 (Chewa, an East African group from Malawi (FST = 0.027) and Yombe (FST = 0.035)) (see electronic supplementary material, table S4). FST between the Chewa and Yombe was not significant. This pattern was also consistently observed based on RST (see electronic supplementary material, table S5). Comparison of haplogroup profiles in Palenque, Chewa and Yombe revealed 12 NRY haplogroups present in at least one of the groups. Six were observed in all three groups (see electronic supplementary material, figure S1). Of the remaining six, four were observed in Palenque and Yombe, one in Palenque and Chewa and one was observed only in the Palenque (electronic supplementary material, figure S1). Most notably, all the haplogroups observed in the Yombe were also observed in the Palenque, while the proposed signature haplogroup of the eastern EBSP (E1b1a8a1a; E-U181) was absent in the Palenque and the Yombe.
Similar to the approach taken by Di Giacomo et al., we compared the distribution of NRY variation within E-sY81 (E1b1a; the signature haplogroup of EBSP [19,35,51,52]), a clade that was present in all population samples including Palenque, and observed only in men of RAD. FST between the Palenque and the other groups, based on the frequencies of the E-sY81 component haplogroups, revealed only two groups with a non-significant FST (Chewa and Yombe) with the Yombe–Palenque FST < 0.001 (electronic supplementary material, table S6). Based on the same dataset, pairwise differentiation between Palenque and each sub-Saharan African population sample was also assessed using ETPD. Interestingly, all were significant at the 5% level except the Yombe (p = 0.507).
A PCA plot using only E-sY81 component haplogroup frequencies showed Palenque as an outlier. While a mixed collection of Bantu speaking groups was nearer than other Niger–Congo groups to the Palenque, the closest population to them was the Yombe (figure 3).
(b) The distribution of mtDNA variation
The Palenque sample contained 26 mtDNA HVR-1 haplotypes. The modal haplotype was at a frequency of 0.166 and five common haplotypes together accounted for 66.2% of the total. The imputed haplogroups were almost all sub-lineages of L (the sub-Saharan African modal haplogroup) with over 70% belonging to either L1 or L3 (see electronic supplementary material, table S7). Nei's gene diversity, based on mtDNA HVR-1 haplotypes, was 0.903 ± 0.010. Across all sub-Saharan African groups (38 groups), 723 mtDNA HVR-1 haplotypes (427 singletons) were observed. Gene diversity for the combined set was 0.993 ± 0.004 and in individual groups ranged from 0.968 to 0.997 (see electronic supplementary material, table S8) with mean of 0.987 and s.d. of 0.007.
ETPD-based on HVR-1 haplotypes was significant between Palenque and all sub-Saharan groups. This was also the case based on imputed haplogroups (except Sundi (N = 25) with borderline p-value of 0.057). FST and K2P between Palenque and the sub-Saharan African groups were also all significant, and all were in the range of 0.037–0.066 and 0.039–0.126, respectively (see electronic supplementary material, tables S9 and S10, respectively).
(c) Intra-village analysis of Palenque
Summary statistics were calculated for both districts (Abajo and Arriba) in Palenque (see electronic supplementary material, tables S11 and S12 for NRY and mtDNA raw data, respectively). The modal NRY haplogroups in Abajo and Arriba were E-U175 (34.1%) and E-U290 (23.1%), respectively. The proportion of the E-sY81 clade in Abajo and Arriba was 55.3% and 51.9%, respectively, and not significantly different (p = 0.727). Gene diversity based on all NRY-UEP in Abajo and Arriba was 0.800 ± 0.024 and 0.853 ± 0.019, respectively, and 0.5597 ± 0.0632 and 0.6809 ± 0.0495, respectively, when restricting analysis to E-sY81 NRY types. At the UEP + STR level, the modal haplotype was one STR mutation different from the EBSP modal haplotype (i.e. E-sY81-15-12-21-10-11-13) in Abajo (E-sY81-16-12-21-10-11-13; 10.6%) and in Arriba (E-sY81-15-12-21-10-12-13; 19.2%), while the EBSP modal haplotype was at a frequency of 5.9% and 5.8%, respectively.
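The ‘one STR mutation different’ comparison can be made concrete with a short Python sketch. It assumes that the six numbers in each UEP + STR label follow the locus order listed in Material and methods (DYS19, DYS388, DYS390, DYS391, DYS392, DYS393) and that distance is counted as the summed absolute difference in repeat number, as under a stepwise mutation model; the function names are hypothetical.

```python
STR_LOCI = ("DYS19", "DYS388", "DYS390", "DYS391", "DYS392", "DYS393")


def parse_haplotype(label):
    """Split 'E-sY81-15-12-21-10-11-13' into ('E-sY81', (15, 12, 21, 10, 11, 13))."""
    parts = label.split("-")
    repeats = tuple(int(value) for value in parts[-len(STR_LOCI):])
    haplogroup = "-".join(parts[:-len(STR_LOCI)])
    return haplogroup, repeats


def step_distance(label_a, label_b):
    """Summed absolute repeat difference between two haplotypes of one haplogroup."""
    group_a, repeats_a = parse_haplotype(label_a)
    group_b, repeats_b = parse_haplotype(label_b)
    if group_a != group_b:
        raise ValueError("haplotypes belong to different haplogroups")
    return sum(abs(a - b) for a, b in zip(repeats_a, repeats_b))


ebsp_modal = "E-sY81-15-12-21-10-11-13"
abajo_modal = "E-sY81-16-12-21-10-11-13"
arriba_modal = "E-sY81-15-12-21-10-12-13"

abajo_steps = step_distance(ebsp_modal, abajo_modal)     # differs at the first locus
arriba_steps = step_distance(ebsp_modal, arriba_modal)   # differs at the fifth locus
between_districts = step_distance(abajo_modal, arriba_modal)
```

Under these assumptions each district's modal haplotype is one step from the EBSP modal haplotype, but at a different locus, so the two district modal haplotypes are two steps apart from each other.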
Analysing P-92R7 NRY, the frequency was not significantly different in the two districts (15.3% in Abajo and 19.2% in Arriba, p = 0.639). The distribution of constituent haplotypes was also not significantly different as measured by ETPD (p = 0.738).
To investigate the hypothesis derived from observed cultural differences and local oral history (Y. Moñino, personal field notes, 2012) that the two village districts can be distinguished from each other, several statistical tests were performed using the NRY and mtDNA data (see electronic supplementary material, table S13). Strikingly, genetic comparisons based on mtDNA were not significant for either haplotypes or imputed haplogroups, while all comparisons using NRY markers were significant at the 5% level. Notably, FST between the two village districts, calculated using E-sY81 component haplogroups only, was both significant (p < 0.05) and greater than between either of them and Yombe (electronic supplementary material, table S14). The distinctiveness of Abajo and Arriba NRY was also confirmed by ETPD (UEP, p = 0.006; UEP-E-sY81, p = 0.012; UEP + STR, p = 0.017). We then examined whether the ETPD was significant because of haplogroups introduced into Palenque probably through non-RAD introgression. Y chromosomes were divided into (i) those collectively belonging to haplogroups K, P and R (all with inferred origins outside sub-Saharan Africa) and (ii) Y(xK,P,R). The difference was driven by the African Y(xK,P,R) NRY at both UEP (p = 0.008) and UEP + STR (p = 0.008) levels and not by the non-African Y(K,P,R) NRY (p = 0.114 and 0.271 at UEP and UEP + STR levels, respectively). No significant difference was observed between the two districts based on mtDNA HVR-1 haplotypes and imputed haplogroups as assessed by ETPD (p = 0.985 and p = 0.77, respectively). When applying the same test, both districts differed significantly from all sub-Saharan African groups (p < 0.00001) at the haplotype level. Analysis of imputed haplogroups also showed a consistent pattern.
This may be due to the presence of non-African mtDNA haplotypes including those defined by 16290T and 16319A (possibly belonging to the Amerindian A2 haplogroup) and high frequency of founder haplotypes such as that bearing 16294T and 16309G (possibly haplogroup L2a1).
(d) Abajo and Arriba in the context of sub-Saharan Africa
FST between Abajo and Arriba, treated as separate samples, and 42 sub-Saharan African populations were estimated based on all haplogroups and E-sY81 component haplogroups (electronic supplementary material, table S14) but not mtDNA HVR-1 haplotypes, since no significant genetic distance (FST and K2P) was observed between the two districts.
At the NRY-UEP level, all FST estimates were significant (p < 0.05) with the exception of Abajo and Yombe. When considering only E-sY81 component haplogroups, FST was similarly not significant between Abajo and (i) Yombe and (ii) Chewa. In addition, Arriba had a non-significant FST with Bembe and Yombe. Based on the magnitude of FST, at the NRY-UEP level, the Congolese groups were split in half, with four being closer to Arriba than Abajo, and four being closer to Abajo than Arriba. However, at the E-sY81 level, six of seven (excluding Yombe which had a non-significant FST) were closer to Arriba. Comparisons with the complete African dataset presented a clearer difference. At both NRY-UEP and E-sY81 levels, Arriba was closer to 33 (Sign test p = 0.0001) and 36 (Sign test p < 0.0001) out of 42 sub-Saharan groups.
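The Sign test figures can be checked with an exact binomial calculation under the null hypothesis that each group is equally likely to be closer to either district. The Python sketch below uses only the standard library; the function names are hypothetical, and the source does not state whether a one- or two-sided alternative was used, so both are shown.

```python
from math import comb


def sign_test_upper_tail(successes, trials):
    """One-sided exact binomial P(X >= successes) under p = 0.5."""
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials


def sign_test_two_sided(successes, trials):
    """Two-sided p-value; for p = 0.5 this is twice the larger tail, capped at 1."""
    extreme = max(successes, trials - successes)
    return min(1.0, 2 * sign_test_upper_tail(extreme, trials))


p_uep_one_sided = sign_test_upper_tail(33, 42)    # about 0.00014
p_uep_two_sided = sign_test_two_sided(33, 42)     # about 0.00027
p_esy81_one_sided = sign_test_upper_tail(36, 42)  # about 0.0000014
```

For 33 of 42 groups the one-sided tail rounds to the reported 0.0001, and for 36 of 42 both versions fall well below 0.0001, so either choice of alternative leaves the conclusion unchanged.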
4. Discussion
The evolutionary processes of mutation and genetic drift, including founder effect, as well as the possible influence of natural selection, make direct inference of population history from genetic data challenging. However, such challenges become more tractable when clear hypotheses can be formulated from existing anthropological, linguistic and ethnographic research. In such circumstances, genetic data can, in some cases, be analysed to test those hypotheses. Even though the NRY and mtDNA are effectively single loci (because they are non-recombining regions), they can be appropriate systems for testing such hypotheses, particularly where the prior hypotheses concern only patrilineal or matrilineal history. In the current study, there were three prior hypotheses which we address in turn in the following sections.
(a) The founding fathers of the Palenque community were primarily Yombe
Sex-specific genetic systems are particularly susceptible to genetic drift [53,54]. The larger the population size and the fewer the generations since a postulated event, the less the effect of genetic drift should be. Because forced slaves are recorded to be mainly from Niger–Congo speaking groups, we analysed a set of haplogroups (E-sY81) that are collectively in high frequency in Niger–Congo speaking peoples but at only low frequencies or absent in other groups. This should, at least to some extent, have the added benefit of reducing the effect of any recent contribution from Amerindian and European males. We repeated the analysis including haplogroups within Y(xP,K,R) (figure 2), to which the NRY haplotypes of the great majority of residents in sub-Saharan Africa belong.
Notably, in the PCA visualization, the Yombe are the closest to the present-day Palenque out of all the 42 sub-Saharan groups (figure 3). In addition, Yombe was the only group from the Republic of the Congo for which there was not a significant FST value (Yombe p = 0.378, other seven groups p < 0.001). We also calculated FST distances after including 10 West African groups from Montano et al. with E-sY81 chromosome set equal to or above the minimum set in this study (Sundi, N(E-sY81) = 22). Yombe remained the closest group to Palenque. Analysing the Y(xP,K,R) set of haplotypes produced a similar outcome but with the Chewa marginally closer to the Palenque than were the Yombe (both FST < 0.02). Interestingly, in both the E-sY81 and Y(xP,K,R) analyses, Yombe and Chewa had an FST < 0.001.
Even though there is considerable genetic similarity among the many widely distributed groups having an origin in the rapid EBSP, the small genetic distance between Chewa (a group from Malawi) and Palenque is so similar to that between the Yombe and Palenque that it would be surprising, were it not for an oral history of the Chewa that records an origin in the ‘Luba country of the southern Congo basin’. This description could place their origin only about 400 miles east of the region where Yombe is currently spoken and may even reflect a migration from a more western location, passing through Luba country rather than commencing within it. The date of this migration is uncertain, with the earliest record of the group as ‘Chévas’ only appearing in 1831–2. Marwick also records that the Chewa have an equally prevalent alternative origin story that places their genesis south west of Lake Malawi. Marwick agrees with Hamilton that the two traditions can be reconciled by the migration from the north being by ‘chiefly invaders’ who gained control over ‘long-established autochthones’. Our results are more consistent with this interpretation of the oral accounts, as the Chewa–Yombe genetic distances were non-significant at the NRY level but highly significant at the mtDNA level (p < 0.001).
Additional support for the Yombe origin of the Palenque comes from the absence of NRY E-U181 chromosomes in both the Yombe and Palenque and their presence in the Chewa. The presence of E-U181, previously reported as characteristic of East African populations, in the Chewa can be explained by post-migration male gene flow following their arrival in Malawi. Nevertheless, given that a prior hypothesis based on linguistic evidence exists for a Yombe origin, and that no such evidence has yet been advanced to support a Chewa origin, it is reasonable to conclude that the genetic analysis of NRY haplotypes supports a Yombe origin.
(b) Is there a significant difference between the sex-specific genetic systems profiles of residents of Abajo and Arriba?
Differences in the paternal demographic histories of the two areas of the village are clearly supported by the presence of different modal haplotypes, a slightly higher haplotype diversity in Arriba compared with Abajo, and a significant ETPD between the two. The summed frequencies of P-92R7 haplotypes and the distribution of STR haplotypes within this haplogroup were, however, similar suggesting a similar extent of non-African genetic introgression into the two districts. These results, in general, contrasted with comparisons using mtDNA where no statistical differences in diversity were observed and ETPD was not significant (p = 0.985). The similarity in mtDNA profiles but dissimilarity in NRY profiles supports field observations of Y.M. that patrilocality is practiced in Palenque with men choosing to live close to their fathers and grandfathers, and women marrying men either from their own district or another, with the latter being common.
(c) Are the NRY and mtDNA profiles of Arriba inhabitants more similar than those of Abajo residents to those of residents of the Republic of the Congo?
As no significant difference was observed between Arriba and Abajo residents in mtDNA haplotype distribution, the answer to the question posed is ‘no’. With respect to paternal ancestry alone, the proposition has some limited support from the results of NRY when restricting analysis to E-sY81 component haplogroups. Here, where FST was significant, six out of seven of the Congo groups had a smaller FST with the Arriba. More striking is that when compared with all 42 sub-Saharan groups, Arriba had lower genetic distances (p = 0.0001). Although genetic drift cannot be discounted as the cause, one possible explanation is that practices associated with Africa such as matrilinearity (involving inheritance from a maternal uncle to his nephew, as seen in the Congo) were retained longer in Arriba than Abajo. There might therefore be an association between cultural practice and patterns of genetic diversity but not necessarily a causative relationship.
This study has explored an important aspect of the genetic ancestry of a fugitive African slave community in Colombia and contributed to a fuller understanding of their history. Further analysis of DNA of the Palenque alongside that of Colombian, European and sub-Saharan African groups using genome-wide markers and a more detailed characterization of NRY and mtDNA should reveal more of the genetic history of the Palenque including contributions made by other communities in Colombia.
All samples were collected anonymously with informed consent. This study received ethical approval from the Ministry of Health of the Republic of the Congo (741/MSP/DGS/S), the Scientific Committee of the Academic Corporation for the Studies of Tropical Pathologies of the Universidad de Antioquia in Colombia (CPT-8840-03-054), the village council of Palenque de San Basilio and the Joint UCL/UCLH Committees on the Ethics of Human Research Committee A (99/0196).
Figure S1 and tables S1–S14 have been uploaded as part of the electronic supplementary material.
N.A., N.B. and Y.M. conceived and designed the study, C.D., N.G., G.B. and N.B. collected DNA samples. N.A. and M.G.T. genotyped and sequenced the samples, Y.M. analysed the anthropological data, N.A., M.G.T. and N.B. analysed the genetic data, and N.A., N.B. and M.G.T. wrote the paper.
The authors declare no conflict of interest.
N.A. was supported by a NERC CASE award.
We thank all DNA donors and those assisting in sample collection, and David Balding for advice on statistical analysis.
- Received December 15, 2015.
- Accepted February 29, 2016.
- © 2016 The Author(s)
Published by the Royal Society. All rights reserved.