Archaeological maize specimens from Andean sites of southern South America, dating from 400 to 1400 years before present, were tested for the presence of ancient DNA and three microsatellite loci were typed in the specimens that gave positive results. Genotypes were also obtained for 146 individuals corresponding to modern landraces currently cultivated in the same areas and for 21 plants from Argentinian lowland races. Sequence analysis of cloned ancient DNA products revealed a high incidence of substitutions appearing in only one clone, with transitions prevalent. In the archaeological specimens, there was no evidence of polymorphism at any one of the three microsatellite loci: each exhibited a single allelic variant, identical to the most frequent allele found in contemporary populations belonging to races Amarillo Chico, Amarillo Grande, Blanco and Altiplano. Affiliation between ancient specimens and a set of races from the Andean complex was further supported by assignment tests. The striking genetic uniformity displayed by the ancient specimens and their close relationship with the Andean complex suggest that the latter gene pool has predominated in the western regions of southern South America for at least the past 1400 years. The results support hypotheses suggesting that maize cultivation initially spread into South America via a highland route, rather than through the lowlands.
1. Introduction
Maize (Zea mays ssp. mays) is the principal domesticated crop of the Americas. Although its Mesoamerican origin has been clearly established (e.g. Doebley 1990; Benz 2001; Matsuoka et al. 2002b), its time of arrival and trajectory of spread through South America are still uncertain. According to Piperno & Pearsall (1998), maize was already present in southern Central and northern South America between 7700 and 6000 years ago, but this scenario, primarily based upon plant microfossil evidence, has been questioned by Staller & Thompson (2002), who proposed a more recent introduction (1200–2200 BC) on archaeological and palaeoethnobotanical grounds. The pattern of spread of maize cultivation into South America has been inferred from cytogenetic and molecular studies of landraces and other varieties, but again with contradictory results. McClintock et al. (1981), following an extensive examination of heterochromatic bands (chromosome knobs), concluded that maize was initially introduced into the central Andes and then spread extensively throughout the highland and lowland regions of the continent, not being significantly supplemented by other types of maize until new genotypes spread southwards along the east coast of Brazil in relatively recent times. In contrast, analysis of microsatellite variation suggests that maize was first introduced into the lowlands of South America, reaching the Andes at a later stage (Matsuoka et al. 2002b).
The large number of landraces currently cultivated throughout South America (more than 300), many of which are directly descended from the crops grown by natives and are maintained by traditional farmers with little or no input from commercial inbred lines, constitutes a valuable source of material from which to reconstruct the origins and spread of maize. Additional insights might also be obtained by examination of ancient DNA (aDNA) preserved in archaeological specimens: such studies enable genetic variation to be assessed over time (Jaenicke-Després et al. 2003) and avoid the complicating factors caused by the movement of genotypes during the post-Columbian period. By analysis of a short segment of the Adh2 locus in archaeological specimens from eastern Brazil, Peru and northern Chile, Freitas et al. (2003) found evidence for two separate expansions of maize cultivation into South America, one from Central America into the Andean regions and a second along the lowlands of the northeast coast of the continent. This examination of aDNA therefore allowed the differing outcomes of the cytogenetic and molecular studies of modern material to be reconciled.
Ancient DNA therefore has considerable potential in developing an understanding of the history of maize cultivation in South America, particularly so given that the morphological attributes needed for racial identification are not always present or sufficiently well preserved in charred cobs and kernels, the types of material generally recovered from archaeological sites. The aim of the present study was to use this approach to assess the genetic affiliation of archaeological specimens and extant landraces in northwestern Argentina. This region comprises the southernmost area of maize distribution and possesses a diverse variety of landraces, some of which are thought to be related to primitive maize forms. Indeed, Argentine popcorn, a collection introduced into the United States from Argentina 50 years ago, is thought to more closely resemble the earliest maize remains from the Tehuacan Valley, Mexico, than any extant race thus far examined (Benz & Iltis 1990). Apart from its present importance as a reservoir of morphological and genetic variation, northwestern Argentina is also noted for providing extraordinarily good preservation conditions for human, animal and plant remains, with the Llullaillaco mummies being one of the most remarkable examples (Previgliano et al. 2003). Desiccated and charred maize cobs and kernels have been recovered from several archaeological sites ranging in age from 300 to 2000 years before present (BP) and have been related to different human cultures with contrasting degrees of social and technological complexity, from the early agro-pastoral societies to the Inca and Spanish occupations (e.g. Sempé 1977; Tarragó 1980; Korstanje & Wurschmidt 1999).
The extensive microsatellite survey conducted by Matsuoka et al. (2002a,b), which encompasses the entire pre-Columbian range of maize distribution, enables comparison of microsatellite alleles throughout the Americas and provides a framework for the analysis of genetic variation in both extant and archaeological specimens. The short lengths of the polymerase chain reaction (PCR) amplicons needed to type most microsatellite markers make them suitable for aDNA analysis, aDNA typically being recovered as short fragments less than 200 bp (Allaby et al. 1997), and identification of alleles by amplicon size circumvents problems caused by diagenetic changes when single nucleotide polymorphisms are typed in aDNA (Pääbo 1989; Poinar 2003). The aim of this project was therefore to extend the microsatellite data obtained by Matsuoka et al. (2002a,b) to key archaeological maize specimens from northwestern Argentina, in order to establish the genetic affiliations between these ancient plants and the crops currently grown in this region.
2. Material and methods
Collection sites, race, voucher information, age, type of remains and number of individuals examined in archaeological and modern maize landraces are described in table 1. Location of archaeological sites is presented in figure 1. Archaeological samples were dated according to the contextual evidence available for each site. Accelerator mass spectrometry was used to confirm the antiquity of the specimens belonging to sites Lorohuasi, Tebenquiche and Punta Colorada (Beta Analytic).
DNA from desiccated cobs and kernels (100 mg) was isolated by the silica-based spin column method described by Yang et al. (1998) with minor modifications. Three microsatellite loci were examined, phi127, phi029 and phi059 (nomenclature of Matsuoka et al. (2002a,b); the loci are on linkage groups 2, 3 and 10, respectively), using primer sequences described in the Maize Genetics and Genomics Database (http://www.maizegdb.org/locus.php). Reaction mixtures contained 2–10 μl DNA extract, 90 ng of each primer, 125 μM of each dNTP, 2 mM MgCl2, 150 μg ml⁻¹ bovine serum albumin (BSA), 1.25 units Taq DNA polymerase (Fermentas), 1× PCR buffer with (NH4)2SO4 and sterile double distilled water to a final volume of 50 μl. Cycling conditions were: 4 min at 94°C; 35 cycles of 1 min at 94°C, 1 min at the annealing temperature and 1 min at 72°C; and a final elongation step of 6 min at 72°C. The annealing temperatures were 47, 50 and 55°C for phi127, phi029 and phi059, respectively. Reactions containing fragments of the expected size were further purified (QIAquick PCR Purification Kit, Qiagen), reamplified under the same conditions but with no addition of BSA and cloned using the TOPO TA Cloning System (Invitrogen). At least five clones were sequenced per specimen (247 clones in total) with M13 universal primers using the ABI Prism BigDye Terminator Cycle Sequencing Kit and an Applied Biosystems 377 DNA sequencer. Specimens that gave the same sequence in every amplification product were considered to be homozygous, although heterozygosity cannot be excluded.
Precautions were taken to minimize the risk of contaminating ancient material with modern DNA: ancient extracts were prepared in an isolated room not used for handling modern DNA or PCR products; PCR mixes were set up in a third laboratory in a laminar flow cabinet (HEPA filter, Class 100, conforming to BS 5295 and 5726) used for no other purpose; standard precautions were taken regarding pipette types; solutions and work surfaces were sterilized by autoclaving and ultraviolet irradiation, respectively; and all aDNA amplifications were accompanied by an extraction and water blank. Reproducibility of positive results was confirmed by independent rounds of extraction and amplification.
For modern plants, DNA was extracted from 2- to 3-day-old seedlings according to Dellaporta et al. (1983). Maize landraces were genotyped at loci phi127, phi029 and phi059 in order to calculate contemporary allelic frequencies. Reaction mixtures contained 1 μl DNA extract, 30 ng of each primer, 125 μM of each dNTP, 1.5 mM MgCl2, 0.5 units Taq DNA polymerase (Promega), 1× PCR buffer and sterile double distilled water to a final volume of 25 μl. A touchdown cycling profile (annealing temperature 65–55°C) was used and PCR products were separated on a 6% denaturing polyacrylamide gel (8 M urea) following standard procedures. Gels were silver stained with Silver Sequence DNA Staining Reagents (Promega). Alleles were identified by comparison with products of known size using GelPro Analyzer v. 4.0 (Media Cybernetics) and further confirmed by direct sequencing of homozygous specimens representative of each allelic class. Allelic variants, 112 (phi127) and 154 (phi029), corresponding to teosintes Z. mays ssp. parviglumis and Z. mays ssp. mexicana were also sequenced for comparison with archaeological specimens. All Zea sequences obtained in this study have been deposited in GenBank under accession numbers AY965913–AY965994.
Sequences were aligned using ClustalW (Higgins et al. 1994) followed by minor manual modifications. Allelic frequencies of modern maize populations were calculated by the direct count method. Genic differentiation among populations and genotypic linkage equilibria were evaluated using Fisher's exact test as implemented in genepop (Raymond & Rousset 1995). Departures from Hardy–Weinberg proportions were assessed using the score test (U-test) provided by the same software package. Amplification products not corresponding to the microsatellite loci under study were analysed by comparison with the public sequence databases as both nucleotide and amino acid translations using blastn, blastx and tblastx. A sequence was classified as a known element when retrieved with an E-value of less than 10⁻⁵ following the criteria of Meyers et al. (2001). The identity and E-values of the highest score identifiable matches are provided as electronic supplementary material. Assignment of archaeological specimens to modern landraces was performed according to the assignment test developed by Paetkau et al. (1995). The method assigns an individual to the population in which its genotype is most likely to occur. As summarized in Cornuet et al. (1999), given $J$ independent loci typed in $I$ reference populations (and in the individuals to be assigned), the frequency of allele $k$ at locus $j$ in population $i$ is $p_{ijk}$. Assuming Hardy–Weinberg equilibrium, the likelihood of a genotype $A_kA_{k'}$ occurring in the $i$th population at the $j$th locus is proportional to $(p_{ijk})^2$ if $k=k'$ and to $2p_{ijk}p_{ijk'}$ otherwise. Since the $J$ loci are assumed to be independent, the likelihood of a multilocus genotype occurring in a given population is the product of likelihoods for each locus. The probability of drawing random samples of individuals of a given genotype from different populations was calculated with Microsoft Excel assuming a binomial distribution (see §3).
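The likelihood calculation summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the study: the function names (`genotype_likelihood`, `assign`), the dictionary layout and the population labelled `hypothetical` are introduced here for demonstration. The `lowland` frequencies are the pooled values reported in the results for populations VAV6607 and VAV6614.

```python
from math import prod

def genotype_likelihood(genotype, freqs):
    """Likelihood of a multilocus genotype in one population, assuming
    Hardy-Weinberg proportions and independence between loci.

    genotype: {locus: (allele_k, allele_k_prime)}
    freqs:    {locus: {allele: frequency}}
    """
    per_locus = []
    for locus, (a, b) in genotype.items():
        # Alleles absent from the reference population get frequency 0 here;
        # a real analysis would need an explicit rule for such alleles.
        p_a = freqs[locus].get(a, 0.0)
        p_b = freqs[locus].get(b, 0.0)
        # p^2 for a homozygote, 2 * p * q for a heterozygote
        per_locus.append(p_a * p_a if a == b else 2.0 * p_a * p_b)
    return prod(per_locus)

def assign(genotype, populations):
    """Return the most likely population and all per-population likelihoods."""
    scores = {name: genotype_likelihood(genotype, f)
              for name, f in populations.items()}
    return max(scores, key=scores.get), scores

populations = {
    # Pooled lowland frequencies as reported in the results.
    "lowland": {
        "phi127": {112: 0.545, 124: 0.137, 126: 0.318},
        "phi029": {150: 0.650, 154: 0.350},
    },
    # Hypothetical population, invented purely to show the comparison.
    "hypothetical": {
        "phi127": {112: 0.80, 114: 0.20},
        "phi029": {150: 0.30, 154: 0.70},
    },
}

# Two-locus genotype of the kind observed in the archaeological specimens.
ancient = {"phi127": (112, 112), "phi029": (154, 154)}

best, scores = assign(ancient, populations)
print(best, scores)
```

With the lowland frequencies, the single-locus likelihoods are 0.545² ≈ 0.297 and 0.350² ≈ 0.123, and their product is about 0.036 to 0.037 depending on where rounding is applied.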
3. Results
(a) Ancient DNA sequences
DNA extracted from archaeological specimens was consistently of low molecular weight (100–500 bp) as expected for aDNA (Pääbo 1989; Poinar 2003). However, high molecular weight fragments (approx. 20 kb) were also retrieved from Tebenquiche samples. Amplicons of the expected size and sequence were obtained for 9 out of 51 archaeological samples examined (table 2). The remaining samples failed to give positive results despite repeated rounds of extraction and amplification. No positive amplifications were obtained from the Batungasta samples. All negative controls, including mock extractions, were devoid of PCR product.
At locus phi127, all nine archaeological specimens that gave results were homozygous for allelic variant 112 (table 2). This allele was also present in modern populations, along with four others ranging in size from 114 to 126 bp (table 3). Analysis of 37 clones from archaeological sequences revealed sequence variations at a total of 23 nucleotide positions (see electronic supplementary material). The majority of variations (21/25) were singletons (substitutions appearing only in one clone), with 43% corresponding to putative C/G→T/A transitions. Alterations that occur in only a few sequences of a cloned PCR product can result either from misincorporation due to aDNA degradation or from erroneous Taq polymerase activity (Hansen et al. 2001). In our results, the strong bias towards C/G→T/A transitions indicates amplification of degraded templates with a high incidence of cytosine deamination (Hofreiter et al. 2001). Consensus sequences reconstructed from those individuals represented by multiple clones were identical to the allelic variant 112 present in contemporary samples, with one exception. This was specimen PC51, which had substitutions at positions 88 and 96 in the majority of clones. These could be authentic polymorphisms, but the observed pattern is also consistent with a small number of degraded template molecules initiating the PCR (Hofreiter et al. 2001). To evaluate the possible evolutionary significance of these substitutions, allele variant 112 was sequenced from Z. mays ssp. parviglumis and Z. mays ssp. mexicana. No evidence of polymorphism was detected in these teosintes (see electronic supplementary material). However, the same substitution was observed at position 96 of allele 126 from contemporary samples.
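The clone-level bookkeeping described above (majority consensus, singleton substitutions, and the share that are C→T or G→A) could be sketched as follows. The clone sequences are invented toy data, and the function names are assumptions made for this illustration; the sketch does not attempt to separate template damage from Taq polymerase error, which the counts alone cannot do.

```python
from collections import Counter

def consensus(clones):
    """Majority-rule consensus of equal-length, pre-aligned clone sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*clones))

def singleton_substitutions(clones):
    """Return (position, consensus_base, clone_base) for every substitution
    that appears in exactly one clone at an aligned position (1-based)."""
    cons = consensus(clones)
    hits = []
    for pos, col in enumerate(zip(*clones), start=1):
        for base, n in Counter(col).items():
            # Skip the consensus base itself and alignment gaps.
            if base != cons[pos - 1] and base != "-" and n == 1:
                hits.append((pos, cons[pos - 1], base))
    return hits

def is_deamination_type(ref, alt):
    """True for C->T or G->A, the changes expected when a deaminated
    cytosine is read on one strand or the other."""
    return (ref, alt) in {("C", "T"), ("G", "A")}

# Hypothetical aligned clones from a single specimen (toy data).
clones = [
    "ACGTCCGATGCA",
    "ACGTCCGATGCA",
    "ATGTCCGATGCA",  # C->T at position 2
    "ACGTCCAATGCA",  # G->A at position 7
    "ACGTCCGATGGA",  # C->G at position 11 (a transversion)
]

hits = singleton_substitutions(clones)
n_deam = sum(is_deamination_type(ref, alt) for _, ref, alt in hits)
print(hits, n_deam)
```

In this toy set, two of the three singletons are of the C/G→T/A type; the same tally applied to real clone alignments would give proportions comparable to those reported in the text.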
The two ancient specimens typed at locus phi029 were homozygous for allelic variant 154 (table 2), which was also present in all the modern populations examined. In total, eight alleles were detected in contemporary specimens, with allele 154 being found at high frequency in several populations (table 3). The consensus sequence obtained from specimen Lorohuasi 38 was identical to that retrieved from modern plants. Nineteen base differences were identified in single clones at random positions and, as previously described for locus phi127, C/G→T/A transitions were the prevalent nucleotide substitutions (14/19; see electronic supplementary material). The allelic variant reconstructed from specimen Lorohuasi 40 showed 3% uncorrected sequence divergence with respect to its contemporary counterpart. Five nucleotide substitutions were detected, four of which were present in both clones examined. Modern specimens display sequence variation within the part of the amplicon containing these substitutions, but the specific variations seen in Lorohuasi 40 were not found in any modern maize or teosinte (see electronic supplementary material).
Alleles from locus phi059 were refractory to cloning and had to be analysed by direct sequencing. In the archaeological specimens, a single allele was detected corresponding to variant 157 of contemporary landraces (tables 2 and 3). The archaeological sequence was identical to that observed for modern populations 6473 and 6476, but it showed 0.5% sequence divergence from the alleles from populations 6480, 6484 and 6482.
Clone analysis revealed several amplification products not corresponding to the expected microsatellite loci. After BLAST searches, most of these were found to be of microbial or unknown origin (see electronic supplementary material), but highly significant matches (E-values≤e−26) were also obtained for sequences identified as members of the Huck (SanMiguel et al. 1996) and Zeon (Hu et al. 1995) maize retrotransposon families, which are among the most abundant sequences of the maize genome (Meyers et al. 2001). It is noteworthy that all four putative retrotransposon sequences were retrieved from kernels (Lorohuasi 20, Lorohuasi 38 and Lorohuasi 40), while the majority of contaminating micro-organisms were associated with cobs from the Tebenquiche site. The presence of high molecular weight DNA in the Tebenquiche extracts is concordant with these observations, as ancient extracts with DNA fragments up to 10–20 kb are frequently indicative of microbial contamination (Rollo et al. 1994). The cobs from this site were the only ones recovered from domestic areas, whereas Lorohuasi and Punta Colorada specimens were retrieved from funerary, more protected, settings.
(b) Assignment of archaeological specimens to modern landraces
Genotyping of 146 individuals belonging to eight extant maize landraces currently cultivated in northwestern Argentina was conducted to provide a reference framework for determination of the genetic affiliations between modern and archaeological specimens. Following from the assumptions of Paetkau's assignment method, both panmixia and linkage equilibria were evaluated in the eight modern populations. No deviations from genotypic linkage equilibrium between pairs of loci were detected, as determined by Fisher's exact test. Hardy–Weinberg proportions were confirmed for all population–locus combinations, except for the homozygote excess observed in population 6482 at locus phi059 (score test p=0.0001). Allelic frequencies for contemporary populations are given in table 3.
Despite being cultivated in the same region (northwestern Argentina), the eight landraces can be placed in three groups according to morphological and cytogenetic evidence (Poggio et al. 1998): (i) Andean complex (Altiplano populations 6473, 6167; Amarillo Chico populations 6476, 6484; Amarillo Grande population 6480; Blanco population 6485), (ii) South American popcorns (Pisingallo population 6313), and (iii) incipient races derived from the introduction of commercial germplasm into local varieties approximately 40 years ago (Orgullo Cuarentón population 6482). Application of the assignment test to locus phi127 places each archaeological specimen in race Blanco (table 4). However, the allelic frequencies obtained for this population are not significantly different from those corresponding to races Altiplano, Amarillo Grande and population 6484 of race Amarillo Chico. Similarly, when phi059 is considered, the highest probability of occurrence of the archaeological genotype is shown by population 6484 of race Amarillo Chico, but the populations belonging to races Amarillo Grande, Altiplano and Blanco, as well as population 6476 of race Amarillo Chico, show no significant differences in their allelic distributions. In contrast, phi029 assigns the highest probability to population 6473 of race Altiplano, with no other population displaying such a pattern of allelic distribution. As individuals Lorohuasi 38, Lorohuasi 40 and TC54 could be simultaneously genotyped for two loci, the probabilities of the combined genotypes (phi127 112/112–phi029 154/154; phi127 112/112–phi059 157/157) were also calculated (table 4). Once again, the populations from the Andean complex exhibit the highest probabilities of occurrence, while those corresponding to races Pisingallo and Orgullo Cuarentón are greatly diminished.
To examine possible affiliations with non-Andean landraces, plants representative of racial complexes from the lowlands (populations VAV6607 and VAV6614; table 1) were typed at loci phi127 and phi029. Three alleles were found at phi127: 112 (f, 0.545); 124 (f, 0.137); and 126 (f, 0.318). Two alleles were found at phi029: 150 (f, 0.650); and 154 (f, 0.350). Therefore, for locus phi127, the probability of a genotype 112/112 being sampled from the lowland region was 0.297. For locus phi029, the probability of occurrence of genotype 154/154 in this area was 0.123, with the combined genotype phi127 112/112–phi029 154/154 having a likelihood of 0.037.
Generally, the results of the assignment test indicate that the archaeological specimens are more closely affiliated to the races of the Andean complex than to South American popcorns or incipient races, even though these are all currently cultivated in the same area, or to populations from the lowland regions of South America. Given the small sample sizes, it is conceivable that germplasm from lineages other than the Andean complex was present historically, but not represented in the archaeological samples that were studied. To evaluate this possibility, data from extant populations were combined into overall allele frequencies and the probability of drawing allele samples equal to those observed in the archaeological specimens was calculated. Taking advantage of the fact that a single allele size was sampled from archaeological specimens at all three loci examined, population frequencies were recalculated considering only two allele categories (e.g. 112 and non-112 for locus phi127) and probabilities were computed according to the binomial distribution. For locus phi127, the probability of drawing eighteen 112 alleles was: 0.0047 for the combined gene pool comprising the populations from the Andean complex, Pisingallo and Orgullo Cuarentón; 0.0298 for the Andean complex–lowland region assembly; 1.8×10⁻⁵ for the lowland region alone; and 0.0578 for the Andean complex alone. For locus phi029, these probabilities were 0.1302, 0.1975, 0.015 and 0.2374, respectively.
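The two-category binomial calculation can be reproduced for the lowland gene pool, the only pool whose allele frequencies are quoted in the text. The sketch below is a minimal illustration: the function name `binom_pmf` is introduced here, and the sample size of four alleles for phi029 is an inference from the two homozygous specimens typed at that locus rather than a figure stated explicitly.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of observing k focal alleles in n draws
    when the pooled frequency of the focal allele is p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# phi127: nine homozygous specimens, so eighteen sampled alleles, all 112.
# Lowland frequency of allele 112 is 0.545 (112 versus non-112).
p_phi127_lowland = binom_pmf(18, 18, 0.545)

# phi029: two homozygous specimens, so four alleles (inferred), all 154.
# Lowland frequency of allele 154 is 0.350 (154 versus non-154).
p_phi029_lowland = binom_pmf(4, 4, 0.350)

print(f"{p_phi127_lowland:.2e}", f"{p_phi029_lowland:.4f}")
```

Because every sampled allele falls in the focal category, the binomial term reduces to p raised to the number of alleles; the two values agree with the lowland figures of 1.8×10⁻⁵ and 0.015 given above.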
4. Discussion
(a) Ancient DNA
Plant aDNA studies have lagged behind animal and human investigations, but several nuclear loci have successfully been amplified from a variety of domesticated crops (e.g. O'Donoghue et al. 1996; Allaby et al. 1997; Brown et al. 1998; Manen et al. 2003), including maize (Rollo et al. 1991; Goloubinoff et al. 1993; Jaenicke-Després et al. 2003; Freitas et al. 2003). The microsatellite sequences reported here provide another example of single-copy regions being amplified from archaeological plant remains. Several observations support the authenticity of these ancient sequences: (i) those DNA extracts that were not heavily contaminated with microbial DNA contained only low molecular weight material, consistent with the expected molecular properties of aDNA, (ii) sequences of cloned PCR products contained singleton substitutions typical in nature and distribution of the damage artefacts present in ancient DNA, (iii) the results were corroborated by independent rounds of DNA extraction and amplification, and (iv) the results are phylogenetically meaningful. However, the lack of genetic variation seen among the archaeological specimens could be regarded as unnatural and hence due to uniform contamination of these specimens or extracts (Hadly et al. 2003). As previously stated, no evidence of contamination was detected at any stage of the experimental procedures and although cross-contamination between PCRs is always a possibility, there is no reason to believe that it should exclusively affect archaeological samples and not extraction or amplification controls. Therefore, it is reasonable to conclude that the genotypes obtained for the archaeological specimens are authentic.
A thorough understanding of the chemical lesions present in aDNA is of fundamental importance for the identification of authentic nucleotide substitutions in sequences from archaeological and other preserved specimens. Two main types of damage are likely to affect DNA in archaeological deposits. First, hydrolytic damage will result in depurination and depyrimidination and in deamination of bases. Second, oxidative damage will result in a variety of base modifications (Höss et al. 1996). Deoxyuridine residues appear to be the most frequent miscoding lesion in aDNA extracted from both animal and human specimens (Hansen et al. 2001; Hofreiter et al. 2001; Orlando et al. 2003), but no such bias has been reported for plant remains. The prevalence of C/G→T/A transitions in the archaeological sequences studied here (see electronic supplementary material) supports the hypothesis that cytosine deamination is indeed the predominant miscoding modification in many if not most aDNA samples.
O'Donoghue et al. (1996) suggested that the microenvironment within desiccated seeds is conducive to enhanced preservation of lipids and other biomolecules. Interestingly, five of the nine archaeological specimens genotyped at locus phi127 were kernels, and so were the two individuals typed at locus phi029.
(b) Genotypes of archaeological specimens and relationships with modern landraces
All three microsatellite loci examined in the archaeological specimens exhibited a single allelic variant, identical in size to the most frequent allele found in contemporary populations belonging to races Amarillo Chico (6476, 6484), Amarillo Grande (6480), Blanco (6485) and Altiplano (6167). This genetic homogeneity is remarkable when considering the diversity of the archaeological sites included in this study (table 1). These not only encompass a time period of nearly 1000 years, but they also cover different socio-historical periods each characterized by a distinctive pattern of agricultural production and interregional exchange. Furthermore, the specimens from Punta Colorada and Lorohuasi were found in association with funerary artefacts, whereas the Tebenquiche specimens were retrieved from households. Contrasting climatic conditions could also be regarded as a factor promoting genetic differentiation among the races cultivated at each site. The mountain slopes of the Abaucán Valley (Lorohuasi, Punta Colorada) provide a fertile environment for the development of cultivars with little or no need for artificial irrigation. Most of the sites from this area are located along extant riversides or at the verge of ancient riverbeds. In contrast, the agricultural activities in a high-altitude desert such as the Atacama Plateau (Tebenquiche, 3650 m.a.s.l.) are strongly conditioned by water availability.
Although the genetic homogeneity displayed by these archaeological samples is remarkable, the results are not unprecedented. Vásquez et al. (in press) report homogeneity at five microsatellite loci in 400–600-year-old maize remains from two sites, 220 km apart, associated with the Chimú culture of the north Peruvian coast. They suggest that homogeneity at a particular site could arise owing to an annual founder effect, the inhabitants saving just a few cobs every year for sowing, rather than flailing the crop and choosing the best grains as the seeds for the following year. Similar farming practices are observed in northwestern Argentina, suggesting that founder effects could also explain the homogeneity of the archaeological specimens from this area. A founder effect is most likely to fix the most common allele variants in a particular population, so the homogeneity observed in samples from Lorohuasi, Tebenquiche and Punta Colorada could indicate that the original maize populations at each of these three sites had similar genetic structures. If this hypothesis is correct, then the assignment tests suggest that these ancestral populations were similar to the modern Andean complex.
An alternative explanation of the homogeneity is that this is an artefact of the small sample size which is an inevitable constraint of aDNA studies. If only a few individuals are sampled from a population within which there is a prevalence of certain allelic variants, then these sampled individuals will most probably exhibit the most frequent alleles. If the prevalent alleles are the same in each of the several source populations, then individuals sampled from these different populations might appear identical. If this is the explanation of the homogeneity of the archaeological samples then, of the modern races studied, those of the Andean complex are the ones most likely to resemble the archaeological populations, the probability of drawing gene samples equal to those observed in the archaeological specimens from extant populations being higher for the Andean complex than for the other population combinations examined. For instance, it is about 12 times more likely to draw eighteen 112 alleles (locus phi127) from the Andean complex alone (0.0578) than from the gene pool comprising the Andean complex, Pisingallo and Orgullo Cuarentón (0.0047), and almost two times more likely when the Andean complex–lowland assembly is considered. This explanation of the observed homogeneity implies that the ancient populations had similar genetic structures despite the temporal, socio-historical and geographical differences between the sites. This could also arise if significant gene flow occurred in the past to preclude genetic differentiation and maintain a similar population structure throughout time and space. With kernels being the principal means of dispersal, food exchange among neighbouring human populations could easily produce such an effect. In fact, high levels of intra- and inter-regional exchange have been extensively documented for northwestern Argentina since very early in the archaeological record (10 000 BP; Castro & Tarragó 1992; Albeck 2000).
Whichever of the above explanations is correct, it appears that, regardless of their site of origin, each archaeological specimen is more closely related to the races of the Andean complex than to the South American popcorns, the incipient races or the lowland races included in this study. The actual assignment of the archaeological specimens to the Andean complex depends on two assumptions, both of which we consider to be justified. The first assumption is that a certain amount of phylogenetic signal has been retained through time and that allelic frequencies have not been significantly altered by deterministic forces. Genetic differentiation and cluster analyses of a total of 18 microsatellite loci in the modern populations included in this study strongly suggest that this has been the case (Lia 2004). Moreover, Bayesian analysis of a combined data matrix including multilocus genotypes (10 loci) of individuals from these populations and those examined by Matsuoka et al. (2002b) showed that individuals corresponding to populations 6473, 6167, 6476, 6480, 6484 and 6485 and a set of accessions defined as ‘core Andean’ share a similar genetic constitution, whereas populations 6313 and 6482 are significantly different (Lia 2004). The second assumption concerns allele size convergence. To be meaningful, affiliations should be deduced from a set of alleles that are not only identical in state but also, and more importantly, identical by descent. The mutation rate per generation for maize dinucleotide microsatellite loci has been estimated as 7.7×10⁻⁴, with microsatellites having repeats of more than 2 bp showing an upper limit of 5.1×10⁻⁵ (Vigouroux et al. 2002). Therefore, considering that maize is an annual cultigen and accepting that its domestication took place between 6500 and 9000 years ago (Piperno & Flannery 2001; Matsuoka et al. 2002b), the amount of time elapsed is not sufficient for allele size convergence to have distorted the genetic assignments presented here.
(c) Origins of the Andean complex
Although maize racial classification is far from clear, and the term Andean complex may seem rather vague, the validity of this complex as a significant evolutionary unit has been repeatedly stressed by morphological, cytogenetic and genetic evidence (Goodman & Bird 1977; McClintock et al. 1981; Goodman & Brown 1988; Matsuoka et al. 2002b). The conclusion of the assignment tests, as described earlier, is that each of the nine archaeological specimens for which aDNA sequences were obtained is a member of the Andean complex and that this gene pool has therefore predominated in the western regions of southern South America for at least the last 1400 years. Although interpretation of genetic data, especially the necessarily limited data obtainable from aDNA analysis, in the context of crop dispersal is not straightforward, our results are pertinent to the competing hypotheses regarding the spread of maize cultivation into and through South America. Northwestern Argentina lies at the extreme southern range of maize distribution, meaning that it was presumably one of the last areas of South America reached by cultivation. Considering also that the genetic similarities displayed by the archaeological specimens must predate the oldest age of these specimens (1400 years) by at least several centuries, it appears likely that the genetic ancestors of the Andean complex became established in northwestern Argentina soon after the first arrival of maize cultivation in South America. The prevailing view that the Andean complex is a highland rather than lowland population therefore supports a highland origin for maize cultivation in this region, consistent with the views of McClintock et al. (1981) and Freitas et al. (2003). According to McClintock et al. (1981), the extensive spread of the chromosome constitution characteristic of the Andean complex across such a vast territory (i.e. from Colombia to northern Chile and the highlands of Argentina) would be expected if this was the type initially introduced into the region and if the introduction of later varieties was delayed, the race Pisingallo (Pisinkalla) from Bolivia and Argentina being an example of the latter. The slightly different proposal of Freitas et al. (2003), that there were separate expansions of maize cultivation into South America, one from Central America into the Andean regions and a second along the lowlands of the northeast coast of the continent, is still consistent with the cytogenetic evidence presented by McClintock et al. (1981) and also allows that maize from the Andean region was the first to be introduced in South America.
The antiquity of the Andean complex is not, however, compatible with the interpretation of Matsuoka et al. (2002b) that maize cultivation reached the Andes at a presumably late stage, only after its initial introduction to the lowlands of South America. If the races from the lowlands of South America were ancestral to those of the Andean complex, then at least some indication of their presence might be expected at archaeological sites from northwestern Argentina, but no evidence for the presence of germplasm from sources other than the Andean complex was found within the samples that we analysed. Both Paetkau's assignment test and the random sampling probabilities calculated according to the binomial distribution show that it is very unlikely for the archaeological specimens to have been derived from a lowland gene pool. Further studies of archaeological specimens, in particular from lowland regions of southern South America, will help to clarify these issues.
We thank Dra. Carlota Sempé and Dr Alejandro Haber for kindly providing the archaeological specimens from the Punta Colorada and Tebenquiche sites, respectively, and we thank Dr Victor Vásquez for allowing us to cite his work on Peruvian maize. We are also indebted to Keri Brown for her assistance in the radiocarbon dating of the samples. Financial support for this work was provided by the Consejo Nacional de Investigaciones Científicas y Técnicas (Argentina), the Agencia Nacional de Promoción Científica y Tecnológica (Argentina), the Universidad de Buenos Aires and the Natural Environment Research Council (UK).