The ancestry of modern Iberians (comprising the Spanish and Portuguese) is consistent with the geographical situation of the Iberian Peninsula in the southwestern corner of Europe, showing characteristics that are largely typical of Southern and Western Europeans. As is the case for most of the rest of Southern Europe, the principal ancestral source of modern Iberians is the Early European Farmers who arrived during the Neolithic. The large predominance of Y-chromosome haplogroup R1b, common throughout Western Europe, also testifies to a sizeable input from various waves of (predominantly male) Western Steppe Herders that originated in the Pontic-Caspian Steppe during the Bronze Age. [2] [3]
Modern Iberians' genetic inheritance largely derives from the pre-Roman inhabitants of the Iberian Peninsula, who were deeply Romanized after the conquest of the region by the ancient Romans. [4] [5]
Genetic research on medieval populations of the Iberian Peninsula indicates that, in comparison with Iron Age groups, there was a discernible shift in ancestry toward sources related to Italy and Greece, contributing approximately one quarter of their genetic profile. This component is still detectable in the genomes of modern Spaniards, with the exception of a significant number of the Basques, whose genetic makeup remains most closely related to that of the Iron Age inhabitants of Iberia. [10]
There are also some genetic influences from the Alans and Germanic tribes who arrived after the Roman period, including the Suebi, Hasdingi Vandals, and Visigoths. [11] [12] [13] Because of Iberia's position on the Mediterranean Sea, as in other Southern European countries, there were also contacts with other Mediterranean peoples such as the Phoenicians, Ancient Greeks and Carthaginians, who settled along Iberia's eastern and southern coasts, the Sephardi Jewish community, and the Berbers and Arabs who arrived during the Al-Andalus period, all of them leaving some North African and Middle Eastern genetic influences, particularly in the south and west of the Iberian Peninsula. [14] [15] [5] [16] [17] [18] [4] Similar to Sardinia, Iberia was shielded from settlement from the Middle East and Caucasus region by its western geographic location, and thus has lower levels of Western Asian and Middle Eastern admixture than Italy and Greece; most of this admixture probably arrived in Iberia during historic rather than prehistoric times, especially in the Roman period. [19] [10]
The foremost pioneer of the study of population genetics was Luigi Luca Cavalli-Sforza. Cavalli-Sforza used classical genetic markers to analyse DNA by proxy. This method studies differences in the frequencies of particular allelic traits, namely polymorphisms from proteins found within human blood (such as the ABO blood groups, Rhesus blood antigens, HLA loci, immunoglobulins, G-6-P-D isoenzymes, among others). Subsequently, his team calculated genetic distances between populations, based on the principle that two populations that share similar frequencies of a trait are more closely related than populations that have more divergent frequencies of the trait. [20]
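The distance principle described above can be illustrated with a short sketch. It uses one commonly cited form of the Cavalli-Sforza and Edwards chord distance for a single locus; whether this exact measure was used in the cited work is an assumption, and the allele frequencies below are invented for illustration only, not real population data.

```python
import math

def chord_distance(freqs_a, freqs_b):
    """Chord distance between two populations at one locus.

    freqs_a and freqs_b are allele frequencies for the same alleles,
    each summing to 1. Identical frequencies give a distance near 0.
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations must list the same alleles")
    # Cosine of the angle between the square-root frequency vectors.
    cos_theta = sum(math.sqrt(p * q) for p, q in zip(freqs_a, freqs_b))
    cos_theta = min(1.0, cos_theta)  # guard against floating-point overshoot
    return (2 / math.pi) * math.sqrt(2 * (1 - cos_theta))

# Hypothetical ABO allele frequencies (A, B, O); values are made up.
pop_x = [0.28, 0.06, 0.66]
pop_y = [0.27, 0.07, 0.66]
pop_z = [0.20, 0.20, 0.60]

d_xy = chord_distance(pop_x, pop_y)
d_xz = chord_distance(pop_x, pop_z)
print(f"x-y: {d_xy:.4f}, x-z: {d_xz:.4f}")
```

Under these made-up frequencies, populations x and y come out closer to each other than x and z, which is the sense in which similar trait frequencies are read as closer relatedness. In practice such distances are averaged over many loci.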
Since then, population genetics has progressed significantly, and studies using direct DNA analysis are now abundant; these may use mitochondrial DNA (mtDNA), the non-recombining portion of the Y chromosome (NRY) or autosomal DNA. MtDNA and NRY DNA share some features which have made them particularly useful in genetic anthropology. These properties include the direct, unaltered inheritance of mtDNA and NRY DNA from mother to offspring and father to son, respectively, without the 'scrambling' effects of genetic recombination. It is also presumed that these genetic loci are not affected by natural selection and that the major process responsible for changes in base pairs has been mutation (which can be calculated). [21]
Whereas Y-DNA and mtDNA haplogroups represent only a small component of a person's DNA pool, autosomal DNA has the advantage of containing hundreds of thousands of examinable genetic loci, thus giving a more complete picture of genetic composition. Because autosomal DNA undergoes recombination, descent relationships can only be determined on a statistical basis. A single chromosome can record a history for each gene. Autosomal studies are much more reliable for showing the relationships between existing populations, but they do not offer the same possibilities for unraveling population histories that mtDNA and NRY DNA studies promise, despite the many complications of the latter. [citation needed]
Nuclear DNA analysis shows that Spanish and Portuguese populations are most closely related to other populations of western Europe. [23] [24] [25] There is an axis of significant genetic differentiation along the east–west direction, in contrast to remarkable genetic similarity in the north–south direction. North African admixture, associated with the Islamic conquest, can be dated to the period between c. AD 860–1120. [4]
A study published in 2019 using samples of 271 Iberians spanning prehistoric and historic times proposes the following inflexion points in Iberian genomic history: [5]
Studies of genome-wide variation, uniparental markers and ancient DNA agree that Iberians are primarily West European populations, but with a geographically structured layer of North African-related ancestry that is highest in the south and west of the Peninsula and very low or absent in the Basque Country and northeast. [26] Early work based on mtDNA and Y‑chromosome markers already pointed to elevated frequencies of typically Northwest African lineages (for example E‑M81, U6 and some L subclades) [27] in Iberia compared with most of Europe. More recent genome-wide studies using local ancestry and haplotype-based methods refine these observations and show that North African-related ancestry today ranges roughly between 0% and about 10–11% across mainland Iberia, with the highest values in southern Portugal and western Andalusia. [28]
The first large-scale autosomal analyses of African admixture in Europe (for example, Moorjani et al. 2011; Botigué et al. 2013) reported that Spain and Portugal have the highest African-related ancestry among mainland Europeans, and estimated total “African” ancestry in Iberia at approximately 2–4%, with North African-related ancestry between ~4% and 12% and “sub‑Saharan” ancestry below 1–3%, depending on the method. [30] However, these studies modelled African sources using present-day populations such as Mozabites and Yoruba as proxies, so that part of the ancestry now known to derive from an ancient North African lineage (Ancestral North African, ANA) carried by Iberomaurusian/Taforalt-related groups may have been misassigned as “sub‑Saharan”. [31] Later haplotype-based work dissecting North African gene flow into Iberia confirms that the main African contribution is Maghrebi and that the “sub‑Saharan” component detected in southwestern Europe largely arrived together with North African migrants rather than through a strong direct influx from sub‑Saharan Africa. [32]
Haplotype-sharing and identity‑by‑descent (IBD) analyses between Iberians and North Africans reveal at least two main episodes of North African gene flow into Iberia that correlate with historical Arab-Berber rule: an earlier, concentrated event affecting the northwest of the Peninsula and a more continuous and recent input in southern Iberia, with admixture dates broadly centred between the eighth and fifteenth centuries CE. [33] Maps of North African IBD show the highest sharing in southwestern Iberia and much lower values in the rest of Europe, whereas maps of “sub‑Saharan IBD” show only very weak signals in Iberia (and stronger ones in the Canary Islands), consistent with a scenario in which most African segments in Iberians derive predominantly from North African ancestors. [32] Because many of these analyses used Yoruba and related groups as technical proxies for deep African ancestry, the small “sub‑Saharan” component visualised in such maps should be interpreted cautiously, as it may partly reflect ancient ANA/Taforalt-related variation that has no exact modern counterpart. [34]
Y‑chromosome studies consistently find that Iberia has the highest frequencies of the typically Northwest African haplogroup E‑M81 in Europe, with averages around 3–7% and regional peaks of ~7–10%, alongside other E‑M78 subclades shared with North Africa. [35] These lineages reflect multiple episodes of male-mediated migration from the Maghreb, including prehistoric movements, Roman and Islamic periods, and later contacts, although they still represent a minority share of the Iberian paternal pool, which is dominated by R1b subclades of local West European origin. [36] Mitochondrial DNA surveys show comparatively high levels of the North African haplogroup U6 and of L lineages in western Andalusia and southern Portugal, with peninsula-wide mean frequencies around 3–4% for L and 4–6% for U6, again highest in the southwest. [37]
Current debates focus on whether U6 and L lineages in Iberia are mainly the result of Islamic‑period migrations and the Atlantic slave trade, or whether a substantial fraction arrived earlier together with prehistoric or early-historic North African populations. Phylogeographic analyses and complete mtDNA sequences indicate that many Iberian U6 lineages coalesce around 10 ka, and Hernández et al. (2015) concluded that “the estimated entrance of the North African U6 lineages into Iberia at 10 ky correlates well with other L African clades, indicating that some U6 and L lineages moved together from Africa to Iberia in the Early Holocene while a majority were introduced during historic times”. [38] Subsequent work on North African mitogenomes and ancient DNA suggests that a sizeable subset of Iberian L lineages are phylogenetically embedded within North‑West African clades that have deep Holocene roots and are associated with Iberomaurusian/ANA-related ancestry, whereas other L subclades with shallower coalescence times likely reflect later inputs during the Roman, Islamic and slave‑trade periods. [39]
The most comprehensive recent genomic studies therefore converge on a picture in which North African-related ancestry is detectable across most of the Iberian Peninsula, with values ranging from 0% to roughly 10–11% and highest levels in the south and west, while being almost absent in the Basque Country and parts of the northeast. [40] When models explicitly include ancient and modern North African sources and distinguish these from sub‑Saharan proxies, modern sub‑Saharan African admixture in mainland Iberia appears statistically almost nonexistent, whereas the bulk of the “African‑like” signal derives from ancient and historic North African (including ANA/Taforalt‑related) inputs. [41]
Recent ancient‑DNA work in North Africa has identified an Ancestral North African (ANA) component, carried by Iberomaurusian (Taforalt) and later Maghrebi populations, which includes a deep African branch often referred to as a “basal” or “ghost” lineage. [42] This ancestry forms its own clade and is not well represented by any present‑day Sub‑Saharan African population, although formal statistics sometimes find slightly closer affinities to West Africans than to other African groups; such “closer affinity” reflects shared deep ancestry rather than recent common origin. Earlier autosomal studies of Iberia that reported 1–3% “Sub‑Saharan African” ancestry used West African groups such as Yoruba as technical proxies for all non‑North‑African African ancestry, so part of the ANA‑related signal was likely misassigned as "Sub‑Saharan". [32] More recent modelling that separates explicit North African/ANA sources from Sub‑Saharan ones indicates that actual modern Sub‑Saharan African contribution to mainland Iberian populations is close to zero, whereas most of the “African‑like” component in Iberians derives from ancient and historic North African (ANA‑related) inputs that are only distantly related to present‑day West African groups.
As among other Western Europeans, Y-DNA haplogroup R1b is the most frequent among Spaniards and Portuguese, occurring at over 70% throughout most of Spain. [44] R1b is particularly dominant in the Basque Country and Catalonia, occurring at a rate of over 80%. In Iberia, most men with R1b belong to the subclade R-P312 (R1b1a1a2a1a2; as of 2017). The distribution of haplogroups other than R1b varies widely from one region to another.
In Portugal as a whole, R1b haplogroups account for 70%, with some areas in the northwestern regions reaching over 90%. [45]
Although R1b prevails in much of Western Europe, a key difference is found in the prevalence in Iberia of R-DF27 (R1b1a1a2a1a2a). This subclade is found in over 60% of the male population in the Basque Country and 40-48% in Madrid, Alicante, Barcelona, Cantabria, Andalusia, Asturias and Galicia. [46] R-DF27 constitutes much more than half of the total R1b in the Iberian Peninsula. Subsequent in-migration by members of other haplogroups and subclades of R1b did not affect its overall prevalence, although this falls to only two thirds of the total R1b in Valencia and the coast more generally. [44] R-DF27 is also a significant subclade of R1b in parts of France and Britain. R-S28/R-U152 (R1b1a1a2a1a2b) is the prevailing subclade of R1b in Northern Italy, Switzerland and parts of France, but it represents less than 5.0% of the male population in Iberia. Ancient samples from the central European Bell Beaker culture, Hallstatt culture and Tumulus culture belonged to this subclade. [47] [48] [49] R-S28/R-U152 is slightly significant in Seville, Barcelona, Portugal and the Basque Country at 10-20% of the total population, but it is represented at frequencies of only 3.0% in Cantabria and Santander, 2.0% in Castile and León, 6% in Valencia, and under 1% in Andalusia. [44]

Y-DNA haplogroup frequencies among Sephardic Jews: [50]

| Population | I1 | I2*/I2a | I2 | R1a | R1b | G | J2 | J*/J1 | E1b1b (E-M215) | T | Q |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Sephardic Jews | 0% | 1% | 0% | 5% | 13% | 15% | 25% | 22% | 9% | 6% | 2% |

Haplogroup J, mostly subclades of Haplogroup J-M172 (J2), is found at levels of over 20% in some regions, while Haplogroup E has a general frequency of about 10% – albeit with peaks surpassing 30% in certain areas. Overall, E-M78 (E1b1b1a1 in 2017) and E-M81 (E1b1b1b1a in 2017) both constitute about 4.0% each, with a further 1.0% from Haplogroup E-M123 (E1b1b1b2a1) and 1.0% from unknown subclades of E-M96. [43] (E-M81 is widely considered to represent relatively historical migrations from North Africa).
There have been a number of studies about the mitochondrial DNA haplogroups (mtDNA) in Europe. In contrast to Y DNA haplogroups, mtDNA haplogroups did not show as much geographical patterning, but were more evenly ubiquitous. Apart from the outlying Sami, all Europeans are characterized by the predominance of haplogroups H, U and T. The lack of observable geographic structuring of mtDNA may be due to socio-cultural factors, namely patrilocality and a lack of polyandry. [52]
The subhaplogroups H1 and H3 have been studied in more detail and are thought to be associated with the Magdalenian expansion from Iberia c. 13,000 years ago: [53]
A 2007 European-wide study including Spanish Basques and Valencian Spaniards found Iberian populations to cluster the furthest from other continental groups, implying that Iberia holds the most ancient European ancestry. In this study, the most prominent genetic stratification in Europe was found to run from the north to the south-east, while another important axis of differentiation runs east–west across the continent. It also found, despite the differences, that all Europeans are closely related. [57]
| Region | Sample size | C | E | G | I | J2 | JxJ2 | R1a | R1b | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| Aragon | 34 | | 6% | 0% | 18% | 12% | 0% | 3% | 56% | |
| Andalusia East | 95 | | 4% | 3% | 6% | 9% | 3% | 1% | 72% | |
| Andalusia West | 73 | | 15% | 4% | 5% | 14% | 1% | 4% | 54% | |
| Asturias | 20 | | 15% | 5% | 10% | 15% | 0% | 0% | 50% | |
| Basques | 116 | | 1% | 0% | 8% | 3% | 1% | 0% | 87% | |
| Castilla La Mancha | 63 | | 4% | 10% | 2% | 6% | 2% | 2% | 72% | |
| Castile North-East | 31 | | 9% | 3% | 3% | 3% | 0% | 0% | 77% | |
| Castile North-West | 100 | | 19% | 5% | 3% | 8% | 1% | 2% | 60% | |
| Catalonia | 80 | >0% [59] | 3% | 6% | 3% | 6% | 0% | 0% | 81% | |
| Extremadura | 52 | | 18% | 4% | 10% | 12% | 0% | 0% | 50% | |
| Galicia | 88 | | 17% | 6% | 10% | 7% | 1% | 0% | 57% | |
| Valencia | 73 | >0% [60] | 10% | 1% | 10% | 5% | 3% | 3% | 64% | |
| Majorca | 62 | | 9% | 6% | 8% | 8% | 2% | 0% | 66% | |
| Menorca | 37 | | 19% | 0% | 3% | 3% | 0% | 3% | 73% | |
| Ibiza | 54 | | 8% | 13% | 2% | 4% | 0% | 0% | 57% | |
| Seville | 155 | | 7% | 4% | 12% | 8% | 3% | 1% | 60% | |
| Huelva | 22 | | 14% | 0% | 9% | 14% | 0% | 0% | 59% | |
| Cadiz | 28 | | 4% | 0% | 14% | 14% | 4% | 0% | 51% | |
| Cordoba | 27 | | 11% | 0% | 15% | 15% | 0% | 0% | 56% | |
| Málaga | 26 | | 31% | 4% | 0% | 15% | 0% | 8% | 43% | |
| Leon | 60 | | 10% | 7% | 3% | 5% | 2% | 7% | 62% | |
| Cantabria | 70 | | 13% | 9% | 6% | 3% | 3% | 4% | 58% | |
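Because the regional sample sizes in the table above differ widely (from 20 to 155), a pooled frequency is better estimated by weighting each region by its sample size than by averaging the percentages directly. The following is a minimal sketch of that arithmetic for R1b. The three regions are an arbitrary illustrative selection from the table, and the function names are hypothetical.

```python
# Rows copied from the table above: (region, sample size, R1b frequency in %).
# The choice of these three regions is arbitrary and purely illustrative.
rows = [
    ("Basques", 116, 87),
    ("Galicia", 88, 57),
    ("Valencia", 73, 64),
]

def weighted_mean_frequency(rows):
    """Sample-size-weighted mean frequency (in %) across regions."""
    total_n = sum(n for _, n, _ in rows)
    if total_n == 0:
        raise ValueError("no samples to average")
    return sum(n * pct for _, n, pct in rows) / total_n

def unweighted_mean_frequency(rows):
    """Plain mean of the regional percentages, ignoring sample size."""
    return sum(pct for _, _, pct in rows) / len(rows)

pooled_r1b = weighted_mean_frequency(rows)
simple_r1b = unweighted_mean_frequency(rows)
print(f"weighted: {pooled_r1b:.2f}%  unweighted: {simple_r1b:.2f}%")
```

The weighted figure is pulled toward the Basque value because that sample is the largest of the three. Neither number is a population estimate, since the samples were not drawn in proportion to regional population sizes.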
Excerpts from the Abstract of a study published [61] in 2015:
"[...] In the case of Portugal, previous population genetics studies have already revealed the general portrait of HVS-I and HVS-II mitochondrial diversity, becoming now important to update and expand the mitochondrial region analysed. Accordingly, a total of 292 complete control region sequences from continental Portugal were obtained, under a stringent experimental design to ensure the quality of data through double sequencing of each target region.* Furthermore, H-specific coding region SNPs were examined to detail haplogroup classification and complete mitogenomes were obtained for all sequences belonging to haplogroups U4 and U5. In general, a typical Western European haplogroup or Atlantic modal haplotype (AMH) composition was found in mainland Portugal, associated to high level of mitochondrial genetic diversity. Within the country, no signs of substructure were detected. The typing of extra coding region SNPs has provided the refinement or confirmation of the previous classification obtained with EMMA tool in 96% of the cases. Finally, it was also possible to enlarge haplogroup U phylogeny with 28 new U4 and U5 mitogenomes."
The AMH reaches the highest frequencies in the Iberian Peninsula, in Great Britain and Ireland. In the Iberian Peninsula it reaches 70% in Portugal as a whole, with more than 90% in NW Portugal and nearly 90% in Galicia (NW Spain), while the highest value is to be found among the Basques (NE Spain).
The Atlantic modal haplotype (AMH) or haplotype 15 is a Y chromosome haplotype of Y-STR microsatellite variations, associated with the Haplogroup R1b. It was discovered prior to many of the SNPs now used to identify subclades of R1b and references to it can be found in some of the older literature. It corresponds most closely with subclade R1b1a2a1a(1) [L11].
The AMH is the most frequently occurring haplotype amongst human males in Atlantic Europe. It is characterized by the following marker alleles: