The study of the genetics and archaeogenetics of the Gujarati people of India aims to uncover their genetic history. According to the 1000 Genomes Project, "Gujarati" is a general term for people who trace their ancestry to the region of Gujarat, in the northwestern part of the Indian subcontinent, and who speak Gujarati, an Indo-European language. [1] They share some genetic traits with other ethnic groups of India and differ from them in others.
Our analysis of haplotype similarity at the SLC24A5 region [which affects skin pigmentation] between the Gujarati Indians also indicated greater degree of sharing with the southern Europeans than with the south Tamil Indians.
One of the GIH subgroups fall outside the main gradient of Indian groups, suggesting that they harbor substantial ancestry that is not a simple mixture of ASI and ANI. A speculative hypothesis is that some Gujarati groups descend from the founders of the “Gurjara Pratihara” empire, which is thought to have been founded by Central Asian invaders in the 7th century A.D. and to have ruled parts of northwest India from the 7-12th centuries. I. Karve noted that endogamous groups with names like “Gurjar” are now distributed throughout the northwest of the subcontinent, and hypothesized that they likely trace their names to this invading group.
The 1000 Genomes Project collected 117 samples from unrelated Gujarati people living in Houston, Texas (abbreviated GIH). [1] These samples were analyzed in a 2016 study by South African and Indian geneticists (Sengupta et al., 2016), who divided them into two subgroups and used admixture analysis to estimate the proportions of the inferred ancestral components in each subgroup. Based on genetic and geographical affinities, supplementary table 1 of this study places Gujaratis and Gujarati Brahmins in the northwest of the Indian subcontinent. [12]
Populations | Ancestral North Indian (ANI) | Ancestral South Indian (ASI) | Ancestral Austroasiatic (AAA) | Ancestral Tibeto-Burman (ATB) | Andaman archipelago (A&N) |
---|---|---|---|---|---|
Gujaratis (1) | 76.2% | 9.4% | 13.8% | 0.3% | 0.2% |
Gujaratis (2) | 80.7% | 7.1% | 8.5% | 3.1% | 0.5% |
A separate study by geneticists from West Bengal (Basu et al., 2016) estimated the following ancestral proportions for Gujarati Brahmins, and classified them in northwest India: [13]
Population | ANI | ASI | AAA | ATB |
---|---|---|---|---|
Gujarati Brahmins | 88.23% | 7.59% | 4.12% | 0.06% |
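As a quick consistency check on the two tables above, the estimated ancestral proportions in each row should sum to roughly 100%, with any shortfall attributable to rounding in the published figures. The sketch below only re-adds the values quoted in the tables; the names `rows` and `row_total` are illustrative and do not come from either study.

```python
# Ancestry proportions (%) copied from the two tables above.
# Column order: ANI, ASI, AAA, ATB, and (first two rows only) A&N.
rows = {
    "Gujaratis (1)": [76.2, 9.4, 13.8, 0.3, 0.2],
    "Gujaratis (2)": [80.7, 7.1, 8.5, 3.1, 0.5],
    "Gujarati Brahmins": [88.23, 7.59, 4.12, 0.06],
}


def row_total(values):
    """Sum one row of percentages, rounded to two decimals to hide float noise."""
    return round(sum(values), 2)


totals = {name: row_total(values) for name, values in rows.items()}

for name, total in totals.items():
    print(f"{name}: {total:.2f}%")
```

Both GIH subgroups total 99.9% and the Gujarati Brahmin row totals 100.00%, so the tables are internally consistent up to rounding.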
Most South Asians carry both the Ancestral North Indian (ANI) component, which is closely related to populations of Central Asia, West Asia and Europe, and the Ancestral South Indian (ASI) component, which is restricted to South Asia. [8] [14]
Over 33% of the mitochondrial lineages in the population of Gujarat are of West Eurasian origin. The mitochondrial DNA (mtDNA) sub-haplogroup U7 is common in Gujarat. It is found in over 12% of the population, a higher frequency than in Punjab (9%), Pakistan, Iran, Afghanistan, or anywhere else. In other populations of India, U7 occurs only at frequencies between 0% and 0.9%. The study by Quintana-Murci et al. (2004) of 34 Gujaratis found the sub-haplogroups U2a and U7 (8.8%), followed by U2b (5.9%) and U2c (2.9%). [15] [16] [17]
The same study found R* (8.8%), H (5.9%) and J1 (2.9%) in Gujaratis. [16] [17]
It also found haplogroup W (8.8%) in Gujaratis, as well as in other northwestern regions such as Punjab and Kashmir. Haplogroup W descends from haplogroup N2. N* (2.9%) was also found in Gujaratis. [16] [18] [17]
The mtDNA haplogroup HV, which is also found in some Tamil Brahmins, has been found in at least one Gujarati Patel family. [18]
Some Gujarati males carry the Y-chromosome DNA (Y-DNA) haplogroup R1a, specifically R-L657 (the L657 subclade). Others carry R-Y874 (the Y874 subclade of R1a1a1), which is also found in Telugu people. Another common finding is haplogroup J2b2 (J-M241). The SNP mutation referred to as Y951 is found in Gujarati people as well as in Punjabis. Haplogroup C-M356 (C1b, previously called C5a, within which C-P92 is a subclade defined by the P92 mutation) is also found among some Gujarati people. [18] [19]
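With only 34 individuals sampled, each person contributes about 2.9 percentage points, which is why the reported frequencies fall on just a few values. The sketch below recovers the head counts implied by the percentages. It assumes that every figure refers to the same 34-person sample and that the 8.8% quoted for U2a and U7 applies to each separately; neither assumption is confirmed by the text, and the function name `implied_count` is illustrative.

```python
# Assumption: all percentages come from the same sample of 34 Gujaratis.
SAMPLE_SIZE = 34

# Reported mtDNA haplogroup frequencies (%), as quoted in the text.
# Assumption: the 8.8% given for "U2a and U7" applies to each of them.
reported = {
    "U2a": 8.8, "U7": 8.8, "U2b": 5.9, "U2c": 2.9,
    "R*": 8.8, "H": 5.9, "J1": 2.9,
    "W": 8.8, "N*": 2.9,
}


def implied_count(percent, n):
    """Nearest whole number of individuals corresponding to a percentage of n."""
    return round(percent / 100 * n)


counts = {hg: implied_count(pct, SAMPLE_SIZE) for hg, pct in reported.items()}

for hg, count in counts.items():
    # Convert the count back to a percentage to confirm it matches the report.
    back = round(100 * count / SAMPLE_SIZE, 1)
    print(f"{hg}: {count} of {SAMPLE_SIZE} ({back}%)")
```

Under these assumptions, 8.8%, 5.9% and 2.9% correspond to three, two and one individuals respectively, so small differences between these frequencies carry little statistical weight.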
A set of 48 bi-allelic markers on the non-recombining region of Y chromosome (NRY) were analysed in 284 males; representing nine Indo-European speaking tribal populations of South Gujarat. The genetic structure of the populations revealed that none of these groups was overtly admixed or completely isolated. However, elevated haplogroup diversity and FST value point towards greater diversity and differentiation which suggests the possibility of early demographic expansion of the study groups. The phylogenetic analysis revealed 13 paternal lineages, of which six haplogroups: C5, H1a*, H2, J2, R1a1* and R2 accounted for a major portion of the Y chromosome diversity. The higher frequency of the six haplogroups and the pattern of clustering in the populations indicated overlapping of haplogroups with West and Central Asian populations. Other analyses undertaken on the population affiliations revealed that the Indo-European speaking populations along with the Dravidian speaking groups of southern India have an influence on the tribal groups of Gujarat. The vital role of geography in determining the distribution of Y lineages was also noticed. This implies that although language plays a vital role in determining the distribution of Y lineages, the present day linguistic affiliation of any population in India for reconstructing the demographic history of the country should be considered with caution. [20]
Following the 2010 discovery that the oldest strain of the R1a1a branch was concentrated in the Gujarat-Sindh-Western Rajasthan region, it has been suggested that this location may have been the origin of this genetic group. The genetic similarities between North Indians and Eastern Europeans are possibly due to common ancestry. R1b, the most common haplogroup in Western Europe, has a relatively low concentration in India and possibly originated in the Persian Gulf. It is possible that these haplogroups originated from two major genetic dispersals from the Persian Gulf-Makran-Gujarat region. [21]
The pericentric inversion of the Y chromosome (inv(Y)) is a rare, heritable chromosomal heteromorphism with no clinical significance or reproductive disadvantage. [22] It has been studied in Gujarati Muslims of South Africa by comparing 8 men carrying inv(Y) with 9 men with a normal Y chromosome. The p49a/TaqI and p49a/PvuII haplotypes were determined: all 8 inv(Y) men had identical TaqI and PvuII haplotypes, whereas 7 different TaqI and 8 different PvuII haplotypes were observed among the 9 normal men. The study concluded that inv(Y) has a common genetic origin in the Gujarati Muslims of South Africa. The origin was traced to Kholvad, a small village near Surat, and some neighbouring villages. The inversion probably spread through random genetic drift in a reproductively isolated community maintained by strict endogamy based on religious and linguistic affiliation. [23] [24]
Vitiligo is a long-term skin condition characterized by patches of skin losing their pigment. It affects 1–2% of the world population and 0.5–2.5% of the population of India, but Gujarat and Rajasthan have the highest prevalence (~8.8%). [25] Rasheedunnisa Begum et al. conducted genetic studies of over 1500 patients from Gujarat and found an SNP variant in the autosomal DNA of Gujaratis that makes them more prone to vitiligo. [26] [27]
A study suggested that the Indian deletion type of β° thalassaemia (a 600 bp deletion involving the 3′ end) has a single origin, as all 23 patients with it in the study had identical haplotypes and came from either Sindh or Gujarat. [28]
The Leicestershire Perinatal Mortality Survey for the years 1976 to 1982 found a high incidence of Meckel syndrome in Gujarati immigrants. [29]
In India, caste is a form of social stratification characterized by endogamy. Within a caste there are endogamous groups, known among Gujarati people as gol (marriage circles). Each gol contains a small number of exogamous lineages known as gotra. A person cannot marry within their gotra or outside their gol. Gujarati Patels practice this "exogamic endogamy", which is also found elsewhere in India. One such gol, the Chha Gaam Patels (CGP), includes people from six villages of the Charotar region of Gujarat. A study found genetic similarities among them and confirmed the group's patrilocal and patrilineal practices, as well as a low level of female gene flow into the group. The study illustrates the impact of cultural marriage practices on genetic variation in the Indian population. [30]
The Parsis show a sharp contrast between genetic data obtained from mitochondrial DNA (mtDNA), the maternal component, and Y-chromosome DNA (Y-DNA), the paternal component. In their Y-DNA they resemble the Iranian population, which supports their historical origin in Iran. But about 60% of their maternal gene pool consists of South Asian haplogroups, compared with just 7% in Iranians. Parsis have a high frequency of mtDNA haplogroup M (55%), similar to Indians, compared with just 1.7% in the combined Iranian sample. Given the high diversity of their Y-DNA and mtDNA lineages, a strong drift effect is unlikely despite their small population. The studies suggest a male-mediated migration of Parsi ancestors from Iran to Gujarat, where they admixed with the local female population, ultimately resulting in the loss of Iranian mtDNA. [17]
The Dawoodi Bohras of Gujarat show a 47% genetic contribution from West Asia, especially Iran, followed by 30% from Arabia and 23% from the closest Hindu parental populations. This indicates considerable gene flow from West Asia. [31]
Four tribal populations (Chaudhari, Vasava, Kotwalia and Gamit) of the Surat district in South Gujarat were studied for the distribution of 22 polymorphic blood systems. The studies suggested that the groups are considerably heterogeneous genetically, with only small genetic differences among them. They also showed that the Vasava and Kotwalia are closely related genetically, and that the Chaudhari and Gamit differ from them. [32]
According to a 2016 study, "Population Stratification and Underrepresentation of Indian Subcontinent Genetic Diversity in the 1000 Genomes Project Dataset", Gujaratis (the GIH samples in the 1000 Genomes Project) are closer to the PJL (Punjabis from Lahore) samples (Fst=0.0036) and the ITU (Telugu) samples (Fst=0.0039) than to the STU (Sri Lankan Tamil) samples (Fst=0.0044) and the BEB (Bengali) samples (Fst=0.0045). [12]
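For readers unfamiliar with the statistic, Fst measures how much of the total genetic variation is explained by differences between populations: 0 means identical allele frequencies and 1 means complete differentiation. The sketch below illustrates the idea for a single biallelic locus using Wright's textbook definition, Fst = (H_T - H_S) / H_T, with two equally weighted populations. It is not the estimator used in the cited study, which worked from genome-wide data, and the allele frequencies in the usage lines are hypothetical.

```python
def wright_fst(p1: float, p2: float) -> float:
    """Wright's Fst at one biallelic locus for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if both populations were pooled into one.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within the two populations.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        # The allele is fixed or absent in both populations: no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies, for illustration only.
print(wright_fst(0.50, 0.50))  # identical populations
print(wright_fst(0.40, 0.60))  # moderately different frequencies
print(wright_fst(0.00, 1.00))  # fixed for opposite alleles
```

On this scale, the pairwise values of 0.0036 to 0.0045 reported for GIH indicate very small differences between the sampled South Asian populations, so the ranking reflects fine-scale structure rather than deep divergence.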