Allele frequency


Allele frequency, or gene frequency, is the relative frequency of an allele (variant of a gene) at a particular locus in a population, expressed as a fraction or percentage. [1] Specifically, it is the fraction of all chromosome copies in the population or sample that carry that allele. Microevolution is the change in allele frequencies that occurs over time within a population.


Given the following:

  1. A particular locus on a chromosome and a given allele at that locus
  2. A population of N individuals with ploidy n, i.e. an individual carries n copies of each chromosome in their somatic cells (e.g. two copies of each chromosome in the cells of diploid species)
  3. The allele exists in i chromosomes in the population

then the allele frequency is the ratio of the number of occurrences i of that allele to the total number of chromosome copies across the population, i/(nN).
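The definition i/(nN) can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `allele_frequency`, the tuple representation of a genotype, and the tetraploid sample are all assumptions made for the example.

```python
def allele_frequency(individuals, allele):
    """Return i / (n * N) for `allele`.

    `individuals` is a list of genotypes; each genotype is a tuple holding
    the n allele copies that one individual carries at the locus.
    (Hypothetical representation chosen for this sketch.)
    """
    if not individuals:
        raise ValueError("need at least one individual")
    ploidy = len(individuals[0])                      # n
    if any(len(g) != ploidy for g in individuals):
        raise ValueError("all individuals must have the same ploidy")
    i = sum(g.count(allele) for g in individuals)     # occurrences of the allele
    return i / (ploidy * len(individuals))            # i / (nN)


# Hypothetical sample: N = 4 tetraploid individuals (n = 4), so nN = 16 copies.
sample = [
    ("A", "A", "B", "B"),
    ("A", "B", "B", "B"),
    ("A", "A", "A", "A"),
    ("B", "B", "B", "B"),
]
freq_a = allele_frequency(sample, "A")  # 7 copies of A out of 16
```

Counting copies rather than individuals is what distinguishes the allele frequency from the genotype frequency.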

The allele frequency is distinct from the genotype frequency, although they are related, and allele frequencies can be calculated from genotype frequencies. [1]

In population genetics, allele frequencies are used to describe the amount of variation at a particular locus or across multiple loci. When considering the ensemble of allele frequencies for many distinct loci, their distribution is called the allele frequency spectrum.

Calculation of allele frequencies from genotype frequencies

The actual frequency calculations depend on the ploidy of the species for autosomal genes.

Monoploids

The frequency (p) of an allele A is the ratio of the number of copies (i) of the A allele to the population or sample size (N), so p = i/N.

Diploids

If f(AA), f(AB), and f(BB) are the frequencies of the three genotypes at a locus with two alleles, then the frequency p of the A-allele and the frequency q of the B-allele in the population are obtained by counting alleles. [2]

  p = f(AA) + (1/2) f(AB)
  q = f(BB) + (1/2) f(AB)

Because p and q are the frequencies of the only two alleles present at that locus, they must sum to 1. To check this:

  p + q = f(AA) + f(AB) + f(BB) = 1

and therefore

  q = 1 - p and p = 1 - q

If there are more than two different allelic forms, the frequency for each allele is simply the frequency of its homozygote plus half the sum of the frequencies for all the heterozygotes in which it appears.
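The rule "homozygote frequency plus half of each heterozygote frequency" can be sketched for any number of alleles at a diploid locus. The function name, the dictionary layout, and the three-allele genotype frequencies below are hypothetical values chosen for illustration, not data from the article.

```python
from collections import defaultdict
from fractions import Fraction


def allele_freqs_from_genotype_freqs(genotype_freqs):
    """Map each allele to its frequency at a diploid locus.

    `genotype_freqs` maps an unordered allele pair such as ("A", "B")
    to that genotype's frequency (hypothetical layout for this sketch).
    """
    if sum(genotype_freqs.values()) != 1:
        raise ValueError("genotype frequencies must sum to 1")
    freqs = defaultdict(Fraction)  # missing alleles start at Fraction(0)
    for (a1, a2), f in genotype_freqs.items():
        # Each of the two allele copies carries half the genotype's frequency,
        # so a homozygote contributes f and a heterozygote f/2 to each allele.
        freqs[a1] += f / 2
        freqs[a2] += f / 2
    return dict(freqs)


# Hypothetical genotype frequencies for three alleles A, B, C.
genotypes = {
    ("A", "A"): Fraction(1, 4),
    ("A", "B"): Fraction(1, 4),
    ("A", "C"): Fraction(1, 8),
    ("B", "B"): Fraction(1, 8),
    ("B", "C"): Fraction(1, 8),
    ("C", "C"): Fraction(1, 8),
}
freqs = allele_freqs_from_genotype_freqs(genotypes)
# A: 1/4 + (1/4 + 1/8)/2 = 7/16, B: 5/16, C: 1/4
```

Exact fractions are used so that the check that all frequencies sum to 1 is not affected by floating-point rounding.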

(For 3 alleles see Allele § Genotype frequencies)

Allele frequency can always be calculated from genotype frequency, whereas the reverse requires that the Hardy–Weinberg conditions of random mating apply.

Example

Consider a locus that carries two alleles, A and B. In a diploid population there are three possible genotypes: two homozygous genotypes (AA and BB) and one heterozygous genotype (AB). If we sample 10 individuals from the population and observe the genotype counts

  1. count (AA) = 6
  2. count (AB) = 3
  3. count (BB) = 1

then there are 6 × 2 + 3 = 15 observed copies of the A allele and 1 × 2 + 3 = 5 of the B allele, out of 20 total chromosome copies. The frequency p of the A allele is p = 15/20 = 0.75, and the frequency q of the B allele is q = 5/20 = 0.25.
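The arithmetic of this example can be checked with a short script. The variable and function names are illustrative; the genotype counts (6 AA, 3 AB, 1 BB) are the ones given in the example.

```python
from fractions import Fraction

# Genotype counts from the example: 10 sampled diploid individuals.
genotype_counts = {"AA": 6, "AB": 3, "BB": 1}


def count_allele(counts, allele):
    """Total copies of `allele`: copies per genotype times number of carriers."""
    return sum(genotype.count(allele) * k for genotype, k in counts.items())


total_copies = 2 * sum(genotype_counts.values())  # 2 copies x 10 individuals = 20
copies_a = count_allele(genotype_counts, "A")     # 2*6 + 1*3 = 15
copies_b = count_allele(genotype_counts, "B")     # 1*3 + 2*1 = 5

p = Fraction(copies_a, total_copies)              # 15/20 = 3/4
q = Fraction(copies_b, total_copies)              # 5/20  = 1/4
```

Here `float(p)` is 0.75 and `float(q)` is 0.25, and `p + q` equals 1 as expected for a locus with two alleles.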

Dynamics

Population genetics describes the genetic composition of a population, including allele frequencies, and how allele frequencies are expected to change over time. The Hardy–Weinberg law describes the expected equilibrium genotype frequencies in a diploid population after random mating. Random mating alone does not change allele frequencies, and the Hardy–Weinberg equilibrium assumes an infinite population size and a selectively neutral locus. [1]
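As a sketch of the Hardy–Weinberg statement for a locus with two alleles, the expected genotype frequencies after random mating are p², 2pq, and q², and recomputing the allele frequency from them returns the original p. The function names are hypothetical, and the value p = 3/4 is simply borrowed from the example above.

```python
from fractions import Fraction


def hardy_weinberg(p):
    """Expected genotype frequencies for alleles A (frequency p) and B (1 - p)."""
    q = 1 - p
    return {"AA": p * p, "AB": 2 * p * q, "BB": q * q}


def freq_a(genotype_freqs):
    """Allele frequency of A: homozygote plus half the heterozygote frequency."""
    return genotype_freqs["AA"] + genotype_freqs["AB"] / 2


p = Fraction(3, 4)
expected = hardy_weinberg(p)   # AA: 9/16, AB: 3/8, BB: 1/16
p_next = freq_a(expected)      # 9/16 + 3/16 = 3/4, unchanged by random mating
```

The round trip from p to genotype frequencies and back illustrates why random mating alone leaves allele frequencies unchanged under the stated assumptions.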

In natural populations, natural selection (an adaptation mechanism), gene flow, and mutation combine to change allele frequencies across generations. Genetic drift changes allele frequencies through random sampling, which arises from variance in offspring number in a finite population; small populations experience larger per-generation fluctuations in frequency than large populations. It has also been proposed that a second adaptation mechanism exists, niche construction. [3] According to the extended evolutionary synthesis, adaptation occurs through natural selection, environmental induction, non-genetic inheritance, learning, and cultural transmission. [4] An allele at a particular locus may also confer a fitness effect on an individual carrying it, on which natural selection acts. Beneficial alleles tend to increase in frequency, while deleterious alleles tend to decrease in frequency. Even when an allele is selectively neutral, selection acting on nearby genes may change its frequency through hitchhiking or background selection.
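The effect of population size on drift can be explored with a small simulation. The sketch below uses the neutral Wright–Fisher model, a common idealization that the text above does not name: each generation, all 2N allele copies are drawn independently from the previous generation's frequency. The function name, parameters, seed, and population sizes are assumptions made for illustration.

```python
import random


def wright_fisher(p0, pop_size, generations, rng):
    """Simulate neutral drift at one locus in a diploid population.

    Returns the allele-frequency trajectory, starting with p0.
    Assumes non-overlapping generations, random mating, no selection,
    no mutation, and no migration (Wright-Fisher idealization).
    """
    copies = 2 * pop_size                 # 2N chromosome copies
    p = p0
    trajectory = [p]
    for _generation in range(generations):
        # Binomial sampling: each copy is the focal allele with probability p.
        count = sum(1 for _copy in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory


rng = random.Random(0)                    # fixed seed so runs are repeatable
small = wright_fisher(0.5, pop_size=10, generations=50, rng=rng)
large = wright_fisher(0.5, pop_size=1000, generations=50, rng=rng)

# Largest single-generation change in each trajectory.
max_step_small = max(abs(b - a) for a, b in zip(small, small[1:]))
max_step_large = max(abs(b - a) for a, b in zip(large, large[1:]))
```

Comparing `max_step_small` with `max_step_large` over several seeds gives a feel for how much more strongly frequencies fluctuate in small populations. Once a trajectory reaches 0 or 1 it stays there, which corresponds to loss or fixation of the allele.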

While heterozygosity at a given locus decreases over time as alleles become fixed or lost in the population, variation is maintained in the population through new mutations and gene flow due to migration between populations. For details, see population genetics.


References

  1. Gillespie, John H. (2004). Population genetics: a concise guide (2nd ed.). Baltimore, Md.: The Johns Hopkins University Press. ISBN 978-0801880087.
  2. "Population and Evolutionary Genetics". ndsu.edu.
  3. Scott-Phillips, T. C.; Laland, K. N.; Shuker, D. M.; Dickins, T. E.; West, S. A. (2014). "The Niche Construction Perspective: A Critical Appraisal". Evolution. 68 (5): 1231–1243. doi:10.1111/evo.12332. PMC 4261998. PMID 24325256.
  4. Laland, K. N.; Uller, T.; Feldman, M. W.; Sterelny, K.; Müller, G. B.; Moczek, A.; Jablonka, E.; Odling-Smee, J. (Aug 2015). "The extended evolutionary synthesis: its structure, assumptions and predictions". Proc Biol Sci. 282 (1813): 20151019. doi:10.1098/rspb.2015.1019. PMC 4632619. PMID 26246559.

Cheung, KH; Osier MV; Kidd JR; Pakstis AJ; Miller PL; Kidd KK (2000). "ALFRED: an allele frequency database for diverse populations and DNA polymorphisms". Nucleic Acids Research. 28 (1): 361–3. doi:10.1093/nar/28.1.361. PMC 102486. PMID 10592274.

Middleton, D; Menchaca L; Rood H; Komerofsky R (2002). "New allele frequency database: www.allelefrequencies.net". Tissue Antigens. 61 (5): 403–7. doi:10.1034/j.1399-0039.2003.00062.x. PMID 12753660.