STR analysis

Short tandem repeat (STR) analysis on a simplified model using polymerase chain reaction (PCR): first, a DNA sample undergoes PCR with primers targeting certain STRs (which vary in length between individuals and between their alleles). The resulting fragments are then separated by size (for example, by electrophoresis).
A partial human STR profile obtained using the Applied Biosystems Identifiler kit

Short tandem repeat (STR) analysis is a common molecular biology method used to compare allele repeats at specific loci in DNA between two or more samples. A short tandem repeat is a microsatellite with repeat units that are 2 to 7 base pairs in length; the number of repeats varies among individuals, which makes STRs effective for human identification. [2] The method differs from restriction fragment length polymorphism (RFLP) analysis in that STR analysis does not cut the DNA with restriction enzymes. Instead, polymerase chain reaction (PCR) is used to amplify each repeat region, and the number of repeats is inferred from the length of the PCR product.
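The relationship between PCR product length and repeat number can be illustrated with a minimal Python sketch. It assumes a simplified model in which the product consists of a fixed amount of flanking sequence (including the primers) plus a whole or partial number of repeat units; the function name `repeat_count` and the flank length, unit length, and fragment sizes are hypothetical values chosen for illustration, not measurements from any real kit.

```python
def repeat_count(amplicon_bp: int, flank_bp: int, unit_bp: int) -> tuple[int, int]:
    """Return (complete repeats, leftover bases) for one PCR product.

    amplicon_bp: measured length of the PCR product in base pairs
    flank_bp:    combined length of the non-repeat sequence between the
                 primers, primers included (assumed known for the locus)
    unit_bp:     length of one repeat unit (2 to 7 bp for an STR)
    """
    if unit_bp <= 0:
        raise ValueError("repeat unit length must be positive")
    repeat_region = amplicon_bp - flank_bp
    if repeat_region < 0:
        raise ValueError("product is shorter than the flanking sequence")
    # divmod splits the repeat region into whole units and a remainder.
    return divmod(repeat_region, unit_bp)


# Hypothetical locus: 100 bp of flanking sequence, 4 bp repeat unit.
print(repeat_count(140, 100, 4))  # (10, 0): ten complete repeats
print(repeat_count(143, 100, 4))  # (10, 3): ten repeats plus 3 extra bases
```

A non-zero remainder corresponds to an incomplete repeat, which is why some allele names carry a decimal part (an allele with ten complete repeats and three extra bases would be written 10.3).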

Forensic uses

STR analysis is a tool in forensic analysis that evaluates specific STR regions found on nuclear DNA. The variable (polymorphic) nature of the STR regions analyzed in forensic testing increases the discrimination between one DNA profile and another. [3] Tools such as the FBI-approved STRmix software build on this technique. [4] [5] Forensic science takes advantage of the population's variability in STR lengths, which enables scientists to distinguish one DNA sample from another. The system of DNA profiling used today is based on PCR and uses simple sequences [6] or short tandem repeats (STRs). This method uses highly polymorphic regions that contain short repeated sequences of DNA (most commonly 4 bases repeated, although other lengths, including 3 and 5 bases, are also in use). Because unrelated people almost certainly have different numbers of repeat units, STRs can be used to discriminate between unrelated individuals. These STR loci (locations on a chromosome) are targeted with sequence-specific primers and amplified using PCR. The resulting DNA fragments are then separated and detected using electrophoresis. Two methods of separation and detection are common: capillary electrophoresis (CE) and gel electrophoresis.

Each STR is polymorphic, but the number of alleles at any one locus is small: typically, each STR allele is shared by around 5–20% of individuals. The power of STR analysis comes from examining multiple STR loci simultaneously. [6] The resulting pattern of alleles can identify an individual quite accurately, and the more STR regions that are tested, the more discriminating the test becomes. [6] A small number of loci gives limited resolution: in one study of human genetic variation, given 10 loci, a pair of individuals from two different populations was genetically more similar than a pair from the same population about 30% of the time, or nearly one third. [7]

Different STR-based DNA-profiling systems are in use from country to country. In North America, systems that amplify the 13 CODIS core loci are almost universal, whereas the United Kingdom uses the 17-locus DNA-17 system (which is compatible with the National DNA Database). Whichever system is used, many of the STR regions tested are the same. These DNA-profiling systems are based on multiplex reactions, in which many STR regions are tested at the same time.

The true power of STR analysis lies in its statistical power of discrimination. Because the 13 loci currently used for discrimination in CODIS assort independently (having a certain number of repeats at one locus does not change the likelihood of having any number of repeats at any other locus), the product rule for probabilities can be applied. This means that if someone has the DNA type ABC, where the three loci are independent, the probability of having that DNA type is the probability of having type A times the probability of having type B times the probability of having type C. This has made it possible to generate match probabilities of 1 in a quintillion (1 × 10^18) or more. However, DNA database searches have found false DNA profile matches much more often than expected. [8] Moreover, because there are about 12 million monozygotic twins on Earth, who share the same DNA profile, the theoretical probability understates how often two people actually match.
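The product rule can be sketched in a few lines of Python. This is a minimal illustration that assumes the per-locus probabilities are already known and that the loci are independent; the function name `match_probability` and all of the probability values are hypothetical, and exact fractions are used only so that the printed results are not blurred by floating-point rounding.

```python
from fractions import Fraction
from math import prod


def match_probability(locus_probabilities):
    """Multiply per-locus probabilities, assuming the loci are independent."""
    probs = list(locus_probabilities)
    if not probs:
        raise ValueError("at least one locus is required")
    for p in probs:
        if not 0 <= p <= 1:
            raise ValueError(f"not a probability: {p}")
    return prod(probs)


# Hypothetical DNA type "ABC": types A, B and C are carried by
# 1 in 10, 1 in 20 and 1 in 5 people respectively.
abc = [Fraction(1, 10), Fraction(1, 20), Fraction(1, 5)]
print(match_probability(abc))  # 1/1000

# Thirteen hypothetical loci, each with a type carried by 1 in 10 people.
print(match_probability([Fraction(1, 10)] * 13))  # 1/10000000000000
```

Each added locus multiplies the result by another factor below one, which is how a panel of individually common types can produce a very small combined probability. The calculation is only as good as the independence assumption, which does not hold for close relatives.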

In practice, the risk of a match caused by contamination is much greater than the risk of matching a distant relative. Contamination can come from nearby objects or from left-over cells transferred from a prior test. The risk is greatest for the person who appears most often in the samples: everything collected from, or in contact with, a victim is a major source of contamination for any other samples brought into a lab. For that reason, multiple control samples, prepared during the same period as the actual test samples, are typically tested to verify that they stayed clean. Unexpected matches (or variations) in several control samples indicate a high probability of contamination in the actual test samples. In a relationship test, the full DNA profiles should differ (except for twins), to prove that a person was not matched as being related to their own DNA in another sample. [citation needed]

In biomedical research, STR profiles are used to authenticate cell lines. [9] Self-generated STR profiles can be compared with databases such as CLASTR (https://www.cellosaurus.org/cellosaurus-str-search/) or STRBase (https://strbase.nist.gov/). In addition, self-generated primary murine cell lines cultured before the first passaging can be matched with later passages, thus ensuring the identity of the cell line.
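A comparison between two STR profiles, such as an early and a later passage of the same cell line, might be sketched as follows. The score used here is a generic shared-allele (Dice-style) similarity chosen for illustration; it is not a description of how CLASTR or STRBase score matches. The function name `shared_allele_score`, the locus names, and the allele values are illustrative assumptions.

```python
# A profile maps each locus name to the set of alleles observed there.
Profile = dict[str, set[str]]


def shared_allele_score(query: Profile, reference: Profile) -> float:
    """Return 2 * shared alleles / total alleles over the loci typed in both.

    1.0 means every allele matches; 0.0 means no allele is shared
    (or no locus was typed in both profiles).
    """
    common_loci = query.keys() & reference.keys()
    shared = sum(len(query[locus] & reference[locus]) for locus in common_loci)
    total = sum(len(query[locus]) + len(reference[locus]) for locus in common_loci)
    return 2 * shared / total if total else 0.0


# Made-up example: the later passage has lost one allele at TH01.
early_passage = {"D5S818": {"11", "12"}, "TH01": {"6", "9.3"}, "TPOX": {"8"}}
late_passage = {"D5S818": {"11", "12"}, "TH01": {"6"}, "TPOX": {"8"}}

print(shared_allele_score(early_passage, early_passage))  # 1.0
print(round(shared_allele_score(early_passage, late_passage), 3))  # 0.889
```

Restricting the comparison to loci typed in both profiles keeps a missing locus from being counted as a mismatch. Any threshold for declaring two profiles "the same line" would have to come from the relevant authentication guidelines and is deliberately left out here.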

References

  1. Image by Mikael Häggström, MD, using the following source image: Figure 1 (available via license: Creative Commons Attribution 4.0 International), from the following article:
    Sitnik, Roberta; Torres, Margareth Afonso; Bacal, Nydia Strachman; Pinho, João Renato Rebello (2006). "Using PCR for molecular monitoring of post-transplantation chimerism". Einstein (Sao Paulo). 4 (2).
  2. Butler, John M. (4 August 2011). Advanced Topics in Forensic DNA Typing: Methodology. San Diego: Elsevier Academic Press. pp. 99–100. ISBN 9780123745132.
  3. National Commission on the Future of DNA Evidence (July 2002). "Using DNA to Solve Cold Cases" (PDF). U.S. Department of Justice. Retrieved 2006-08-08.
  4. https://dfs.dc.gov/sites/default/files/dc/sites/dfs/page_content/attachments/STRmix%20Validation.pdf
  5. Moretti, Tamyra R.; Just, Rebecca S.; Kehl, Susannah C.; Willis, Leah E.; Buckleton, John S.; Bright, Jo-Anne; Taylor, Duncan A.; Onorato, Anthony J. (2017). "Internal validation of STRmix™ for the interpretation of single source and mixed DNA profiles". Forensic Science International: Genetics. 29: 126–144. doi:10.1016/j.fsigen.2017.04.004. PMID 28504203.
  6. Tautz D. (1989). "Hypervariability of simple sequences as a general source for polymorphic DNA markers". Nucleic Acids Research. 17 (16): 6463–6471. doi:10.1093/nar/17.16.6463. PMC 318341. PMID 2780284.
  7. Witherspoon, D. J.; Wooding, S.; Rogers, A. R.; Marchani, E. E.; Watkins, W. S.; Batzer, M. A.; Jorde, L. B. (2007-05-01). "Genetic Similarities Within and Between Human Populations". Genetics. 176 (1): 351–359. doi:10.1534/genetics.106.067355. ISSN 0016-6731. PMC 1893020. PMID 17339205.
  8. Felch, Jason; et al. (July 20, 2008). "FBI resists scrutiny of 'matches'". Los Angeles Times. p. P8.
  9. Hong Y. (2020). "Authentication of Primary Murine Cell Lines by a Microfluidics-Based Lab-On-Chip System". Biomedicines. 8 (12): 590. doi:10.3390/biomedicines8120590. PMC 7763653. PMID 33317212.