Content | |
---|---|
Description | short tandem repeat DNA database |
Organisms | Homo sapiens |
Contact | |
Primary citation | Ruitberg & al. (2001) [1] |
Access | |
Website | http://www.cstl.nist.gov/biotech/strbase/ |
In computational biology, STRBase is a database of short tandem repeats (STRs). [1]
In the fields of molecular biology and genetics, a genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA. The genome includes both the genes and the noncoding DNA, as well as mitochondrial DNA and chloroplast DNA. The study of the genome is called genomics. The genomes of several organisms have been sequenced and their genes analyzed; the Human Genome Project, which sequenced the entire genome of Homo sapiens, was successfully completed in April 2003.
A transposable element is a DNA sequence that can change its position within a genome, sometimes creating or reversing mutations and altering the cell's genetic identity and genome size. Transposition often results in duplication of the same genetic material. Barbara McClintock's discovery of them earned her a Nobel Prize in 1983.
A microsatellite is a tract of repetitive DNA in which certain DNA motifs are repeated, typically 5–50 times. Microsatellites occur at thousands of locations within an organism's genome. They have a higher mutation rate than other areas of DNA, which leads to high genetic diversity. Microsatellites are often referred to as short tandem repeats (STRs) by forensic geneticists and in genetic genealogy, or as simple sequence repeats (SSRs) by plant geneticists.
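The idea of a motif repeated back to back can be illustrated with a minimal Python sketch. The function name `longest_tandem_run` and the example sequences are hypothetical, chosen only for illustration; the sketch counts perfect, uninterrupted copies and ignores the imperfect repeats found in real loci.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`.

    Only perfect, uninterrupted copies are counted (a simplifying assumption).
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    # One or more adjacent copies of the motif, matched greedily.
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


if __name__ == "__main__":
    # Made-up sequence: two flanking bases, four copies of GATA, two flanking bases.
    print(longest_tandem_run("TTGATAGATAGATAGATACC", "GATA"))  # prints 4
```

A regular expression is used here because a greedy `(?:motif)+` group captures each maximal run in one match, so dividing the match length by the motif length gives the copy number directly.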
An inverted repeat is a single stranded sequence of nucleotides followed downstream by its reverse complement. The intervening sequence of nucleotides between the initial sequence and the reverse complement can be any length including zero. For example, 5'---TTACGnnnnnnCGTAA---3' is an inverted repeat sequence. When the intervening length is zero, the composite sequence is a palindromic sequence.
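As a small illustration of this definition, the following Python sketch checks whether a sequence starts with an arm and ends with that arm's reverse complement. The helper names `reverse_complement` and `is_inverted_repeat` are hypothetical, and the arm length is assumed to be known in advance rather than discovered.

```python
# Translation table for the four DNA bases; other symbols (such as N) pass through unchanged.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_inverted_repeat(seq: str, arm_length: int) -> bool:
    """True if the last `arm_length` bases are the reverse complement of the first ones.

    The intervening sequence between the two arms may have any length, including zero.
    """
    seq = seq.upper()
    if arm_length <= 0 or 2 * arm_length > len(seq):
        return False
    return seq[-arm_length:] == reverse_complement(seq[:arm_length])


if __name__ == "__main__":
    print(is_inverted_repeat("TTACGNNNNNNCGTAA", 5))  # prints True
    print(is_inverted_repeat("GAATTC", 3))            # prints True (zero-length spacer)
```

The second call has no intervening bases, which corresponds to the palindromic case described above.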
Tandem repeats occur in DNA when a pattern of one or more nucleotides is repeated and the repetitions are directly adjacent to each other. Several protein domains also form tandem repeats within their amino acid primary structure, such as armadillo repeats. However, perfect tandem repeats are unlikely in most proteins in vivo, and most known repeats occur in proteins that have been designed.
Ribosomal DNA (rDNA) is a DNA sequence that codes for ribosomal RNA. These sequences regulate transcription initiation and amplification, and contain both transcribed and non-transcribed spacer segments. The rRNA transcribed from the approximately 600 rDNA repeats forms the most abundant section of RNA found in cells of eukaryotes. Ribosomes are assemblies of proteins and rRNA molecules that translate mRNA molecules to produce proteins. The rDNA of eukaryotes consists of a tandem repeat of a unit segment composed of NTS, ETS, 18S, ITS1, 5.8S, ITS2, and 28S tracts. rDNA has another gene, coding for 5S rRNA, located in the genome in most eukaryotes. 5S rDNA is also present in tandem repeats, as in Drosophila. DNA regions that are repetitive often undergo recombination events. The rDNA repeats have many regulatory mechanisms that keep the DNA from undergoing mutations, thus keeping the rDNA conserved.
Satellite DNA consists of very large arrays of tandemly repeating, non-coding DNA. Satellite DNA is the main component of functional centromeres, and forms the main structural constituent of heterochromatin.
Repeated sequences are patterns of nucleic acids that occur in multiple copies throughout the genome. Repetitive DNA was first detected because of its rapid re-association kinetics. In many organisms, a significant fraction of the genomic DNA is highly repetitive, with over two-thirds of the sequence consisting of repetitive elements in humans.
A variable number tandem repeat is a location in a genome where a short nucleotide sequence is organized as a tandem repeat. These can be found on many chromosomes, and often show variations in length among individuals. Each variant acts as an inherited allele, allowing them to be used for personal or parental identification. Their analysis is useful in genetics and biology research, forensics, and DNA fingerprinting.
Interspersed repetitive DNA is found in all eukaryotic genomes. It differs from tandem repeat DNA in that the repeat sequences do not come right after one another; instead, they are dispersed throughout the genome and nonadjacent. The sequence that repeats can vary depending on the type of organism and many other factors. Certain classes of interspersed repeat sequences propagate themselves by RNA-mediated transposition; they have been called retrotransposons, and they constitute 25–40% of most mammalian genomes. Some types of interspersed repetitive DNA elements allow new genes to evolve by uncoupling similar DNA sequences from gene conversion during meiosis.
A Y-STR is a short tandem repeat (STR) on the Y chromosome. Y-STRs are often used in forensics, paternity testing, and genealogical DNA testing. They are taken specifically from the male Y chromosome. Y-STRs provide a weaker analysis than autosomal STRs because the Y chromosome is found only in males and is passed down only by the father, making the Y chromosome in any paternal line practically identical. As a result, there is significantly less distinction between Y-STR samples. Autosomal STRs provide much stronger analytical power because of the random matching that occurs between pairs of chromosomes during zygote formation.
Second Generation Multiplex Plus (SGM Plus) is a DNA profiling system developed by Applied Biosystems. It is an updated version of Second Generation Multiplex. SGM Plus has been used by the UK National DNA Database since 1998.
Slipped strand mispairing (SSM) is a mutation process which occurs during DNA replication. It involves denaturation and displacement of the DNA strands, resulting in mispairing of the complementary bases. Slipped strand mispairing is one explanation for the origin and evolution of repetitive DNA sequences.
Short Tandem Repeat (STR) analysis is a common molecular biology method used to compare allele repeats at specific loci in DNA between two or more samples. A short tandem repeat is a microsatellite with repeat units that are 2 to 7 base pairs in length, with the number of repeats varying among individuals, making STRs effective for human identification purposes. This method differs from restriction fragment length polymorphism analysis (RFLP) since STR analysis does not cut the DNA with restriction enzymes. Instead, polymerase chain reaction (PCR) is employed to discover the lengths of the short tandem repeats based on the length of the PCR product.
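The relationship between PCR product length and repeat number can be sketched in Python as follows. This is a simplified model, not a laboratory procedure: it assumes the amplicon consists only of a fixed-length flanking region (primers plus non-repeat sequence) and whole or partial repeat units. The function name `repeats_from_amplicon` and the numbers in the usage example are hypothetical.

```python
def repeats_from_amplicon(amplicon_bp: int, flanking_bp: int, unit_bp: int) -> tuple[int, int]:
    """Estimate the repeat count of an STR allele from its PCR product length.

    Assumed model: amplicon_bp = flanking_bp + repeats * unit_bp + leftover_bp.
    Returns (complete repeats, leftover bases of an incomplete repeat).
    """
    if not 2 <= unit_bp <= 7:
        raise ValueError("STR repeat units are 2 to 7 base pairs long")
    repeat_region_bp = amplicon_bp - flanking_bp
    if repeat_region_bp < 0:
        raise ValueError("amplicon is shorter than the flanking region")
    # divmod gives the whole repeat units and any remaining bases in one step.
    return divmod(repeat_region_bp, unit_bp)


if __name__ == "__main__":
    # Hypothetical locus: 120 bp of flanking sequence, 4 bp repeat unit.
    print(repeats_from_amplicon(160, 120, 4))  # prints (10, 0)
    print(repeats_from_amplicon(163, 120, 4))  # prints (10, 3)
```

Comparing the tuples obtained for two samples at the same locus shows whether they carry alleles of the same length under this model.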
In paternity testing, the Paternity Index (PI) is a value calculated for a single genetic marker or locus. It expresses the statistical strength or weight of that locus in favor of or against parentage, given the phenotypes of the tested participants and the inheritance scenario. Phenotype typically refers to physical characteristics of an organism such as body plan, color, and behavior. In DNA paternity testing, however, the term refers to what is observed directly in the laboratory. Laboratories involved in parentage testing and other fields of human identity employ genetic testing panels that contain a battery of loci, each of which is selected for its extensive allelic variation within and between populations. These genetic variations are not assumed to bestow physical or behavioral attributes on the person carrying the allelic arrangement(s); they are therefore not subject to selective pressure and follow Hardy-Weinberg inheritance patterns.
The Combined DNA Index System (CODIS) is the United States national DNA database created and maintained by the Federal Bureau of Investigation. CODIS consists of three levels of information: Local DNA Index Systems (LDIS), where DNA profiles originate; State DNA Index Systems (SDIS), which allow laboratories within a state to share information; and the National DNA Index System (NDIS), which allows states to compare DNA information with one another.
Solenoid protein domains are a highly modular type of protein domain. They consist of a chain of nearly identical folds, often simply called tandem repeats. They are extremely common among all types of proteins, though exact figures are unknown.
DNA profiling is the determination of a DNA profile for legal and investigative purposes. DNA analysis methods have changed numerous times over the years as technology improves and allows for more information to be determined with less starting material. Modern DNA analysis is based on the statistical calculation of the rarity of the produced profile within a population.
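One common way to express the rarity of a profile is to multiply estimated genotype frequencies across loci. The Python sketch below assumes Hardy-Weinberg proportions at each locus (p² for a homozygote, 2pq for a heterozygote) and independence between loci; both are modeling assumptions, not statements about any particular database. The locus names, allele labels, and frequencies are invented for illustration.

```python
from math import prod


def genotype_frequency(allele_a: str, allele_b: str, freqs: dict[str, float]) -> float:
    """Expected genotype frequency at one locus under Hardy-Weinberg proportions."""
    p, q = freqs[allele_a], freqs[allele_b]
    # Homozygote: p^2; heterozygote: 2pq.
    return p * p if allele_a == allele_b else 2 * p * q


def profile_frequency(profile: dict[str, tuple[str, str]],
                      allele_freqs: dict[str, dict[str, float]]) -> float:
    """Multiply per-locus genotype frequencies, assuming the loci are independent."""
    return prod(
        genotype_frequency(a, b, allele_freqs[locus])
        for locus, (a, b) in profile.items()
    )


if __name__ == "__main__":
    # Invented allele frequencies for two hypothetical loci.
    allele_freqs = {
        "LOCUS_A": {"10": 0.2, "12": 0.1},
        "LOCUS_B": {"7": 0.3},
    }
    profile = {"LOCUS_A": ("10", "12"), "LOCUS_B": ("7", "7")}
    print(profile_frequency(profile, allele_freqs))
```

With these invented numbers the two loci contribute 2 × 0.2 × 0.1 = 0.04 and 0.3² = 0.09, so the combined estimate is about 0.0036; adding more loci makes the product smaller, which is why multi-locus profiles become rare within a population.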
An array of protein tandem repeats is defined as several adjacent copies having the same or similar sequence motifs. These periodic sequences are generated by internal duplications in both coding and non-coding genomic sequences. Repetitive units of protein tandem repeats are considerably diverse, ranging from the repetition of a single amino acid to domains of 100 or more residues.