In the fields of geometry and biochemistry, a triple helix (pl.: triple helices) is a set of three congruent geometrical helices with the same axis, differing by a translation along the axis. This means that each of the helices keeps the same distance from the central axis. As with a single helix, a triple helix may be characterized by its pitch, diameter, and handedness. Examples of triple helices include triplex DNA, [1] triplex RNA, [2] the collagen helix, [3] and collagen-like proteins.
A triple helix is so named because it is made up of three separate helices. The helices share the same axis but do not occupy the same space, because each is rotated about the axis relative to the others. Generally, the identity of a triple helix depends on the type of helices that make it up. For example, a triple helix made of three strands of collagen protein is a collagen triple helix, and a triple helix made of three strands of DNA is a DNA triple helix.
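The geometric definition can be illustrated with a short Python sketch. It is an idealized model, not a description of any particular molecule: the function name and parameters are hypothetical, the common axis is taken to be the z axis, and the three strands are assumed to be evenly spaced 120 degrees apart (equivalently, shifted by one third of the pitch along the axis).

```python
import math

def triple_helix_points(radius=1.0, pitch=3.0, turns=2,
                        steps_per_turn=12, right_handed=True):
    """Return three lists of (x, y, z) points, one list per strand.

    Idealized model: all strands share the z axis, the same radius and
    the same pitch; strand k is rotated by k * 120 degrees (assumption).
    """
    sign = 1.0 if right_handed else -1.0
    strands = []
    for k in range(3):
        phase = 2.0 * math.pi * k / 3.0  # angular offset of strand k
        points = []
        for i in range(turns * steps_per_turn + 1):
            t = 2.0 * math.pi * i / steps_per_turn  # angle swept so far
            x = radius * math.cos(sign * t + phase)
            y = radius * math.sin(sign * t + phase)
            z = pitch * t / (2.0 * math.pi)         # rise along the axis
            points.append((x, y, z))
        strands.append(points)
    return strands
```

Every point returned lies at the same distance (`radius`) from the axis, which is the property that makes the three helices congruent and coaxial.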
As with other types of helices, triple helices have handedness: right-handed or left-handed. A right-handed helix moves around its axis in a clockwise direction from beginning to end. A left-handed helix is the right-handed helix's mirror image, and it moves around the axis in a counterclockwise direction from beginning to end. [4] The beginning and end of a helical molecule are defined based on certain markers in the molecule that do not change easily. For example: the beginning of a helical protein is its N terminus, and the beginning of a single strand of DNA is its 5' end. [4]
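As a rough illustration of handedness, the following Python sketch classifies a strand from its coordinates. It is a minimal sketch under stated assumptions: the helix axis is the z axis, the points are ordered along the strand, consecutive points are less than half a turn apart, and the function name is hypothetical.

```python
import math

def handedness(points):
    """Classify a strand given as ordered (x, y, z) points around the z axis.

    For each step, the signed rotation about the axis is multiplied by the
    rise along the axis. A positive total means the strand turns
    counterclockwise (seen from +z) while rising, which is the usual
    right-handed convention; a negative total means left-handed.
    """
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(points, points[1:]):
        # Signed angle between successive radial vectors in the xy plane.
        dtheta = math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
        total += dtheta * (z1 - z0)
    if total > 0:
        return "right"
    if total < 0:
        return "left"
    return "undetermined"
```

Because rotation and rise both change sign when the points are traversed in reverse, the result does not depend on which end is treated as the beginning; the beginning and end matter for describing the direction of travel, not for the handedness itself.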
The collagen triple helix is made of three collagen peptides, each of which forms its own left-handed polyproline helix. [5] When the three chains combine, the triple helix adopts a right-handed orientation. The collagen peptide is composed of repeats of Gly-X-Y, with the second residue (X) usually being Pro and the third (Y) being hydroxyproline. [6] [5]
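The Gly-X-Y repeat can be checked programmatically. The Python sketch below is illustrative only: it assumes one-letter residue codes with "O" standing for hydroxyproline (a convention often used for collagen-like peptides), assumes the sequence starts in frame with a glycine, and uses a hypothetical function name.

```python
def gly_x_y_summary(seq):
    """Summarize how well a peptide matches the Gly-X-Y repeat.

    Assumptions: one-letter codes, 'O' = hydroxyproline, and the
    sequence begins at the Gly position of the first triplet.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    triplets = [seq[i:i + 3] for i in range(0, usable, 3)]
    is_repeat = (
        bool(triplets)
        and len(seq) % 3 == 0
        and all(t[0] == "G" for t in triplets)
    )
    return {
        "is_gly_x_y": is_repeat,                        # Gly at every third position
        "triplets": len(triplets),
        "x_pro": sum(t[1] == "P" for t in triplets),    # X positions holding Pro
        "y_hyp": sum(t[2] == "O" for t in triplets),    # Y positions holding Hyp
    }
```

For example, `gly_x_y_summary("GPOGPOGAO")` reports a valid repeat of three triplets, with proline in two X positions and hydroxyproline in all three Y positions.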
A DNA triple helix is made up of three separate DNA strands, each oriented with the sugar/phosphate backbone on the outside of the helix and the bases on the inside of the helix. The bases are the part of the molecule closest to the triple helix's axis, and the backbone is the part of the molecule farthest away from the axis. The third strand occupies the major groove of relatively normal duplex DNA. [7] The bases in triplex DNA are arranged to match up according to a Hoogsteen base pairing scheme. [8] Similarly, RNA triple helices are formed as a result of a single stranded RNA forming hydrogen bonds with an RNA duplex; the duplex consists of Watson-Crick base pairing while the third strand binds via Hoogsteen base pairing. [9]
The collagen triple helix has several characteristics that increase its stability. When proline is incorporated into the Y position of the Gly-X-Y sequence, it is post-translationally modified to hydroxyproline. [10] The hydroxyproline can enter into favorable interactions with water, which stabilizes the triple helix because the Y residues are solvent-accessible in the triple helix structure. The individual helices are also held together by an extensive network of amide-amide hydrogen bonds formed between the strands, each of which contributes approximately -2 kcal/mol to the overall free energy of the triple helix. [5] The formation of the superhelix not only protects the critical glycine residues on the interior of the helix, but also protects the overall protein from proteolysis. [6]
Triple helix DNA and RNA are stabilized by many of the same forces that stabilize double-stranded DNA helices. With nucleotide bases oriented to the inside of the helix, closer to its axis, bases engage in hydrogen bonding with other bases. The bonded bases in the center exclude water, so the hydrophobic effect is particularly important in the stabilization of DNA triple helices. [4]
Members of the collagen superfamily are major contributors to the extracellular matrix. The triple helical structure gives collagen fibers strength and stability by providing great resistance to tensile stress. The rigidity of collagen fibers allows them to withstand most mechanical stress, making collagen an ideal protein for macromolecular transport and overall structural support throughout the body. [6]
Some oligonucleotide sequences, called triplex-forming oligonucleotides (TFOs), can bind to a longer molecule of double-stranded DNA to form a triplex; TFOs can inactivate a gene or help to induce mutations. [7] TFOs can bind only to certain sites in a larger molecule, so researchers must first determine whether a TFO can bind to the gene of interest. Twisted intercalating nucleic acid is sometimes used to improve this process. Sequencing-based mapping of genome-wide TFO-TTS (triplex target site) pairs with an oligonucleotide library is a useful way to study triplex-forming DNA across the whole genome.
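A first-pass screen for candidate binding sites can be sketched in Python. This is a simplified illustration, not a published tool: it assumes the pyrimidine (parallel) motif, in which the third strand recognizes a run of purines through T with A·T pairs and protonated C with G·C pairs; it scans only the strand supplied; and the function names and the `min_len` threshold are hypothetical.

```python
import re

def find_purine_tracts(seq, min_len=15):
    """Return (start, end, tract) for each purine run of at least min_len.

    min_len is an arbitrary illustrative threshold (assumption); only the
    given strand is scanned, so tracts on the complement are not reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq)]

def pyrimidine_motif_tfo(purine_tract):
    """Return a third-strand sequence for a purine tract (pyrimidine motif).

    The third strand runs parallel to the purine strand, so the result is
    read 5' to 3' in the same direction as the input tract.
    """
    pairing = {"A": "T", "G": "C"}  # T recognizes A·T, C+ recognizes G·C
    return "".join(pairing[base] for base in purine_tract.upper())
```

For instance, scanning `"CTAGGAAGGAGTC"` with `min_len=5` finds the tract `AGGAAGGAG`, for which the sketch proposes the third strand `TCCTTCCTC`. A real design would also need to consider factors such as pH dependence and target accessibility, which this sketch ignores.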
In recent years, the biological function of triplex RNA has been studied more closely. Its roles include increasing stability and contributing to translation, ligand binding, and catalysis. One example of a triple helix influencing ligand binding is the SAM-II riboswitch, where the triple helix creates a binding site that specifically accepts S-adenosylmethionine (SAM). [9] The ribonucleoprotein complex telomerase, responsible for replicating the tail ends of DNA (telomeres), also contains triplex RNA believed to be necessary for proper telomerase functioning. [9] [11] The triple helix at the 3' end of the PAN and MALAT1 long noncoding RNAs stabilizes the RNA by protecting the poly(A) tail from deadenylation, which in turn affects their functions in viral pathogenesis and multiple human cancers. [9] [12] Additionally, RNA triple helices can stabilize mRNAs by forming a binding pocket for the 3' end of the poly(A) tail. [13]
Triplex Domain Finder (TDF) is a Python-based package [14] for predicting RNA-DNA triplex formation potential. The software starts by enumerating the substrings between TFO and TTS, then uses statistical tests to identify results that are significant compared with the background.
Triplexfpp [15] is based on deep learning methods. This Python-based pipeline can help predict the most likely triplex-forming lncRNAs. However, because the lncRNA data available for training are limited, there is a long way to go before machine learning and deep learning methods can be applied routinely.
Collagen is the main structural protein in the extracellular matrix of a body's various connective tissues. As the main component of connective tissue, it is the most abundant protein in mammals, making up 25% to 35% of a mammalian body's protein content. Amino acids are bound together to form a triple helix of elongated fibril known as a collagen helix. The collagen helix is mostly found in connective tissue such as cartilage, bones, tendons, ligaments, and skin. Vitamin C is vital for collagen synthesis, and Vitamin E improves the production of collagen.
In molecular biology, the collagen triple helix or type-2 helix is the main secondary structure of various types of fibrous collagen, including type I collagen. In 1954, Ramachandran & Kartha advanced a structure for the collagen triple helix on the basis of fiber diffraction data. It consists of a triple helix made of the repetitious amino acid sequence glycine-X-Y, where X and Y are frequently proline or hydroxyproline. Collagen folded into a triple helix is known as tropocollagen. Collagen triple helices are often bundled into fibrils which themselves form larger fibres, as in tendons.
Deoxyribonucleic acid is a polymer composed of two polynucleotide chains that coil around each other to form a double helix. The polymer carries genetic instructions for the development, functioning, growth and reproduction of all known organisms and many viruses. DNA and ribonucleic acid (RNA) are nucleic acids. Alongside proteins, lipids and complex carbohydrates (polysaccharides), nucleic acids are one of the four major types of macromolecules that are essential for all known forms of life.
Ribonucleic acid (RNA) is a polymeric molecule that is essential for most biological functions, either by performing the function itself or by forming a template for the production of proteins. RNA and deoxyribonucleic acid (DNA) are nucleic acids. The nucleic acids constitute one of the four major macromolecules essential for all known forms of life. RNA is assembled as a chain of nucleotides. Cellular organisms use messenger RNA (mRNA) to convey genetic information that directs synthesis of specific proteins. Many viruses encode their genetic information using an RNA genome.
Peptide nucleic acid (PNA) is an artificially synthesized polymer similar to DNA or RNA.
A nucleic acid sequence is a succession of bases within the nucleotides forming alleles within a DNA (using GACT) or RNA (GACU) molecule. This succession is denoted by a series of letters, drawn from a set of five, that indicate the order of the nucleotides. By convention, sequences are usually presented from the 5' end to the 3' end. For DNA, with its double helix, there are two possible directions for the notated sequence; of these two, the sense strand is used. Because nucleic acids are normally linear (unbranched) polymers, specifying the sequence is equivalent to defining the covalent structure of the entire molecule. For this reason, the nucleic acid sequence is also termed the primary structure.
In a chain-like biological molecule, such as a protein or nucleic acid, a structural motif is a common three-dimensional structure that appears in a variety of different, evolutionarily unrelated molecules. A structural motif does not have to be associated with a sequence motif; it can be represented by different and completely unrelated sequences in different proteins or RNA.
Z-DNA is one of the many possible double helical structures of DNA. It is a left-handed double helical structure in which the helix winds to the left in a zigzag pattern, instead of to the right, like the more common B-DNA form. Z-DNA is thought to be one of three biologically active double-helical structures along with A-DNA and B-DNA.
A Hoogsteen base pair is a variation of base-pairing in nucleic acids such as the A•T pair. In this manner, two nucleobases, one on each strand, can be held together by hydrogen bonds in the major groove. A Hoogsteen base pair applies the N7 position of the purine base and C6 amino group, which bind the Watson–Crick (N3–C4) face of the pyrimidine base.
Triple-stranded DNA is a DNA structure in which three oligonucleotides wind around each other and form a triple helix. In triple-stranded DNA, the third strand binds to a B-form DNA double helix by forming Hoogsteen base pairs or reversed Hoogsteen hydrogen bonds.
In molecular biology, the term double helix refers to the structure formed by double-stranded molecules of nucleic acids such as DNA. The double helical structure of a nucleic acid complex arises as a consequence of its secondary structure, and is a fundamental component in determining its tertiary structure. The structure was discovered by Rosalind Franklin, her student Raymond Gosling, James Watson, and Francis Crick, while the term "double helix" entered popular culture with the 1968 publication of Watson's The Double Helix: A Personal Account of the Discovery of the Structure of DNA.
A DNA-binding domain (DBD) is an independently folded protein domain that contains at least one structural motif that recognizes double- or single-stranded DNA. A DBD can recognize a specific DNA sequence or have a general affinity to DNA. Some DNA-binding domains may also include nucleic acids in their folded structure.
Therapeutic gene modulation refers to the practice of altering the expression of a gene at one of various stages, with a view to alleviate some form of ailment. It differs from gene therapy in that gene modulation seeks to alter the expression of an endogenous gene whereas gene therapy concerns the introduction of a gene whose product aids the recipient directly.
Nucleic acid tertiary structure is the three-dimensional shape of a nucleic acid polymer. RNA and DNA molecules are capable of diverse functions ranging from molecular recognition to catalysis. Such functions require a precise three-dimensional structure. While such structures are diverse and seemingly complex, they are composed of recurring, easily recognizable tertiary structural motifs that serve as molecular building blocks. Some of the most common motifs for RNA and DNA tertiary structure are described below, but this information is based on a limited number of solved structures. Many more tertiary structural motifs will be revealed as new RNA and DNA molecules are structurally characterized.
Nucleic acid structure refers to the structure of nucleic acids such as DNA and RNA. Chemically speaking, DNA and RNA are very similar. Nucleic acid structure is often divided into four different levels: primary, secondary, tertiary, and quaternary.
Nucleic acid secondary structure is the basepairing interactions within a single nucleic acid polymer or between two polymers. It can be represented as a list of bases which are paired in a nucleic acid molecule. The secondary structures of biological DNAs and RNAs tend to be different: biological DNA mostly exists as fully base paired double helices, while biological RNA is single stranded and often forms complex and intricate base-pairing interactions due to its increased ability to form hydrogen bonds stemming from the extra hydroxyl group in the ribose sugar.
Nucleic acid NMR is the use of nuclear magnetic resonance spectroscopy to obtain information about the structure and dynamics of nucleic acid molecules, such as DNA or RNA. It is useful for molecules of up to 100 nucleotides, and as of 2003, nearly half of all known RNA structures had been determined by NMR spectroscopy.
Twisted intercalating nucleic acid (TINA) is a nucleic acid molecule that, when added to triplex-forming oligonucleotides (TFOs), stabilizes Hoogsteen triplex DNA formation from double-stranded DNA (dsDNA) and TFOs. Its ability to twist around a triple bond increases ease of intercalation within double stranded DNA in order to form triplex DNA. Certain configurations have been shown to stabilize Watson-Crick antiparallel duplex DNA. TINA-DNA primers have been shown to increase the specificity of binding in PCR. The use of TINA insertions in G-quadruplexes has also been shown to enhance anti-HIV-1 activity. TINA stabilized PT demonstrates improved sensitivity and specificity of DNA based clinical diagnostic assays.
Polypurine reverse-Hoogsteen hairpins (PPRHs) are non-modified oligonucleotides containing two polypurine domains, in a mirror repeat fashion, linked by a pentathymidine stretch forming double-stranded DNA stem-loop molecules. The two polypurine domains interact by intramolecular reverse-Hoogsteen bonds allowing the formation of this specific hairpin structure.
Non-B DNA refers to DNA conformations that differ from the canonical B-DNA conformation, the most common form of DNA found in nature at neutral pH and physiological salt concentrations. Non-B DNA structures can arise due to various factors, including DNA sequence, length, supercoiling, and environmental conditions. Non-B DNA structures can have important biological roles, but they can also cause problems, such as genomic instability and disease.