Triple helix

The collagen triple helix is a triple helix formed from three separate protein helices, spiraling around the same axis.

In the fields of geometry and biochemistry, a triple helix (pl.: triple helices) is a set of three congruent geometrical helices with the same axis, differing by a translation along the axis. This means that each of the helices keeps the same distance from the central axis. As with a single helix, a triple helix may be characterized by its pitch, diameter, and handedness. Examples of triple helices include triplex DNA, [1] triplex RNA, [2] the collagen helix, [3] and collagen-like proteins.

Structure

A triple helix is so named because it is made up of three separate helices. The helices share the same axis, but they do not occupy the same space because each helix is offset from the others by a translation along the axis, which for a helix is equivalent to a rotation about the axis. Generally, the identity of a triple helix depends on the type of helices that make it up. For example, a triple helix made of three strands of collagen protein is a collagen triple helix, and a triple helix made of three strands of DNA is a DNA triple helix.
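
The geometric definition can be illustrated with a short Python sketch. It is only a minimal illustration, not a model of any particular molecule: the function name `triple_helix` and its parameters are hypothetical, the axis is assumed to be the z axis, and the three strands are assumed to be evenly spaced, each one a copy of the first shifted by one third of the pitch along the axis.

```python
import math

def triple_helix(radius, pitch, turns=2.0, points_per_turn=12, right_handed=True):
    """Return three lists of (x, y, z) points, one list per strand.

    Assumption: the strands are evenly spaced, so strand k is strand 0
    translated by k * pitch / 3 along the z axis.
    """
    sign = 1.0 if right_handed else -1.0
    n_points = int(turns * points_per_turn) + 1
    strands = []
    for k in range(3):
        dz = k * pitch / 3.0  # translation of strand k along the axis
        strand = []
        for i in range(n_points):
            t = 2.0 * math.pi * i / points_per_turn  # angle swept so far
            x = radius * math.cos(t)
            y = sign * radius * math.sin(t)          # sign sets the handedness
            z = pitch * t / (2.0 * math.pi) + dz     # one pitch per full turn
            strand.append((x, y, z))
        strands.append(strand)
    return strands
```

Every point lies at distance `radius` from the z axis, which is the sense in which each helix keeps the same distance from the central axis, and switching `right_handed` mirrors the structure.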

As with other types of helices, triple helices have handedness: right-handed or left-handed. Viewed along its axis, a right-handed helix turns clockwise as it moves away from the observer, from beginning to end. A left-handed helix is the right-handed helix's mirror image, and it turns counterclockwise from beginning to end. [4] The beginning and end of a helical molecule are defined based on certain markers in the molecule that do not change easily. For example, the beginning of a helical protein is its N terminus, and the beginning of a single strand of DNA is its 5' end. [4]

The collagen triple helix is made of three collagen peptides, each of which forms its own left-handed polyproline helix. [5] When the three chains combine, the triple helix adopts a right-handed orientation. The collagen peptide is composed of repeats of Gly-X-Y, with the second residue (X) usually being Pro and the third (Y) being hydroxyproline. [6] [5]
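
The Gly-X-Y repeat can be checked programmatically. The following Python sketch is illustrative only: the function names are hypothetical, sequences are assumed to be written in one-letter code starting at a glycine, and the letter "O" is assumed to stand for hydroxyproline, a convention used for collagen-like peptides.

```python
def is_gly_x_y_repeat(sequence):
    """Return True if the sequence consists only of complete Gly-X-Y triplets.

    Assumption: one-letter amino acid codes, with the first residue
    being the glycine of the first triplet.
    """
    seq = sequence.upper()
    if not seq or len(seq) % 3 != 0:
        return False
    # Glycine must occupy the first position of every triplet.
    return all(seq[i] == "G" for i in range(0, len(seq), 3))


def pro_hyp_fraction(sequence):
    """Return the fraction of triplets that are exactly Gly-Pro-Hyp ("GPO")."""
    seq = sequence.upper()
    triplets = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not triplets:
        return 0.0
    return sum(1 for t in triplets if t == "GPO") / len(triplets)
```

For example, `is_gly_x_y_repeat("GPOGPOGPO")` is true, while a sequence with any other residue at a glycine position, such as `"GPOAPOGPO"`, fails the check.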

A DNA triple helix is made up of three separate DNA strands, each oriented with the sugar/phosphate backbone on the outside of the helix and the bases on the inside of the helix. The bases are the part of the molecule closest to the triple helix's axis, and the backbone is the part of the molecule farthest away from the axis. The third strand occupies the major groove of relatively normal duplex DNA. [7] The bases in triplex DNA are arranged to match up according to a Hoogsteen base pairing scheme. [8] Similarly, RNA triple helices form when a single-stranded RNA forms hydrogen bonds with an RNA duplex; the duplex is held together by Watson-Crick base pairing, while the third strand binds via Hoogsteen base pairing. [9]

Stabilizing factors

The collagen triple helix has several characteristics that increase its stability. When proline is incorporated into the Y position of the Gly-X-Y sequence, it is post-translationally modified to hydroxyproline. [10] The hydroxyproline can enter into favorable interactions with water, which stabilizes the triple helix because the Y residues are solvent-accessible in the triple helix structure. The individual helices are also held together by an extensive network of amide-amide hydrogen bonds formed between the strands, each of which contributes approximately -2 kcal/mol to the overall free energy of the triple helix. [5] The formation of the superhelix not only protects the critical glycine residues on the interior of the helix, but also protects the overall protein from proteolysis. [6]

Triple helix DNA and RNA are stabilized by many of the same forces that stabilize double-stranded DNA helices. With nucleotide bases oriented to the inside of the helix, closer to its axis, bases engage in hydrogen bonding with other bases. The bonded bases in the center exclude water, so the hydrophobic effect is particularly important in the stabilization of DNA triple helices. [4]

Biological role

Proteins

Members of the collagen superfamily are major contributors to the extracellular matrix. The triple helical structure gives collagen fibers strength and stability by providing great resistance to tensile stress. The rigidity of collagen fibers allows them to withstand most mechanical stress, making collagen an ideal protein for macromolecular transport and overall structural support throughout the body. [6]

DNA

Some oligonucleotide sequences, called triplex-forming oligonucleotides (TFOs), can bind to a longer molecule of double-stranded DNA to form a triplex; TFOs can inactivate a gene or help to induce mutations. [7] TFOs can only bind to certain sites in a larger molecule, so researchers must first determine whether a TFO can bind to the gene of interest. Twisted intercalating nucleic acid (TINA) is sometimes added to TFOs to stabilize triplex formation. Genome-wide mapping of pairs of TFOs and triplex target sites (TTSs) by sequencing, using an oligonucleotide library, is a useful way to study triplex-forming DNA across the whole genome.
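
A first-pass search for candidate binding sites can be sketched in Python. This is a simplified illustration, not a published tool: it assumes that candidate sites are uninterrupted runs of purines (A or G) on the strand being scanned, and the function name `find_polypurine_tracts` and the default `min_length` threshold are hypothetical choices.

```python
import re

def find_polypurine_tracts(dna, min_length=12):
    """Return (start, end, tract) for each run of purines of at least min_length.

    Positions are 0-based and end is exclusive. Only the given strand is
    scanned; running the function on the reverse complement covers purine
    tracts located on the opposite strand.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_length)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(dna.upper())]
```

Calling `find_polypurine_tracts("CTTCAGGAAGGAGAAGTCT", min_length=10)` reports the single 12-nucleotide purine run in that sequence. A real analysis would also need to account for mismatches, sequence context, and experimental conditions.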

RNA

In recent years, the biological functions of triplex RNA have received increasing study. Reported roles include increasing stability, affecting translation, influencing ligand binding, and catalysis. One example of ligand binding being influenced by a triple helix is the SAM-II riboswitch, where the triple helix creates a binding site that specifically accepts S-adenosylmethionine (SAM). [9] The ribonucleoprotein complex telomerase, responsible for replicating the ends of chromosomal DNA (telomeres), also contains triplex RNA believed to be necessary for proper telomerase function. [9] [11] The triple helix at the 3' end of the PAN and MALAT1 long noncoding RNAs stabilizes the RNA by protecting the poly(A) tail from deadenylation, which in turn affects their functions in viral pathogenesis and multiple human cancers. [9] [12] Additionally, RNA triple helices can stabilize mRNAs by forming a poly(A) tail 3'-end binding pocket. [13]

Computational tools

TDF (Triplex Domain Finder)

TDF is a Python-based package [14] for predicting RNA-DNA triplex formation potential. The software starts by enumerating the substrings that can pair between a TFO and a TTS, and then uses statistical tests to identify results that are significant compared with the background.
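
The general idea of comparing an observed result with a background can be illustrated with an empirical p-value. The sketch below is a generic example and does not reproduce TDF's interface or its exact statistical procedure; the function name and inputs are hypothetical. It assumes that a binding-site count has already been obtained for the regions of interest and for a set of background regions.

```python
def empirical_p_value(observed, background):
    """Estimate how often the background reaches the observed value.

    observed:   count for the regions of interest
    background: counts obtained for background (for example, randomized) regions
    The +1 terms are the usual correction that keeps the estimate above zero.
    """
    if not background:
        raise ValueError("background must contain at least one value")
    at_least_as_large = sum(1 for b in background if b >= observed)
    return (1 + at_least_as_large) / (1 + len(background))
```

A small value indicates that few background regions show as many predicted binding sites as the regions of interest.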

Triplexfpp

Triplexfpp [15] is based on deep learning methods. This Python-based pipeline can help predict the lncRNAs most likely to form triplexes. However, since the lncRNA data available for training are limited, there is a long way to go before machine learning and deep learning methods can be broadly applied.

References

  1. Bernués J, Azorín F (1995). "Triple-Stranded DNA". Nucleic Acids and Molecular Biology. Vol. 9. Berlin, Heidelberg: Springer. pp. 1–21. doi:10.1007/978-3-642-79488-9_1. ISBN 978-3-642-79490-2.
  2. Buske FA, Mattick JS, Bailey TL (May 2011). "Potential in vivo roles of nucleic acid triple-helices". RNA Biology. 8 (3): 427–439. doi:10.4161/rna.8.3.14999. PMC 3218511. PMID 21525785.
  3. Bächinger HP (2005-05-03). Collagen: Primer in Structure, Processing and Assembly. Springer Science & Business Media. ISBN 9783540232728.
  4. Kuriyan J, Konforti B, Wemmer D (2012-07-25). The molecules of life: physical and chemical principles. New York: Garland Science, Taylor & Francis Group. ISBN 9780815341888. OCLC 779577263.
  5. Shoulders MD, Raines RT (2009). "Collagen structure and stability". Annual Review of Biochemistry. 78: 929–958. doi:10.1146/annurev.biochem.77.032207.120833. PMC 2846778. PMID 19344236.
  6. Fidler AL, Boudko SP, Rokas A, Hudson BG (April 2018). "The triple helix of collagens - an ancient protein structure that enabled animal multicellularity and tissue evolution". Journal of Cell Science. 131 (7): jcs203950. doi:10.1242/jcs.203950. PMC 5963836. PMID 29632050.
  7. Jain A, Wang G, Vasquez KM (August 2008). "DNA triple helices: biological consequences and therapeutic potential". Biochimie. 90 (8): 1117–1130. doi:10.1016/j.biochi.2008.02.011. PMC 2586808. PMID 18331847.
  8. Duca M, Vekhoff P, Oussedik K, Halby L, Arimondo PB (September 2008). "The triple helix: 50 years later, the outcome". Nucleic Acids Research. 36 (16): 5123–5138. doi:10.1093/nar/gkn493. PMC 2532714. PMID 18676453.
  9. Conrad NK (2014). "The emerging role of triple helices in RNA biology". Wiley Interdisciplinary Reviews. RNA. 5 (1): 15–29. doi:10.1002/wrna.1194. PMC 4721660. PMID 24115594.
  10. Brodsky B, Persikov AV (2005-01-01). "Molecular structure of the collagen triple helix". Advances in Protein Chemistry. 70: 301–339. doi:10.1016/S0065-3233(05)70009-7. ISBN 9780120342709. PMID 15837519. S2CID 20879450.
  11. Theimer CA, Blois CA, Feigon J (March 2005). "Structure of the human telomerase RNA pseudoknot reveals conserved tertiary interactions essential for function". Molecular Cell. 17 (5): 671–682. doi:10.1016/j.molcel.2005.01.017. PMID 15749017.
  12. Brown JA, Bulkley D, Wang J, Valenstein ML, Yario TA, Steitz TA, Steitz JA (July 2014). "Structural insights into the stabilization of MALAT1 noncoding RNA by a bipartite triple helix". Nature Structural & Molecular Biology. 21 (7): 633–640. doi:10.1038/nsmb.2844. PMC 4096706. PMID 24952594.
  13. Torabi SF, Vaidya AT, Tycowski KT, DeGregorio SJ, Wang J, Shu MD, et al. (February 2021). "RNA stabilization by a poly(A) tail 3'-end binding pocket and other modes of poly(A)-RNA interaction". Science. 371 (6529): eabe6523. doi:10.1126/science.abe6523. PMC 9491362. PMID 33414189. S2CID 231195473.
  14. Kuo CC, Hänzelmann S, Sentürk Cetin N, Frank S, Zajzon B, Derks JP, et al. (April 2019). "Detection of RNA-DNA binding sites in long noncoding RNAs". Nucleic Acids Research. 47 (6): e32. doi:10.1093/nar/gkz037. PMC 6451187. PMID 30698727.
  15. Zhang Y, Long Y, Kwoh CK (November 2020). "Deep learning based DNA:RNA triplex forming potential prediction". BMC Bioinformatics. 21 (1): 522. doi:10.1186/s12859-020-03864-0. PMC 7663897. PMID 33183242.