Structural motif

Last updated April 26, 2024

In a chain-like biological molecule, such as a protein or nucleic acid, a structural motif is a common three-dimensional structure that appears in a variety of different, evolutionarily unrelated molecules.^[1] A structural motif does not have to be associated with a sequence motif; it can be represented by different and completely unrelated sequences in different proteins or RNA.

In nucleic acids

Depending on the sequence and other conditions, nucleic acids can form a variety of structural motifs, which are thought to have biological significance.

Stem-loop: Stem-loop intramolecular base pairing is a pattern that can occur in single-stranded DNA or, more commonly, in RNA.^[2] The structure is also known as a hairpin or hairpin loop. It occurs when two regions of the same strand, usually complementary in nucleotide sequence when read in opposite directions, base-pair to form a double helix that ends in an unpaired loop. The resulting structure is a key building block of many RNA secondary structures.
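The base-pairing condition described above can be illustrated with a short Python sketch. It is a simplified illustration, not a structure-prediction method: it only looks for a window whose reverse complement reappears downstream after a short unpaired loop, considers Watson-Crick pairs only, and ignores thermodynamics. The function names and the default stem and loop lengths are assumptions chosen for the example.

```python
# Map each RNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def find_stem_loops(rna: str, stem_len: int = 4, min_loop: int = 3, max_loop: int = 8):
    """Yield (stem_start, loop_start, loop_end, stem_end) for candidate hairpins.

    A candidate is a window of `stem_len` bases whose reverse complement
    occurs downstream on the same strand, separated by an unpaired loop of
    min_loop..max_loop bases. Indices are 0-based and end-exclusive.
    The thresholds are illustrative assumptions, not biological constants.
    """
    n = len(rna)
    for i in range(n - 2 * stem_len - min_loop + 1):
        target = reverse_complement(rna[i:i + stem_len])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop  # start of the 3' half of the stem
            if j + stem_len > n:
                break
            if rna[j:j + stem_len] == target:
                yield (i, i + stem_len, j, j + stem_len)


if __name__ == "__main__":
    # GGGC pairs with GCCC when read in opposite directions; AAAA is the loop.
    print(list(find_stem_loops("GGGCAAAAGCCC")))
```

Each result marks where the 5' half of the stem starts, where the loop begins and ends, and where the 3' half of the stem ends.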

Cruciform DNA: Cruciform DNA is a form of non-B DNA that requires an inverted-repeat sequence of at least 6 nucleotides to form a structure consisting of a stem, branch point and loop in the shape of a cruciform, stabilized by negative DNA supercoiling.^[3] Two classes of cruciform DNA have been described: folded and unfolded.

G-quadruplex: G-quadruplex secondary structures (G4) are formed in nucleic acids by sequences that are rich in guanine.^[4] They are helical in shape and contain guanine tetrads that can form from one,^[5] two^[6] or four strands.^[7]
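As a rough illustration of what "rich in guanine" can mean in practice, the Python sketch below scans a sequence for single-strand G-quadruplex candidates. The pattern (four runs of at least three guanines separated by loops of one to seven bases) is a commonly used search heuristic and is an assumption here, not something stated in this article. A match only flags a candidate and does not show that the sequence folds into a G-quadruplex. The names `G4_PATTERN` and `find_g4_candidates` are hypothetical.

```python
import re

# Heuristic (assumed): four G-runs of >= 3 guanines, loops of 1-7 bases.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGTU]{1,7}){3}G{3,}")


def find_g4_candidates(seq: str):
    """Return (start, end, matched_text) for non-overlapping candidate sites.

    Only the given strand is scanned, so this covers the one-strand
    (unimolecular) case; two- and four-strand forms are not modelled.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Four copies of the TTAGGG repeat.
    print(find_g4_candidates("TTAGGGTTAGGGTTAGGGTTAGGG"))
```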

D-loop: A displacement loop or D-loop is a DNA structure where the two strands of a double-stranded DNA molecule are separated for a stretch and held apart by a third strand of DNA.^[8] The third strand has a base sequence which is complementary to one of the main strands and pairs with it, thus displacing the other complementary main strand in the region. Within that region the structure is thus a form of triple-stranded DNA. A diagram in the paper introducing the term illustrated the D-loop with a shape resembling a capital "D", where the displaced strand formed the loop of the "D".^[10] An R-loop is similar to a D-loop, but in this case the third strand is RNA rather than DNA.^[9]

In proteins

In proteins, a structural motif describes the connectivity between secondary structural elements. An individual motif usually consists of only a few elements, e.g., the 'helix-turn-helix' motif, which has just three. Note that, while the spatial sequence of elements may be identical in all instances of a motif, the elements may be encoded in any order within the underlying gene. In addition to secondary structural elements, protein structural motifs often include loops of variable length and unspecified structure. Structural motifs may also appear as tandem repeats.

Beta hairpin: Extremely common. Two antiparallel beta strands connected by a tight turn of a few amino acids.
Greek key: Four beta strands, three connected by hairpins, the fourth folded over the top.
Omega loop: A loop in which the residues that make up the beginning and end of the loop are very close together.^[11]
Helix-loop-helix: Consists of alpha helices bound by a looping stretch of amino acids. This motif is seen in transcription factors.
Zinc finger: Two beta strands with an alpha helix end folded over to bind a zinc ion. Important in DNA binding proteins.
Helix-turn-helix: Two alpha helices joined by a short strand of amino acids and found in many proteins that regulate gene expression.^[12]
Nest: Extremely common. Three consecutive amino acid residues form an anion-binding concavity.^[13]
Niche: Extremely common. Three or four consecutive amino acid residues form a cation-binding feature.^[14]
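Because these motifs are defined by the order and connectivity of secondary structural elements, a crude first pass is to look for them in a per-residue secondary-structure string. The Python sketch below assumes a DSSP-like alphabet in which H marks an alpha helix residue, E marks a beta strand residue, and any other character marks a loop or turn. The run lengths in the patterns are arbitrary assumptions, and the names `MOTIF_PATTERNS` and `scan_motifs` are hypothetical. The sketch does not check three-dimensional geometry (for example, whether two strands actually pair in an antiparallel fashion), so matches are candidates only.

```python
import re

# Assumed alphabet: H = alpha helix, E = beta strand, anything else = loop/turn.
# The run lengths below are illustrative thresholds, not established cut-offs.
MOTIF_PATTERNS = {
    # Two strands joined by a tight turn of a few residues.
    "beta hairpin": re.compile(r"E{3,}[^EH]{2,5}E{3,}"),
    # Two helices joined by a short stretch of residues.
    "helix-turn-helix": re.compile(r"H{5,}[^EH]{2,5}H{5,}"),
}


def scan_motifs(ss: str):
    """Return (motif_name, start, end) tuples sorted by start position.

    `ss` is a secondary-structure string with one character per residue.
    Indices are 0-based and end-exclusive.
    """
    hits = []
    for name, pattern in MOTIF_PATTERNS.items():
        for m in pattern.finditer(ss):
            hits.append((name, m.start(), m.end()))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    ss = "CCEEEEETTEEEEECCHHHHHHHTTTHHHHHHHCC"
    for name, start, end in scan_motifs(ss):
        print(name, start, end)
```

A string-level scan like this cannot distinguish motifs that share the same element order but differ in spatial arrangement, which is why structure-based tools work on atomic coordinates instead.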


References

1. Johansson, M.U. (23 July 2012). "Defining and searching for structural motifs using DeepView/Swiss-PdbViewer". BMC Bioinformatics. 13 (173): 173. doi:10.1186/1471-2105-13-173. PMC 3436773. PMID 22823337.
2. Bolshoy, Alexander (2010). Genome Clustering: From Linguistic Models to Classification of Genetic Texts. Springer. p. 47. ISBN 9783642129513. Retrieved 24 March 2021.
3. Shlyakhtenko LS, Potaman VN, Sinden RR, Lyubchenko YL (July 1998). "Structure and dynamics of supercoil-stabilized DNA cruciforms". J. Mol. Biol. 280 (1): 61–72. CiteSeerX 10.1.1.555.4352. doi:10.1006/jmbi.1998.1855. PMID 9653031.
4. Routh ED, Creacy SD, Beerbower PE, Akman SA, Vaughn JP, Smaldino PJ (March 2017). "A G-quadruplex DNA-affinity Approach for Purification of Enzymatically Active G4 Resolvase1". Journal of Visualized Experiments. 121 (121). doi:10.3791/55496. PMC 5409278. PMID 28362374.
5. Largy E, Mergny JL, Gabelica V (2016). "Chapter 7. Role of Alkali Metal Ions in G-Quadruplex Nucleic Acid Structure and Stability". In Sigel A, Sigel H, Sigel RK (eds.). The Alkali Metal Ions: Their Role in Life. Metal Ions in Life Sciences. Vol. 16. Springer. pp. 203–258. doi:10.1007/978-3-319-21756-7_7. ISBN 978-3-319-21755-0. PMID 26860303.
6. Sundquist WI, Klug A (December 1989). "Telomeric DNA dimerizes by formation of guanine tetrads between hairpin loops". Nature. 342 (6251): 825–9. Bibcode:1989Natur.342..825S. doi:10.1038/342825a0. PMID 2601741. S2CID 4357161.
7. Sen D, Gilbert W (July 1988). "Formation of parallel four-stranded complexes by guanine-rich motifs in DNA and its implications for meiosis". Nature. 334 (6180): 364–6. Bibcode:1988Natur.334..364S. doi:10.1038/334364a0. PMID 3393228. S2CID 4351855.
8. DePamphilis, Melvin (2011). Genome Duplication. Garland Science, Taylor & Francis Group, LLC. p. 419. ISBN 9780415442060. Retrieved 24 March 2021.
9. Al-Hadid, Qais (July 1, 2016). "R-loop: an emerging regulator of chromatin dynamics". Acta Biochim Biophys Sin (Shanghai). 48 (7): 623–31. doi:10.1093/abbs/gmw052. PMC 6259673. PMID 27252122.
10. Kasamatsu, H.; Robberson, D. L.; Vinograd, J. (1971). "A novel closed-circular mitochondrial DNA with properties of a replicating intermediate". Proceedings of the National Academy of Sciences of the United States of America. 68 (9): 2252–2257. Bibcode:1971PNAS...68.2252K. doi:10.1073/pnas.68.9.2252. PMC 389395. PMID 5289384.
11. Hettiarachchy, Navam S (2012). Food Proteins and Peptides: Chemistry, Functionality, Interactions, and Commercialization. CRC Press Taylor & Francis Group. p. 16. ISBN 9781420093421. Retrieved 24 March 2021.
12. Dubey, R C (2014). Advanced Biotechnology. S Chand Publishing. p. 505. ISBN 978-8121942904. Retrieved 24 March 2021.
13. Milner-White, E. James (September 26, 2011). "Functional Capabilities of the Earliest Peptides and the Emergence of Life". Genes. 2 (4): 674. doi:10.3390/genes2040671. PMC 3927598. PMID 24710286.
14. Milner-White, E. James (September 26, 2011). "Functional Capabilities of the Earliest Peptides and the Emergence of Life". Genes. 2 (4): 678. doi:10.3390/genes2040671. PMC 3927598. PMID 24710286.
