Palindromic sequence

Palindrome of DNA structure
A: Palindrome, B: Loop, C: Stem

A palindromic sequence is a nucleic acid sequence in a double-stranded DNA or RNA molecule in which reading in a certain direction (e.g. 5' to 3') on one strand gives the same sequence as reading in the same direction (e.g. 5' to 3') on the complementary strand. This definition of palindrome thus depends on complementary strands being palindromic of each other.

The meaning of palindrome in the context of genetics is slightly different from the definition used for words and sentences. Since a double helix is formed by two paired antiparallel strands of nucleotides that run in opposite directions, and the nucleotides always pair in the same way (adenine (A) with thymine (T) in DNA or uracil (U) in RNA; cytosine (C) with guanine (G)), a (single-stranded) nucleotide sequence is said to be a palindrome if it is equal to its reverse complement. For example, the DNA sequence ACCTAGGT is palindromic with its nucleotide-by-nucleotide complement TGGATCCA because reversing the order of the nucleotides in the complement gives the original sequence.
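The reverse-complement test described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the names `COMPLEMENT`, `reverse_complement` and `is_palindromic` are chosen here for the example, and only the four DNA bases are handled.

```python
# Watson-Crick partners for the four DNA bases (RNA would use U instead of T).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Walk the sequence backwards and swap each base for its partner.
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    # ACCTAGGT is the example from the text; GATTACA is a non-palindromic control.
    for candidate in ("ACCTAGGT", "GAATTC", "GATTACA"):
        print(candidate, is_palindromic(candidate))
```

Note that a sequence of odd length can never pass this test, because its middle base would have to be its own complement.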

A palindromic nucleotide sequence is capable of forming a hairpin. The stem of the hairpin is a pseudo-double-stranded region, since the entire hairpin is part of the same (single) strand of nucleic acid. Palindromic motifs are found in most genomes or sets of genetic instructions. They have been studied in particular in bacterial chromosomes and in the so-called Bacterial Interspersed Mosaic Elements (BIMEs) scattered over them. In 2008, a genome sequencing project discovered that large portions of the human X and Y chromosomes are arranged as palindromes. [1] A palindromic structure allows the Y chromosome to repair itself by bending over at the middle if one side is damaged.

Palindromes also appear to be found frequently in the peptide sequences that make up proteins, [2] [3] but their role in protein function is not clearly known. It has been suggested that the existence of palindromes in peptides might be related to the prevalence of low-complexity regions in proteins, as palindromes are frequently associated with low-complexity sequences. Their prevalence may also be related to the propensity of such sequences to form alpha helices [4] or protein/protein complexes. [5]

Examples

Restriction enzyme sites

Palindromic sequences play an important role in molecular biology. Because DNA is double-stranded, the base pairs (not just the bases on one strand) are read to determine whether a sequence is a palindrome. Many restriction endonucleases (restriction enzymes) recognize specific palindromic sequences and cut them. The restriction enzyme EcoRI recognizes the following palindromic sequence:

 5'- G  A  A  T  T  C -3'
 3'- C  T  T  A  A  G -5'

The top strand reads 5'-GAATTC-3', while the bottom strand reads 3'-CTTAAG-5'. If the DNA molecule is flipped over, the two strands read exactly the same (5'-GAATTC-3' and 3'-CTTAAG-5'). Here are more restriction enzymes and the palindromic sequences which they recognize:

Enzyme   Source                       Recognition sequence   Cut
EcoRI    Escherichia coli             5'-GAATTC-3'           5'---G     AATTC---3'
                                      3'-CTTAAG-5'           3'---CTTAA     G---5'
BamHI    Bacillus amyloliquefaciens   5'-GGATCC-3'           5'---G     GATCC---3'
                                      3'-CCTAGG-5'           3'---CCTAG     G---5'
TaqI     Thermus aquaticus            5'-TCGA-3'             5'---T   CGA---3'
                                      3'-AGCT-5'             3'---AGC   T---5'
AluI*    Arthrobacter luteus          5'-AGCT-3'             5'---AG  CT---3'
                                      3'-TCGA-5'             3'---TC  GA---5'
* = blunt ends
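As a rough illustration of how the table can be used, the sketch below stores each recognition sequence with the position of its top-strand cut, checks that the site is palindromic, and locates the sites in a made-up test sequence. The `ENZYMES` encoding, the function names and the sequence `dna` are assumptions made for this example and do not come from any particular library.

```python
# Recognition sites from the table, written 5'->3' on the top strand.
# The integer is the offset of the top-strand cut within the site
# (an encoding assumed for this sketch).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "TaqI": ("TCGA", 1),
    "AluI": ("AGCT", 2),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def end_type(site, cut_index):
    """Classify the ends left by cutting a palindromic site.

    On a palindromic site the bottom-strand cut mirrors the top-strand cut,
    so the ends are blunt only when the cut falls exactly in the middle.
    """
    return "blunt" if 2 * cut_index == len(site) else "sticky"


def find_sites(dna, site):
    """Return the start positions of every occurrence of site in dna."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    dna = "TTGAATTCAGCTGGATCCAA"  # hypothetical test sequence
    for name, (site, cut_index) in ENZYMES.items():
        palindromic = site == reverse_complement(site)
        print(name, site, palindromic, end_type(site, cut_index), find_sites(dna, site))
```

Because every site in the table equals its own reverse complement, searching the top strand alone is enough: any occurrence on the bottom strand appears at the same position on the top strand.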

Methylation sites

Palindromic sequences may also have methylation sites. [citation needed] These are the sites where a methyl group can be attached to the palindromic sequence. Methylation makes the resistance gene inactive; this is called insertional inactivation or insertional mutagenesis. For example, in pBR322, methylation at the tetracycline resistance gene makes the plasmid susceptible to tetracycline: if the plasmid is exposed to the antibiotic tetracycline after methylation at this gene, it does not survive.

Palindromic nucleotides in T cell receptors

Diversity of T cell receptor (TCR) genes is generated by nucleotide insertions upon V(D)J recombination from their germline-encoded V, D and J segments. Nucleotide insertions at V-D and D-J junctions are random, but some small subsets of these insertions are exceptional, in that one to three base pairs inversely repeat the sequence of the germline DNA. These short complementary palindromic sequences are called P nucleotides. [6]
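The idea of a short insertion that inversely repeats the germline sequence can be sketched as follows. This is a simplified model that assumes the P nucleotides are the reverse complement of the last one to three germline bases at the junction; the function name `add_p_nucleotides` and the segment end `TGTGCCAGA` are hypothetical and chosen only for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def add_p_nucleotides(germline_end, n):
    """Append n P nucleotides to the 3' end of a germline segment.

    Simplifying assumption: the inserted bases are the reverse complement
    of the last n germline bases, so the junction reads as a short palindrome.
    """
    if not 1 <= n <= 3:
        raise ValueError("this sketch models insertions of one to three bases")
    return germline_end + reverse_complement(germline_end[-n:])


if __name__ == "__main__":
    segment_end = "TGTGCCAGA"  # hypothetical germline segment end
    for n in (1, 2, 3):
        extended = add_p_nucleotides(segment_end, n)
        junction = extended[-2 * n:]
        # The last 2n bases should equal their own reverse complement.
        print(n, extended, junction, junction == reverse_complement(junction))
```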

Related Research Articles

A restriction enzyme, restriction endonuclease, REase, ENase, or restrictase is an enzyme that cleaves DNA into fragments at or near specific recognition sites within molecules known as restriction sites. Restriction enzymes are one class of the broader endonuclease group of enzymes. Restriction enzymes are commonly classified into five types, which differ in their structure and in whether they cut their DNA substrate at their recognition site or whether the recognition and cleavage sites are separate from one another. To cut DNA, all restriction enzymes make two incisions, one through each sugar-phosphate backbone of the DNA double helix.

An inverted repeat is a single stranded sequence of nucleotides followed downstream by its reverse complement. The intervening sequence of nucleotides between the initial sequence and the reverse complement can be any length including zero. For example, 5'---TTACGnnnnnnCGTAA---3' is an inverted repeat sequence. When the intervening length is zero, the composite sequence is a palindromic sequence.
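The definition above can be checked mechanically. The sketch below assumes the caller already knows the arm length and simply tests whether the first arm is mirrored by its reverse complement at the other end; the function names are illustrative, and the spacer (written `n` in the example) is never inspected.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def is_inverted_repeat(seq, arm_length):
    """True if the first arm_length bases reappear, reverse-complemented,
    as the last arm_length bases. Whatever lies between is the spacer."""
    seq = seq.upper()
    if arm_length <= 0 or 2 * arm_length > len(seq):
        return False
    left_arm = seq[:arm_length]
    right_arm = seq[-arm_length:]
    return right_arm == reverse_complement(left_arm)


def spacer_length(seq, arm_length):
    """Number of bases between the two arms."""
    return len(seq) - 2 * arm_length


if __name__ == "__main__":
    # Example from the text, and the same arms with a spacer of length zero.
    for candidate in ("TTACGnnnnnnCGTAA", "TTACGCGTAA"):
        print(candidate, is_inverted_repeat(candidate, 5), spacer_length(candidate, 5))
```

With a spacer of length zero the whole sequence equals its own reverse complement, which is the palindromic case.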

Transcription (biology): Process of copying a segment of DNA into RNA

Transcription is the process of copying a segment of DNA into RNA. The segments of DNA transcribed into RNA molecules that can encode proteins are said to produce messenger RNA (mRNA). Other segments of DNA are copied into RNA molecules called non-coding RNAs (ncRNAs). mRNA comprises only 1–3% of total RNA samples. Less than 2% of the human genome can be transcribed into mRNA, while at least 80% of mammalian genomic DNA can be actively transcribed, with the majority of this 80% considered to be ncRNA.

Protein engineering is the process of developing useful or valuable proteins through the design and production of unnatural polypeptides, often by altering amino acid sequences found in nature. It is a young discipline, with much research taking place into the understanding of protein folding and recognition for protein design principles. It has been used to improve the function of many enzymes for industrial catalysis. It is also a product and services market, with an estimated value of $168 billion by 2017.

The restriction modification system is found in bacteria and other prokaryotic organisms, and provides a defense against foreign DNA, such as that borne by bacteriophages.

Site-directed mutagenesis is a molecular biology method that is used to make specific and intentional mutating changes to the DNA sequence of a gene and any gene products. Also called site-specific mutagenesis or oligonucleotide-directed mutagenesis, it is used for investigating the structure and biological activity of DNA, RNA, and protein molecules, and for protein engineering.

Transfer DNA: Type of DNA in bacterial genomes

The transfer DNA is the transferred DNA of the tumor-inducing (Ti) plasmid of some species of bacteria such as Agrobacterium tumefaciens and Agrobacterium rhizogenes. The T-DNA is transferred from bacterium into the host plant's nuclear DNA genome. The capability of this specialized tumor-inducing (Ti) plasmid is attributed to two essential regions required for DNA transfer to the host cell. The T-DNA is bordered by 25-base-pair repeats on each end. Transfer is initiated at the right border and terminated at the left border and requires the vir genes of the Ti plasmid.

DNA repair: Cellular mechanism

DNA repair is a collection of processes by which a cell identifies and corrects damage to the DNA molecules that encode its genome. In human cells, both normal metabolic activities and environmental factors such as radiation can cause DNA damage, resulting in tens of thousands of individual molecular lesions per cell per day. Many of these lesions cause structural damage to the DNA molecule and can alter or eliminate the cell's ability to transcribe the gene that the affected DNA encodes. Other lesions induce potentially harmful mutations in the cell's genome, which affect the survival of its daughter cells after it undergoes mitosis. As a consequence, the DNA repair process is constantly active as it responds to damage in the DNA structure. When normal repair processes fail, and when cellular apoptosis does not occur, irreparable DNA damage may occur, including double-strand breaks and DNA crosslinkages. This can eventually lead to malignant tumors, or cancer as per the two-hit hypothesis.

Restriction sites, or restriction recognition sites, are located on a DNA molecule containing specific sequences of nucleotides, which are recognized by restriction enzymes. These are generally palindromic sequences, and a particular restriction enzyme may cut the sequence between two nucleotides within its recognition site, or somewhere nearby.

Restriction fragment

A restriction fragment is a DNA fragment resulting from the cutting of a DNA strand by a restriction enzyme, a process called restriction. Each restriction enzyme is highly specific, recognising a particular short DNA sequence, or restriction site, and cutting both DNA strands at specific points within this site. Most restriction sites are palindromic and are four to eight nucleotides long. Many cuts are made by one restriction enzyme because of the chance repetition of these sequences in a long DNA molecule, yielding a set of restriction fragments. A particular DNA molecule will always yield the same set of restriction fragments when exposed to the same restriction enzyme. Restriction fragments can be analyzed using techniques such as gel electrophoresis or used in recombinant DNA technology.
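A toy digest makes the last points concrete: the same molecule and the same enzyme always give the same fragments. The sketch below assumes a linear molecule, tracks only the top strand, and uses a made-up sequence; `digest` and its parameters are names chosen for this example.

```python
def digest(dna, site, cut_index):
    """Cut the top strand of a linear molecule at every occurrence of site.

    cut_index is the offset of the cut within the recognition site
    (for example 1 for a site cut as G^AATTC). Returns the fragments in order.
    """
    fragments = []
    previous_cut = 0
    start = dna.find(site)
    while start != -1:
        cut = start + cut_index
        fragments.append(dna[previous_cut:cut])
        previous_cut = cut
        start = dna.find(site, start + 1)
    # Whatever follows the last cut is the final fragment.
    fragments.append(dna[previous_cut:])
    return fragments


if __name__ == "__main__":
    dna = "AAGAATTCCCCGAATTCTT"  # hypothetical molecule with two GAATTC sites
    fragments = digest(dna, "GAATTC", 1)
    print(fragments)
    print([len(fragment) for fragment in fragments])
```

The fragment lengths printed at the end are the kind of information that gel electrophoresis resolves; the overhang on the bottom strand is ignored in this simplified model.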

pBR322: Artificial plasmid

pBR322 is a plasmid and was one of the first widely used E. coli cloning vectors. Created in 1977 in the laboratory of Herbert Boyer at the University of California, San Francisco, it was named after Francisco Bolivar Zapata, a postdoctoral researcher, and Raymond L. Rodriguez. The p stands for "plasmid," and BR for "Bolivar" and "Rodriguez."

In molecular biology and genetics, the sense of a nucleic acid molecule, particularly of a strand of DNA or RNA, refers to the nature of the roles of the strand and its complement in specifying a sequence of amino acids. Depending on the context, sense may have slightly different meanings. For example, the negative-sense strand of DNA is equivalent to the template strand, whereas the positive-sense strand is the non-template strand whose nucleotide sequence is equivalent to the sequence of the mRNA transcript.

Artificial gene synthesis, or simply gene synthesis, refers to a group of methods that are used in synthetic biology to construct and assemble genes from nucleotides de novo. Unlike DNA synthesis in living cells, artificial gene synthesis does not require template DNA, allowing virtually any DNA sequence to be synthesized in the laboratory. It comprises two main steps, the first of which is solid-phase DNA synthesis, sometimes known as DNA printing. This produces oligonucleotide fragments that are generally under 200 base pairs. The second step then involves connecting these oligonucleotide fragments using various DNA assembly methods. Because artificial gene synthesis does not require template DNA, it is theoretically possible to make a completely synthetic DNA molecule with no limits on the nucleotide sequence or size.

PstI is a type II restriction endonuclease isolated from the Gram-negative species Providencia stuartii.

Complementarity (molecular biology): Lock-and-key pairing between two structures

In molecular biology, complementarity describes a relationship between two structures each following the lock-and-key principle. In nature complementarity is the base principle of DNA replication and transcription as it is a property shared between two DNA or RNA sequences, such that when they are aligned antiparallel to each other, the nucleotide bases at each position in the sequences will be complementary, much like looking in the mirror and seeing the reverse of things. This complementary base pairing allows cells to copy information from one generation to another and even find and repair damage to the information stored in the sequences.

DNA ends refer to the properties of the ends of linear DNA molecules, which in molecular biology are described as "sticky" or "blunt" based on the shape of the complementary strands at the terminus. In sticky ends, one strand is longer than the other, such that the longer strand has bases which are left unpaired. In blunt ends, both strands are of equal length – i.e. they end at the same base position, leaving no unpaired bases on either strand.

No-SCAR genome editing is an editing method that is able to manipulate the Escherichia coli genome. The system relies on recombineering whereby DNA sequences are combined and manipulated through homologous recombination. No-SCAR is able to manipulate the E. coli genome without the use of the chromosomal markers detailed in previous recombineering methods. Instead, the λ-Red recombination system facilitates donor DNA integration while Cas9 cleaves double-stranded DNA to counter-select against wild-type cells. Although λ-Red and Cas9 genome editing are widely used technologies, the no-SCAR method is novel in combining the two functions; this technique is able to establish point mutations, gene deletions, and short sequence insertions in several genomic loci with increased efficiency and time sensitivity.

Cruciform DNA

Cruciform DNA is a form of non-B DNA, or an alternative DNA structure. The formation of cruciform DNA requires the presence of palindromes called inverted repeat sequences. These inverted repeats contain a sequence of DNA in one strand that is repeated in the opposite direction on the other strand. As a result, inverted repeats are self-complementary and can give rise to structures such as hairpins and cruciforms. Cruciform DNA structures require at least a six nucleotide sequence of inverted repeats to form a structure consisting of a stem, branch point and loop in the shape of a cruciform, stabilized by negative DNA supercoiling.

This glossary of genetics is a list of definitions of terms and concepts commonly used in the study of genetics and related disciplines in biology, including molecular biology, cell biology, and evolutionary biology. It is intended as introductory material for novices; for more specific and technical detail, see the article corresponding to each term. For related terms, see Glossary of evolutionary biology.

This glossary of cell and molecular biology is a list of definitions of terms and concepts commonly used in the study of cell biology, molecular biology, and related disciplines, including genetics, microbiology, and biochemistry.

References

  1. Larionov S, Loskutov A, Ryadchenko E (February 2008). "Chromosome evolution with naked eye: palindromic context of the life origin". Chaos. 18 (1): 013105. Bibcode:2008Chaos..18a3105L. doi:10.1063/1.2826631. PMID 18377056.
  2. Ohno S (1990). "Intrinsic evolution of proteins. The role of peptidic palindromes". Riv. Biol. 83 (2–3): 287–91, 405–10. PMID 2128128.
  3. Giel-Pietraszuk M, Hoffmann M, Dolecka S, Rychlewski J, Barciszewski J (February 2003). "Palindromes in proteins". J. Protein Chem. 22 (2): 109–13. doi:10.1023/A:1023454111924. PMID 12760415. S2CID 28294669. Archived from the original (PDF) on 2019-12-14. Retrieved 2011-02-25.
  4. Sheari A, Kargar M, Katanforoush A, et al. (2008). "A tale of two symmetrical tails: structural and functional characteristics of palindromes in proteins". BMC Bioinformatics. 9: 274. doi:10.1186/1471-2105-9-274. PMC 2474621. PMID 18547401.
  5. Pinotsis N, Wilmanns M (October 2008). "Protein assemblies with palindromic structure motifs". Cell. Mol. Life Sci. 65 (19): 2953–6. doi:10.1007/s00018-008-8265-1. PMID 18791850. S2CID 29569626.
  6. Srivastava SK, Robins HS (2012). "Palindromic nucleotide analysis in human T cell receptor rearrangements". PLOS ONE. 7 (12): e52250. Bibcode:2012PLoSO...752250S. doi:10.1371/journal.pone.0052250. PMC 3528771. PMID 23284955.