
Base pairing: four nucleotide monomers form two base pairs; the nucleobases are shown in blue. Guanine (G) is paired with cytosine (C) via three hydrogen bonds, and adenine (A) is paired with uracil (U) via two hydrogen bonds; the hydrogen bonds are shown in red.
Purine nucleobases are fused-ring molecules.
Pyrimidine nucleobases are simple ring molecules.

Nucleobases, also known as nitrogenous bases or often simply bases, are nitrogen-containing biological compounds that form nucleosides, which in turn are components of nucleotides; together these monomers are the basic building blocks of nucleic acids. The ability of nucleobases to form base pairs and to stack one upon another leads directly to long-chain helical structures such as ribonucleic acid (RNA) and deoxyribonucleic acid (DNA). Five nucleobases, adenine (A), cytosine (C), guanine (G), thymine (T), and uracil (U), are called primary or canonical. They function as the fundamental units of the genetic code: A, G, C, and T are found in DNA, while A, G, C, and U are found in RNA. Thymine and uracil differ only in the presence or absence of a methyl group on the fifth carbon (C5) of their heterocyclic six-membered rings. [1] In addition, some viruses have aminoadenine (Z) instead of adenine. It differs in having an extra amine group, which creates a more stable bond to thymine. [2]


Adenine and guanine have a fused-ring skeletal structure derived from purine, hence they are called purine bases. The purine nitrogenous bases are characterized by their single amino group (−NH2), at the C6 carbon in adenine and at C2 in guanine. [3] Similarly, the simple-ring structure of cytosine, uracil, and thymine is derived from pyrimidine, so those three bases are called the pyrimidine bases.
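The purine/pyrimidine grouping described above can be written as a small lookup. This is only an illustrative sketch: the names `PURINES`, `PYRIMIDINES`, and `base_class` are hypothetical and chosen for this example, and only the five canonical one-letter codes are handled.

```python
# Illustrative lookup; the constant and function names are assumptions of this sketch.
PURINES = frozenset("AG")        # fused-ring bases: adenine, guanine
PYRIMIDINES = frozenset("CTU")   # single-ring bases: cytosine, thymine, uracil


def base_class(base: str) -> str:
    """Return 'purine' or 'pyrimidine' for a canonical one-letter base code."""
    code = base.upper()
    if code in PURINES:
        return "purine"
    if code in PYRIMIDINES:
        return "pyrimidine"
    # Modified or artificial bases are deliberately not covered here.
    raise ValueError(f"not a canonical nucleobase: {base!r}")


if __name__ == "__main__":
    for code in "ACGTU":
        print(code, base_class(code))
```

Rejecting unknown letters, rather than guessing a class, keeps the sketch limited to the five bases the text calls primary or canonical.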

Each of the base pairs in a typical double-helix DNA comprises a purine and a pyrimidine: either an A paired with a T or a C paired with a G. These purine-pyrimidine pairs, which are called base complements, connect the two strands of the helix and are often compared to the rungs of a ladder. Because a purine always pairs with a pyrimidine, the DNA has a constant width. The AT pairing is based on two hydrogen bonds, while the CG pairing is based on three. In both cases, the hydrogen bonds are between the amine and carbonyl groups on the complementary bases.
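As a rough illustration of these pairing rules, the sketch below counts the hydrogen bonds between a DNA strand and its fully complementary partner, using two bonds per AT pair and three per CG pair as stated above. The names `DNA_PAIR`, `H_BONDS`, and `hydrogen_bonds` are hypothetical, and the count is simple bookkeeping, not a model of duplex stability.

```python
# Watson-Crick partners in DNA, and the hydrogen-bond count the text gives per pair.
# All names here are assumptions made for this example.
DNA_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}
H_BONDS = {frozenset("AT"): 2, frozenset("CG"): 3}


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds between `strand` and its fully complementary partner."""
    total = 0
    for base in strand.upper():
        partner = DNA_PAIR[base]  # raises KeyError for letters other than A, C, G, T
        total += H_BONDS[frozenset((base, partner))]
    return total


if __name__ == "__main__":
    print(hydrogen_bonds("ATGC"))
```

Keying the bond counts on an unordered pair means the A–T and T–A orientations share one entry.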

Nucleobases such as adenine, guanine, xanthine, hypoxanthine, purine, 2,6-diaminopurine, and 6,8-diaminopurine may have formed in outer space as well as on Earth. [4] [5] [6]

The origin of the term base reflects these compounds' chemical properties in acid–base reactions, but those properties are not especially important for understanding most of the biological functions of nucleobases.


Chemical structure of DNA, showing four nucleobase pairs produced by eight nucleotides: adenine (A) is joined to thymine (T), and guanine (G) is joined to cytosine (C). This structure also shows the directionality of each of the two phosphate-deoxyribose backbones, or strands. The 5' to 3' (read "5 prime to 3 prime") directions are: down the strand on the left, and up the strand on the right. The strands twist around each other to form a double helix structure.

At the sides of the nucleic acid structure, phosphate groups successively connect the sugar rings of adjacent nucleotide monomers, thereby creating a long-chain biomolecule. These chain-joins of phosphates with sugars (ribose or deoxyribose) create the "backbone" strands of a single- or double-helix biomolecule. In the double helix of DNA, the two strands are oriented chemically in opposite directions, which permits base pairing by providing complementarity between the two bases, and which is essential for replication or transcription of the encoded information found in DNA.
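Because the two strands run in opposite directions, the partner of a strand written 5' to 3' is obtained by complementing every base and then reversing the result, so that it too reads 5' to 3'. A minimal sketch, assuming an uppercase or lowercase DNA string containing only A, C, G, and T (the function name `reverse_complement` is chosen for this example):

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq_5to3: str) -> str:
    """Return the complementary strand, also written 5' to 3'."""
    complemented = seq_5to3.upper().translate(_COMPLEMENT)
    # Reversing accounts for the antiparallel orientation of the partner strand.
    return complemented[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, which reflects the redundancy between the two strands.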

Modified nucleobases

DNA and RNA also contain other (non-primary) bases that have been modified after the nucleic acid chain has been formed. In DNA, the most common modified base is 5-methylcytosine (m5C). In RNA, there are many modified bases, including those contained in the nucleosides pseudouridine (Ψ), dihydrouridine (D), inosine (I), and 7-methylguanosine (m7G). [7] [8]

Hypoxanthine and xanthine are two of the many bases created by the action of mutagens, both of them through deamination (replacement of the amine group with a carbonyl group). Hypoxanthine is produced from adenine and xanthine from guanine, [9] while uracil results from the deamination of cytosine.
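The three deamination products named above can be summarized as a mapping. The sketch is illustrative only: `DEAMINATION_PRODUCT` and `deaminate` are hypothetical names, and bases that the text does not list return `None`.

```python
from typing import Optional

# Deamination products as listed in the text (amine group replaced by a carbonyl group).
DEAMINATION_PRODUCT = {
    "adenine": "hypoxanthine",
    "guanine": "xanthine",
    "cytosine": "uracil",
}


def deaminate(base: str) -> Optional[str]:
    """Return the deamination product named in the text, or None if none is listed."""
    return DEAMINATION_PRODUCT.get(base.lower())


if __name__ == "__main__":
    for name in ("adenine", "guanine", "cytosine", "thymine"):
        print(name, "->", deaminate(name))
```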

Modified purine nucleobases

These are examples of modified adenosine or guanosine.

Structures shown: hypoxanthine (nucleobase) and inosine (nucleoside).

Modified pyrimidine nucleobases

These are examples of modified cytidine, thymidine, or uridine.

Structures shown: dihydrouracil (nucleobase) and dihydrouridine (nucleoside).

Artificial nucleobases

A vast number of nucleobase analogues exist. Their most common application is as fluorescent probes, either directly or indirectly; aminoallyl nucleotides, for example, are used to label cRNA or cDNA in microarrays. Several groups are working on alternative "extra" base pairs to extend the genetic code, such as isoguanine and isocytosine or the fluorescent 2-amino-6-(2-thienyl)purine and pyrrole-2-carbaldehyde. [10] [11]

In medicine, several nucleoside analogues are used as anticancer and antiviral agents. The viral polymerase incorporates these compounds with non-canonical bases. These compounds are activated in the cells by being converted into nucleotides; they are administered as nucleosides because charged nucleotides cannot easily cross cell membranes. At least one set of new base pairs has been announced as of May 2014. [12]

Prebiotic condensation of nucleobases with ribose

Understanding how life arose requires knowledge of the chemical pathways that permit formation of the key building blocks of life under plausible prebiotic conditions. According to the RNA world hypothesis, free-floating ribonucleotides were present in the primordial soup. These were the fundamental molecules that combined in series to form RNA. Molecules as complex as RNA must have arisen from small molecules whose reactivity was governed by physico-chemical processes. RNA is composed of purine and pyrimidine nucleotides, both of which are necessary for reliable information transfer, and thus for Darwinian evolution. Nam et al. [13] demonstrated the direct condensation of nucleobases with ribose to give ribonucleosides in aqueous microdroplets, a key step leading to RNA formation. Similar results were obtained by Becker et al. [14]


Related Research Articles

Base pair: Unit consisting of two nucleobases bound to each other by hydrogen bonds

A base pair (bp) is a fundamental unit of double-stranded nucleic acids consisting of two nucleobases bound to each other by hydrogen bonds. They form the building blocks of the DNA double helix and contribute to the folded structure of both DNA and RNA. Dictated by specific hydrogen bonding patterns, "Watson–Crick" base pairs allow the DNA helix to maintain a regular helical structure that is subtly dependent on its nucleotide sequence. The complementary nature of this base-paired structure provides a redundant copy of the genetic information encoded within each strand of DNA. The regular structure and data redundancy provided by the DNA double helix make DNA well suited to the storage of genetic information, while base-pairing between DNA and incoming nucleotides provides the mechanism through which DNA polymerase replicates DNA and RNA polymerase transcribes DNA into RNA. Many DNA-binding proteins can recognize specific base-pairing patterns that identify particular regulatory regions of genes.

Cytosine: Chemical compound in nucleic acids

Cytosine is one of the four nucleobases found in DNA and RNA, along with adenine, guanine, and thymine (uracil in RNA). It is a pyrimidine derivative, with a heterocyclic aromatic ring and two substituents attached. The nucleoside of cytosine is cytidine. In Watson-Crick base pairing, it forms three hydrogen bonds with guanine.

Guanine: Chemical compound of DNA and RNA

Guanine is one of the four main nucleobases found in the nucleic acids DNA and RNA, the others being adenine, cytosine, and thymine (uracil in RNA). In DNA, guanine is paired with cytosine. The guanine nucleoside is called guanosine.

Nucleic acid: Class of large biomolecules essential to all known life

Nucleic acids are biopolymers, macromolecules, essential to all known forms of life. They are composed of nucleotides, which are the monomers made of three components: a 5-carbon sugar, a phosphate group and a nitrogenous base. The two main classes of nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). If the sugar is ribose, the polymer is RNA; if the sugar is the ribose derivative deoxyribose, the polymer is DNA.

Nucleotide: Biological molecules that form the building blocks of nucleic acids

Nucleotides are organic molecules consisting of a nucleoside and a phosphate. They serve as monomeric units of the nucleic acid polymers – deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), both of which are essential biomolecules within all life-forms on Earth. Nucleotides are obtained in the diet and are also synthesized from common nutrients by the liver.

Pyrimidine is an aromatic, heterocyclic, organic compound similar to pyridine. One of the three diazines, it has nitrogen atoms at positions 1 and 3 in the ring. The other diazines are pyrazine and pyridazine.

Thymine: Chemical compound of DNA

Thymine is one of the four nucleobases in the nucleic acid of DNA that are represented by the letters G–C–A–T. The others are adenine, guanine, and cytosine. Thymine is also known as 5-methyluracil, a pyrimidine nucleobase. In RNA, thymine is replaced by the nucleobase uracil. Thymine was first isolated in 1893 by Albrecht Kossel and Albert Neumann from calf thymus glands, hence its name.

Hypoxanthine: Chemical compound

Hypoxanthine is a naturally occurring purine derivative. It is occasionally found as a constituent of nucleic acids, where it is present in the anticodon of tRNA in the form of its nucleoside inosine. It has a tautomer known as 6-hydroxypurine. Hypoxanthine is a necessary additive in certain cells, bacteria, and parasite cultures as a substrate and nitrogen source. For example, it is commonly a required reagent in malaria parasite cultures, since Plasmodium falciparum requires a source of hypoxanthine for nucleic acid synthesis and energy metabolism.

Deamination is the removal of an amino group from a molecule. Enzymes that catalyse this reaction are called deaminases.

Nucleic acid sequence: Succession of nucleotides in a nucleic acid

A nucleic acid sequence is a succession of bases signified by a series of a set of five different letters that indicate the order of nucleotides forming alleles within a DNA or RNA (GACU) molecule. By convention, sequences are usually presented from the 5' end to the 3' end. For DNA, the sense strand is used. Because nucleic acids are normally linear (unbranched) polymers, specifying the sequence is equivalent to defining the covalent structure of the entire molecule. For this reason, the nucleic acid sequence is also termed the primary structure.

Ribonucleotide: Nucleotide containing ribose as its pentose component

In biochemistry, a ribonucleotide is a nucleotide containing ribose as its pentose component. It is considered a molecular precursor of nucleic acids. Nucleotides are the basic building blocks of DNA and RNA. Ribonucleotides themselves are basic monomeric building blocks for RNA. Deoxyribonucleotides, formed by reducing ribonucleotides with the enzyme ribonucleotide reductase (RNR), are essential building blocks for DNA. There are several differences between DNA deoxyribonucleotides and RNA ribonucleotides. Successive nucleotides are linked together via phosphodiester bonds.

Wobble base pair: RNA base pair that does not follow Watson-Crick base pair rules

A wobble base pair is a pairing between two nucleotides in RNA molecules that does not follow Watson-Crick base pair rules. The four main wobble base pairs are guanine-uracil (G-U), hypoxanthine-uracil (I-U), hypoxanthine-adenine (I-A), and hypoxanthine-cytosine (I-C). In order to maintain consistency of nucleic acid nomenclature, "I" is used for hypoxanthine because hypoxanthine is the nucleobase of inosine; nomenclature otherwise follows the names of nucleobases and their corresponding nucleosides. The thermodynamic stability of a wobble base pair is comparable to that of a Watson-Crick base pair. Wobble base pairs are fundamental in RNA secondary structure and are critical for the proper translation of the genetic code.

A salvage pathway is a pathway in which a biological product is produced from intermediates in the degradative pathway of its own or a similar substance. The term often refers to nucleotide salvage in particular, in which nucleotides are synthesized from intermediates in their degradative pathway.

A nucleoside triphosphate is a nucleoside containing a nitrogenous base bound to a 5-carbon sugar, with three phosphate groups bound to the sugar. They are the molecular precursors of both DNA and RNA, which are chains of nucleotides made through the processes of DNA replication and transcription. Nucleoside triphosphates also serve as a source of energy for cellular reactions and are involved in signalling pathways.

Nucleic acid metabolism: Process

Nucleic acid metabolism is a collective term that refers to the variety of chemical reactions by which nucleic acids are either synthesized or degraded. Nucleic acids are polymers made up of a variety of monomers called nucleotides. Nucleotide synthesis is an anabolic mechanism generally involving the chemical reaction of phosphate, pentose sugar, and a nitrogenous base. Degradation of nucleic acids is a catabolic reaction and the resulting parts of the nucleotides or nucleobases can be salvaged to recreate new nucleotides. Both synthesis and degradation reactions require multiple enzymes to facilitate the event. Defects or deficiencies in these enzymes can lead to a variety of diseases.

Nucleic acid analogue: Compound analogous to naturally occurring RNA and DNA

Nucleic acid analogues are compounds which are analogous to naturally occurring RNA and DNA, used in medicine and in molecular biology research. Nucleic acids are chains of nucleotides, which are composed of three parts: a phosphate backbone, a pentose sugar, either ribose or deoxyribose, and one of four nucleobases. An analogue may have any of these altered. Typically the analogue nucleobases confer, among other things, different base pairing and base stacking properties. Examples include universal bases, which can pair with all four canonical bases, and phosphate-sugar backbone analogues such as PNA, which affect the properties of the chain. Nucleic acid analogues are also called xeno nucleic acids and represent one of the main pillars of xenobiology, the design of new-to-nature forms of life based on alternative biochemistries.

Nucleic acid structure: Biomolecular structure of nucleic acids such as DNA and RNA

Nucleic acid structure refers to the structure of nucleic acids such as DNA and RNA. Chemically speaking, DNA and RNA are very similar. Nucleic acid structure is often divided into four different levels: primary, secondary, tertiary, and quaternary.

Nucleic acid secondary structure

Nucleic acid secondary structure is the basepairing interactions within a single nucleic acid polymer or between two polymers. It can be represented as a list of bases which are paired in a nucleic acid molecule. The secondary structures of biological DNAs and RNAs tend to be different: biological DNA mostly exists as fully base paired double helices, while biological RNA is single stranded and often forms complex and intricate base-pairing interactions due to its increased ability to form hydrogen bonds stemming from the extra hydroxyl group in the ribose sugar.

Complementarity (molecular biology): Lock-and-key pairing between two structures

In molecular biology, complementarity describes a relationship between two structures each following the lock-and-key principle. In nature complementarity is the base principle of DNA replication and transcription as it is a property shared between two DNA or RNA sequences, such that when they are aligned antiparallel to each other, the nucleotide bases at each position in the sequences will be complementary, much like looking in the mirror and seeing the reverse of things. This complementary base pairing allows cells to copy information from one generation to another and even find and repair damage to the information stored in the sequences.

Non-canonical base pairing occurs when nucleobases hydrogen bond, or base pair, to one another in schemes other than the standard Watson-Crick base pairs. There are three main types of non-canonical base pairs: those stabilized by polar hydrogen bonds, those having interactions among C−H and O/N groups, and those that have hydrogen bonds between the bases themselves. The first discovered non-canonical base pairs are Hoogsteen base pairs, which were first described by American biochemist Karst Hoogsteen.


  1. Soukup, Garrett A. (2003). "Nucleic Acids: General Properties". eLS. American Cancer Society. doi:10.1038/npg.els.0001335. ISBN 9780470015902.
  2. "Some viruses thwart bacterial defenses with a unique genetic alphabet". 5 May 2021.
  3. Berg JM, Tymoczko JL, Stryer L. "Section 25.2, Purine Bases Can Be Synthesized de Novo or Recycled by Salvage Pathways". Biochemistry. 5th Edition. Retrieved 11 December 2019.
  4. Callahan MP, Smith KE, Cleaves HJ, Ruzicka J, Stern JC, Glavin DP, House CH, Dworkin JP (August 2011). "Carbonaceous meteorites contain a wide range of extraterrestrial nucleobases". Proceedings of the National Academy of Sciences of the United States of America. 108 (34): 13995–8. Bibcode:2011PNAS..10813995C. doi:10.1073/pnas.1106493108. PMC 3161613. PMID 21836052.
  5. Steigerwald, John (8 August 2011). "NASA Researchers: DNA Building Blocks Can Be Made in Space". NASA. Retrieved 10 August 2011.
  6. ScienceDaily Staff (9 August 2011). "DNA Building Blocks Can Be Made in Space, NASA Evidence Suggests". ScienceDaily. Retrieved 9 August 2011.
  7. Stavely, Brian E. "BIOL2060: Translation". Retrieved 17 August 2020.
  8. "Role of 5' mRNA and 5' U snRNA cap structures in regulation of gene expression". Research. Retrieved 13 December 2010.
  9. Nguyen T, Brunson D, Crespi CL, Penman BW, Wishnok JS, Tannenbaum SR (April 1992). "DNA damage and mutation in human cells exposed to nitric oxide in vitro". Proceedings of the National Academy of Sciences of the United States of America. 89 (7): 3030–4. Bibcode:1992PNAS...89.3030N. doi:10.1073/pnas.89.7.3030. PMC 48797. PMID 1557408.
  10. Johnson SC, Sherrill CB, Marshall DJ, Moser MJ, Prudent JR (2004). "A third base pair for the polymerase chain reaction: inserting isoC and isoG". Nucleic Acids Research. 32 (6): 1937–41. doi:10.1093/nar/gkh522. PMC 390373. PMID 15051811.
  11. Kimoto M, Mitsui T, Harada Y, Sato A, Yokoyama S, Hirao I (2007). "Fluorescent probing for RNA molecules by an unnatural base-pair system". Nucleic Acids Research. 35 (16): 5360–69. doi:10.1093/nar/gkm508. PMC 2018647. PMID 17693436.
  12. Malyshev DA, Dhami K, Lavergne T, Chen T, Dai N, Foster JM, Corrêa IR, Romesberg FE (May 2014). "A semi-synthetic organism with an expanded genetic alphabet". Nature. 509 (7500): 385–8. Bibcode:2014Natur.509..385M. doi:10.1038/nature13314. PMC 4058825. PMID 24805238.
  13. Nam, Inho; Nam, Hong Gil; Zare, Richard N. (2018). "Abiotic synthesis of purine and pyrimidine ribonucleosides in aqueous microdroplets". Proceedings of the National Academy of Sciences. 115 (1): 36–40. Bibcode:2018PNAS..115...36N. doi:10.1073/pnas.1718559115. PMC 5776833. PMID 29255025.
  14. Becker, Sidney; Feldmann, Jonas; Wiedemann, Stefan; Okamura, Hidenori; Schneider, Christina; Iwan, Katharina; Crisp, Antony; Rossa, Martin; Amatov, Tynchtyk; Carell, Thomas (2019). "Unified prebiotically plausible synthesis of pyrimidine and purine RNA ribonucleotides" (PDF). Science. 366 (6461): 76–82. Bibcode:2019Sci...366...76B. doi:10.1126/science.aax2747. PMID 31604305. S2CID 203719976.