Base pair

The chemical structure of DNA base-pairs

A base pair (bp) is a fundamental unit of double-stranded nucleic acids consisting of two nucleobases bound to each other by hydrogen bonds. Base pairs form the building blocks of the DNA double helix and contribute to the folded structure of both DNA and RNA. Dictated by specific hydrogen bonding patterns, "Watson–Crick" (or "Watson–Crick–Franklin") base pairs (guanine–cytosine and adenine–thymine) [1] allow the DNA helix to maintain a regular helical structure that is subtly dependent on its nucleotide sequence. [2] The complementary nature of this base-paired structure provides a redundant copy of the genetic information encoded within each strand of DNA. The regular structure and data redundancy provided by the DNA double helix make DNA well suited to the storage of genetic information, while base-pairing between DNA and incoming nucleotides provides the mechanism through which DNA polymerase replicates DNA and RNA polymerase transcribes DNA into RNA. Many DNA-binding proteins can recognize specific base-pairing patterns that identify particular regulatory regions of genes.

Intramolecular base pairs can occur within single-stranded nucleic acids. This is particularly important in RNA molecules (e.g., transfer RNA), where Watson–Crick base pairs (guanine–cytosine and adenine–uracil) permit the formation of short double-stranded helices, and a wide variety of non–Watson–Crick interactions (e.g., G–U or A–A) allow RNAs to fold into a vast range of specific three-dimensional structures. In addition, base-pairing between transfer RNA (tRNA) and messenger RNA (mRNA) forms the basis for the molecular recognition events that result in the nucleotide sequence of mRNA becoming translated into the amino acid sequence of proteins via the genetic code.
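As a rough illustration of intramolecular pairing, the sketch below checks whether two arms of a single RNA strand could pair to form a short helical stem, accepting Watson–Crick pairs and the G–U wobble pair mentioned above. The function name `can_form_stem` and the example sequences are hypothetical, and the check ignores loop length, stacking energies and every other factor that governs real RNA folding.

```python
# Pairs allowed in an RNA stem: Watson-Crick (G-C, A-U) plus the G-U wobble pair.
ALLOWED_RNA_PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "U"), ("U", "A"),
    ("G", "U"), ("U", "G"),
}


def can_form_stem(five_prime_arm, three_prime_arm):
    """Return True if two arms of one RNA strand can pair antiparallel.

    Both arms are written 5'->3'. Because the strand folds back on itself,
    the first base of the 5' arm faces the last base of the 3' arm.
    """
    if len(five_prime_arm) != len(three_prime_arm) or not five_prime_arm:
        return False
    facing = zip(five_prime_arm.upper(), reversed(three_prime_arm.upper()))
    return all(pair in ALLOWED_RNA_PAIRS for pair in facing)


# Hypothetical hairpin: 5'-GGGA [loop] UCCC-3'
print(can_form_stem("GGGA", "UCCC"))
```

Reversing the 3′ arm before comparison is what encodes the antiparallel geometry; comparing the arms position by position without reversal would test a parallel arrangement instead.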

The size of an individual gene or an organism's entire genome is often measured in base pairs because DNA is usually double-stranded. Hence, the total number of base pairs is equal to the number of nucleotides in one of the strands (with the exception of non-coding single-stranded regions of telomeres). The haploid human genome (23 chromosomes) is estimated to be about 3.2 billion bases long and to contain 20,000–25,000 distinct protein-coding genes. [3] [4] [5] A kilobase (kb) is a unit of measurement in molecular biology equal to 1000 base pairs of DNA or RNA. [6] The total number of DNA base pairs on Earth is estimated at 5.0 × 10^37, with a weight of 50 billion tonnes. [7] In comparison, the total mass of the biosphere has been estimated to be as much as 4 TtC (trillion tons of carbon). [8]
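The two global estimates quoted above can be cross-checked against each other. The sketch below divides the quoted mass by the quoted number of base pairs and converts the result to grams per mole; the tonne-to-gram factor and the Avogadro constant are standard values, and the variable names are illustrative only.

```python
# Rough consistency check of the global DNA estimates quoted in the text.
AVOGADRO = 6.02214076e23       # particles per mole (exact SI definition)
GRAMS_PER_TONNE = 1.0e6        # one metric tonne in grams

total_base_pairs = 5.0e37      # estimate quoted in the text
total_mass_tonnes = 50e9       # "50 billion tonnes", quoted in the text

# Mass of a single base pair, then of one mole of base pairs.
grams_per_bp = total_mass_tonnes * GRAMS_PER_TONNE / total_base_pairs
grams_per_mole_bp = grams_per_bp * AVOGADRO

print(f"{grams_per_bp:.1e} g per base pair")
print(f"{grams_per_mole_bp:.0f} g/mol per base pair")
```

The result is on the order of 600 g/mol per base pair, close to the figure of roughly 650 g/mol commonly used as the average mass of a base pair in double-stranded DNA, so the two estimates appear mutually consistent.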

Hydrogen bonding and stability

Top, a G.C base pair with three hydrogen bonds. Bottom, an A.T base pair with two hydrogen bonds. Non-covalent hydrogen bonds between the bases are shown as dashed lines. The wiggly lines stand for the connection to the pentose sugar and point in the direction of the minor groove.

Hydrogen bonding is the chemical interaction that underlies the base-pairing rules described above. Appropriate geometrical correspondence of hydrogen bond donors and acceptors allows only the "right" pairs to form stably. DNA with high GC-content is more stable than DNA with low GC-content. Crucially, however, stacking interactions are primarily responsible for stabilising the double-helical structure; Watson-Crick base pairing's contribution to global structural stability is minimal, but its role in the specificity underlying complementarity is, by contrast, of maximal importance as this underlies the template-dependent processes of the central dogma (e.g. DNA replication). [9]

The bigger nucleobases, adenine and guanine, are members of a class of double-ringed chemical structures called purines; the smaller nucleobases, cytosine and thymine (and uracil), are members of a class of single-ringed chemical structures called pyrimidines. Purines are complementary only with pyrimidines: pyrimidine–pyrimidine pairings are energetically unfavorable because the molecules are too far apart for hydrogen bonding to be established; purine–purine pairings are energetically unfavorable because the molecules are too close, leading to overlap repulsion. Purine–pyrimidine base-pairing of AT or GC or UA (in RNA) results in proper duplex structure. The only other purine–pyrimidine pairings would be AC and GT and UG (in RNA); these pairings are mismatches because the patterns of hydrogen donors and acceptors do not correspond. The GU pairing, with two hydrogen bonds, does occur fairly often in RNA (see wobble base pair).
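The pairing rules in this paragraph can be summarised as a small lookup. The following is a minimal sketch, not a chemical model: it labels a pair of one-letter base codes using only the categories named in the text, and the function name `classify_pair` and its return strings are chosen here for illustration.

```python
PURINES = {"A", "G"}            # double-ringed bases
PYRIMIDINES = {"C", "T", "U"}   # single-ringed bases

# Unordered pairs, so that ("A", "T") and ("T", "A") are treated alike.
WATSON_CRICK = {frozenset("AT"), frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def classify_pair(base1, base2):
    """Label a pair of bases according to the rules described in the text."""
    b1, b2 = base1.upper(), base2.upper()
    for b in (b1, b2):
        if b not in PURINES | PYRIMIDINES:
            raise ValueError(f"unknown base: {b!r}")
    pair = frozenset((b1, b2))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    if b1 in PURINES and b2 in PURINES:
        return "purine-purine mismatch"
    if b1 in PYRIMIDINES and b2 in PYRIMIDINES:
        return "pyrimidine-pyrimidine mismatch"
    return "purine-pyrimidine mismatch"


for a, b in [("G", "C"), ("A", "U"), ("G", "U"), ("A", "C"), ("A", "A")]:
    print(a, b, classify_pair(a, b))
```

The sketch does not check whether the two bases belong to the same kind of nucleic acid, so a T paired with a U is simply reported as a pyrimidine–pyrimidine mismatch.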

Paired DNA and RNA molecules are comparatively stable at room temperature, but the two nucleotide strands will separate above a melting point that is determined by the length of the molecules, the extent of mispairing (if any), and the GC content. Higher GC content results in higher melting temperatures; it is, therefore, unsurprising that the genomes of extremophile organisms such as Thermus thermophilus are particularly GC-rich. Conversely, regions of a genome that need to separate frequently, such as the promoter regions of often-transcribed genes, are comparatively GC-poor (for example, see TATA box). GC content and melting temperature must also be taken into account when designing primers for PCR. [citation needed]
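To make the dependence on GC content concrete, the sketch below computes the GC fraction of a sequence and a crude melting-temperature estimate. The estimate uses the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is not taken from this article and is only meant for short oligonucleotides; practical primer design generally relies on nearest-neighbour thermodynamic models instead. The function names are illustrative.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Crude melting temperature in degrees Celsius (Wallace rule of thumb).

    Assumption: Tm = 2 * (A + T) + 4 * (G + C), suitable only for short
    oligonucleotides and ignoring salt and strand concentration.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for oligo in ("AAAATTTT", "ATCGATCG", "GGGGCCCC"):
    print(oligo, gc_content(oligo), wallace_tm(oligo))
```

For oligonucleotides of equal length, the estimate rises with the number of G and C bases, which mirrors the qualitative statement above.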

Examples

The following sequences illustrate paired double-stranded patterns. By convention, the top strand is written from the 5′-end to the 3′-end; thus, the bottom strand is written 3′ to 5′.

A base-paired DNA sequence:
ATCGATTGAGCTCTAGCG
TAGCTAACTCGAGATCGC
The corresponding RNA sequence, in which uracil is substituted for thymine in the RNA strand:
AUCGAUUGAGCUCUAGCG
UAGCUAACUCGAGAUCGC
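The bottom strand of each example can be derived mechanically from the top strand. The sketch below does so with a character translation table; the helper names `complement` and `reverse_complement` are illustrative, and characters other than the four expected bases are passed through unchanged.

```python
# Translation tables mapping each base to its Watson-Crick partner.
DNA_COMPLEMENT = str.maketrans("ATCG", "TAGC")
RNA_COMPLEMENT = str.maketrans("AUCG", "UAGC")


def complement(seq, rna=False):
    """Partner strand, aligned base by base (so it reads 3'->5')."""
    table = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    return seq.upper().translate(table)


def reverse_complement(seq, rna=False):
    """Partner strand rewritten in the conventional 5'->3' direction."""
    return complement(seq, rna)[::-1]


top_dna = "ATCGATTGAGCTCTAGCG"
top_rna = "AUCGAUUGAGCUCUAGCG"
print(complement(top_dna))            # bottom DNA strand, 3'->5'
print(complement(top_rna, rna=True))  # bottom RNA strand, 3'->5'
print(reverse_complement(top_dna))    # same DNA strand, written 5'->3'
```

Applied to the two top strands above, `complement` reproduces the bottom strands exactly as printed in the examples.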

Base analogs and intercalators

Chemical analogs of nucleotides can take the place of proper nucleotides and establish non-canonical base-pairing, leading to errors (mostly point mutations) in DNA replication and DNA transcription. This is due to their isosteric chemistry. One common mutagenic base analog is 5-bromouracil, which resembles thymine but can base-pair to guanine in its enol form. [10]

Other chemicals, known as DNA intercalators, fit into the gap between adjacent bases on a single strand and induce frameshift mutations by "masquerading" as a base, causing the DNA replication machinery to skip or insert additional nucleotides at the intercalated site. Most intercalators are large polyaromatic compounds and are known or suspected carcinogens. Examples include ethidium bromide and acridine. [11] [citation needed]

Mismatch repair

Mismatched base pairs can be generated by errors of DNA replication and as intermediates during homologous recombination. The process of mismatch repair ordinarily must recognize and correctly repair a small number of base mispairs within a long sequence of normal DNA base pairs. To repair mismatches formed during DNA replication, several distinctive repair processes have evolved to distinguish between the template strand and the newly formed strand so that only the newly inserted incorrect nucleotide is removed (in order to avoid generating a mutation). [12] The proteins employed in mismatch repair during DNA replication, and the clinical significance of defects in this process are described in the article DNA mismatch repair. The process of mispair correction during recombination is described in the article gene conversion.
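Detecting a mispair within an otherwise complementary duplex can be sketched as a position-by-position comparison. The code below is only an illustration of the recognition step: it reports which positions are not Watson–Crick pairs, and it does not model how a cell tells the template strand from the newly synthesised one, which the text identifies as the essential part of repair. The function name and the short test duplex are hypothetical.

```python
WATSON_CRICK_DNA = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def find_mismatches(top, bottom):
    """List (index, top_base, bottom_base) for every non-Watson-Crick position.

    `top` is written 5'->3' and `bottom` 3'->5', so that position i of one
    strand faces position i of the other, as in the Examples section.
    """
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    return [
        (i, t, b)
        for i, (t, b) in enumerate(zip(top.upper(), bottom.upper()))
        if (t, b) not in WATSON_CRICK_DNA
    ]


# Fully complementary duplex from the Examples section: no mismatches expected.
print(find_mismatches("ATCGATTGAGCTCTAGCG", "TAGCTAACTCGAGATCGC"))
# Hypothetical duplex with a G-T mispair at the last position.
print(find_mismatches("ATCG", "TAGT"))
```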

Length measurements

Further information: Karyotype

Schematic karyogram of a human. The blue scale to the left of each nuclear chromosome pair (as well as the mitochondrial genome at bottom left) shows its length in terms of mega-base-pairs.

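The following abbreviations are commonly used to describe the length of a DNA/RNA molecule: bp (base pair), kb or kbp (kilo-base-pair, 1,000 bp), Mb or Mbp (mega-base-pair, 1,000,000 bp), and Gb or Gbp (giga-base-pair, 1,000,000,000 bp).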

For single-stranded DNA/RNA, units of nucleotides are used—abbreviated nt (or knt, Mnt, Gnt)—as they are not paired. To distinguish between units of computer storage and bases, kbp, Mbp, Gbp, etc. may be used for base pairs.
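Converting a raw count of base pairs into these units is simple arithmetic. The sketch below picks the largest unit that fits and uses the unambiguous kbp, Mbp and Gbp spellings; the function name `format_length` and the six-significant-digit formatting are choices made here for illustration.

```python
# Largest unit first, so the first match is the most compact representation.
UNITS = [(10**9, "Gbp"), (10**6, "Mbp"), (10**3, "kbp")]


def format_length(base_pairs):
    """Express a length in base pairs using bp, kbp, Mbp or Gbp."""
    for factor, unit in UNITS:
        if base_pairs >= factor:
            return f"{base_pairs / factor:g} {unit}"
    return f"{base_pairs:g} bp"


print(format_length(3_200_000_000))  # approximate haploid human genome size
print(format_length(750_000))
print(format_length(500))
```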

The centimorgan is also often used to imply distance along a chromosome, but the number of base pairs it corresponds to varies widely. In the human genome, the centimorgan is about 1 million base pairs. [14] [15]

Unnatural base pair (UBP)

An unnatural base pair (UBP) is a designed subunit (or nucleobase) of DNA which is created in a laboratory and does not occur in nature. DNA sequences have been described which use newly created nucleobases to form a third base pair, in addition to the two base pairs found in nature, A–T (adenine–thymine) and G–C (guanine–cytosine). A few research groups have been searching for a third base pair for DNA, including teams led by Steven A. Benner, Philippe Marliere, Floyd E. Romesberg and Ichiro Hirao. [16] Some new base pairs based on alternative hydrogen bonding, hydrophobic interactions and metal coordination have been reported. [17] [18] [19] [20]

In 1989, Steven Benner (then working at the Swiss Federal Institute of Technology in Zurich) and his team incorporated modified forms of cytosine and guanine into DNA molecules in vitro. [21] The nucleotides, which encoded RNA and proteins, were successfully replicated in vitro. Since then, Benner's team has been trying to engineer cells that can make foreign bases from scratch, obviating the need for a feedstock. [22]

In 2002, Ichiro Hirao's group in Japan developed an unnatural base pair between 2-amino-8-(2-thienyl)purine (s) and pyridine-2-one (y) that functions in transcription and translation, for the site-specific incorporation of non-standard amino acids into proteins. [23] In 2006, they created 7-(2-thienyl)imidazo[4,5-b]pyridine (Ds) and pyrrole-2-carbaldehyde (Pa) as a third base pair for replication and transcription. [24] Afterward, Ds and 4-[3-(6-aminohexanamido)-1-propynyl]-2-nitropyrrole (Px) were found to form a high-fidelity pair in PCR amplification. [25] [26] In 2013, they applied the Ds–Px pair to DNA aptamer generation by in vitro selection (SELEX) and demonstrated that genetic alphabet expansion significantly augments DNA aptamer affinities to target proteins. [27]

In 2012, a group of American scientists led by Floyd Romesberg, a chemical biologist at the Scripps Research Institute in San Diego, California, reported that they had designed an unnatural base pair (UBP). [19] The two new artificial nucleotides were named d5SICS and dNaM. More technically, these artificial nucleotides bear hydrophobic nucleobases featuring two fused aromatic rings that form a d5SICS–dNaM complex, or base pair, in DNA. [22] [28] The team designed a variety of in vitro ("test tube") templates containing the unnatural base pair and confirmed that it was efficiently replicated with high fidelity in virtually all sequence contexts using standard in vitro techniques, namely PCR amplification of DNA and PCR-based applications. [19] Their results show that for PCR and PCR-based applications, the d5SICS–dNaM unnatural base pair is functionally equivalent to a natural base pair, and that when combined with the two natural base pairs used by all organisms, A–T and G–C, it provides a fully functional and expanded six-letter "genetic alphabet". [28]

In 2014, the same team from the Scripps Research Institute reported that they had synthesized a stretch of circular DNA, known as a plasmid, containing natural T–A and C–G base pairs along with the best-performing UBP that Romesberg's laboratory had designed. They inserted the plasmid into cells of the common bacterium E. coli, which successfully replicated the unnatural base pairs through multiple generations. [16] The transfection did not hamper the growth of the E. coli cells, and the plasmid showed no sign of losing its unnatural base pairs to the cells' natural DNA repair mechanisms. This is the first known example of a living organism passing along an expanded genetic code to subsequent generations. [28] [29] Romesberg said he and his colleagues created 300 variants to refine the design of nucleotides that would be stable enough and would be replicated as easily as the natural ones when the cells divide. This was in part achieved by the addition of a supportive algal gene that expresses a nucleotide triphosphate transporter, which efficiently imports the triphosphates of both unnatural nucleotides (d5SICSTP and dNaMTP) into E. coli bacteria. [28] The natural bacterial replication pathways then use them to accurately replicate a plasmid containing d5SICS–dNaM. Other researchers were surprised that the bacteria replicated these human-made DNA subunits. [30]

The successful incorporation of a third base pair is a significant breakthrough toward the goal of greatly expanding the number of amino acids which can be encoded by DNA, from the existing 20 amino acids to a theoretically possible 172, thereby expanding the potential for living organisms to produce novel proteins. [16] The artificial strings of DNA do not encode for anything yet, but scientists speculate they could be designed to manufacture new proteins which could have industrial or pharmaceutical uses. [31] Experts said the synthetic DNA incorporating the unnatural base pair raises the possibility of life forms based on a different DNA code. [30] [31]
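The article does not explain where the figure of 172 comes from. One plausible reading, offered here as an assumption rather than as the cited source's derivation, is that every codon made available by the two extra letters is counted as a potential new amino acid on top of the existing 20. The arithmetic under that assumption, with three-letter codons, is sketched below.

```python
CODON_LENGTH = 3          # bases per codon in the standard genetic code
NATURAL_LETTERS = 4       # A, T, G, C
EXPANDED_LETTERS = 6      # four natural letters plus one unnatural base pair
STANDARD_AMINO_ACIDS = 20

natural_codons = NATURAL_LETTERS ** CODON_LENGTH     # codons with 4 letters
expanded_codons = EXPANDED_LETTERS ** CODON_LENGTH   # codons with 6 letters
new_codons = expanded_codons - natural_codons        # codons using a new letter

# Assumption: each new codon could, in principle, encode one new amino acid.
upper_bound = STANDARD_AMINO_ACIDS + new_codons

print(natural_codons, expanded_codons, new_codons, upper_bound)
```

Under this reading, the 64 natural codons keep their current assignments and only the codons containing at least one unnatural letter are counted as new.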

Non-canonical base pairing

Wobble base pairs
Comparison of Hoogsteen to Watson–Crick base pairs. [32]

In addition to the canonical pairing, some conditions can also favour base pairing with an alternative base orientation and a different number and geometry of hydrogen bonds. These pairings are accompanied by alterations to the local backbone shape. [citation needed]

The most common of these is wobble base pairing, which occurs between tRNAs and mRNAs at the third base position of many codons during translation [33] and during the charging of tRNAs by some tRNA synthetases. [34] Wobble pairs have also been observed in the secondary structures of some RNA sequences. [35]

Additionally, Hoogsteen base pairing (typically written as A•U/T and G•C) can exist in some DNA sequences (e.g. CA and TA dinucleotides) in dynamic equilibrium with standard Watson–Crick pairing. [32] Hoogsteen pairs have also been observed in some protein–DNA complexes. [36]

In addition to these alternative base pairings, a wide range of base-base hydrogen bonding is observed in RNA secondary and tertiary structure. [37] These bonds are often necessary for the precise, complex shape of an RNA, as well as its binding to interaction partners. [37]

References

  1. Spencer M (10 January 1959). "The stereochemistry of deoxyribonucleic acid. II. Hydrogen-bonded pairs of bases". Acta Crystallographica. 12 (1): 66–71. doi:10.1107/S0365110X59000160. ISSN 0365-110X.
  2. Zhurkin VB, Tolstorukov MY, Xu F, Colasanti AV, Olson WK (2005). "Sequence-Dependent Variability of B-DNA". DNA Conformation and Transcription. pp. 18–34. doi:10.1007/0-387-29148-2_2. ISBN 978-0-387-25579-8.
  3. Moran LA (2011-03-24). "The total size of the human genome is very likely to be ~3,200 Mb". Sandwalk.blogspot.com. Retrieved 2012-07-16.
  4. "The finished length of the human genome is 2.86 Gb". Strategicgenomics.com. 2006-06-12. Retrieved 2012-07-16.
  5. International Human Genome Sequencing Consortium (October 2004). "Finishing the euchromatic sequence of the human genome". Nature. 431 (7011): 931–945. Bibcode:2004Natur.431..931H. doi:10.1038/nature03001. PMID 15496913.
  6. Cockburn AF, Newkirk MJ, Firtel RA (December 1976). "Organization of the ribosomal RNA genes of Dictyostelium discoideum: mapping of the nontranscribed spacer regions". Cell. 9 (4 Pt 1): 605–613. doi:10.1016/0092-8674(76)90043-X. PMID 1034500. S2CID 31624366.
  7. Nuwer R (18 July 2015). "Counting All the DNA on Earth". The New York Times. New York. ISSN 0362-4331. Archived from the original on 2022-01-01. Retrieved 2015-07-18.
  8. "The Biosphere: Diversity of Life". Aspen Global Change Institute. Basalt, CO. Archived from the original on 2014-11-10. Retrieved 2015-07-19.
  9. Yakovchuk P, Protozanova E, Frank-Kamenetskii MD (2006-01-30). "Base-stacking and base-pairing contributions into thermal stability of the DNA double helix". Nucleic Acids Research. 34 (2): 564–574. doi:10.1093/nar/gkj454. PMC 1360284. PMID 16449200.
  10. Trautner TA, Swartz MN, Kornberg A (March 1962). "Enzymatic synthesis of deoxyribonucleic acid. X. Influence of bromouracil substitutions on replication". Proceedings of the National Academy of Sciences of the United States of America. 48 (3): 449–455. doi:10.1073/pnas.48.3.449. PMC 220799. PMID 13922323.
  11. Krebs JE, Goldstein ES, Kilpatrick ST, Lewin B (2018). "Genes are DNA and Encode RNAs and Polypeptides". Lewin's genes XII (12th ed.). Burlington, Mass: Jones & Bartlett Learning. p. 12. ISBN 978-1-284-10449-3. Each mutagenic event in the presence of an acridine results in the addition or removal of a single base pair.
  12. Putnam CD (September 2021). "Strand discrimination in DNA mismatch repair". DNA Repair. 105: 103161. doi:10.1016/j.dnarep.2021.103161. PMC 8785607. PMID 34171627.
  13. Alberts B, Johnson A, Lewis J, Morgan D, Raff M, Roberts K, Walter P (December 2014). Molecular Biology of the Cell (6th ed.). New York/Abingdon: Garland Science, Taylor & Francis Group. p. 177. ISBN 978-0-8153-4432-2.
  14. "NIH ORDR – Glossary – C". Rarediseases.info.nih.gov. Archived from the original on 2012-07-17. Retrieved 2012-07-16.
  15. Scott MP, Matsudaira P, Lodish H, Darnell J, Zipursky L, Kaiser CA, Berk A, Krieger M (2004). Molecular Cell Biology (Fifth ed.). San Francisco: W. H. Freeman. p. 396. ISBN 978-0-7167-4366-8. ...in humans 1 centimorgan on average represents a distance of about 7.5 × 10^5 base pairs.
  16. Fikes BJ (May 8, 2014). "Life engineered with expanded genetic code". San Diego Union Tribune. Archived from the original on 9 May 2014. Retrieved 8 May 2014.
  17. Yang Z, Chen F, Alvarado JB, Benner SA (September 2011). "Amplification, mutation, and sequencing of a six-letter synthetic genetic system". Journal of the American Chemical Society. 133 (38): 15105–15112. doi:10.1021/ja204910n. PMC 3427765. PMID 21842904.
  18. Yamashige R, Kimoto M, Takezawa Y, Sato A, Mitsui T, Yokoyama S, Hirao I (March 2012). "Highly specific unnatural base pair systems as a third base pair for PCR amplification". Nucleic Acids Research. 40 (6): 2793–2806. doi:10.1093/nar/gkr1068. PMC 3315302. PMID 22121213.
  19. Malyshev DA, Dhami K, Quach HT, Lavergne T, Ordoukhanian P, Torkamani A, Romesberg FE (July 2012). "Efficient and sequence-independent replication of DNA containing a third base pair establishes a functional six-letter genetic alphabet". Proceedings of the National Academy of Sciences of the United States of America. 109 (30): 12005–12010. Bibcode:2012PNAS..10912005M. doi:10.1073/pnas.1205176109. PMC 3409741. PMID 22773812.
  20. Takezawa Y, Müller J, Shionoya M (2017-05-05). "Artificial DNA Base Pairing Mediated by Diverse Metal Ions". Chemistry Letters. 46 (5): 622–633. doi:10.1246/cl.160985. ISSN 0366-7022.
  21. Switzer C, Moroney SE, Benner SA (1989). "Enzymatic incorporation of a new base pair into DNA and RNA". J. Am. Chem. Soc. 111 (21): 8322–8323. doi:10.1021/ja00203a067.
  22. Callaway E (May 7, 2014). "Scientists Create First Living Organism With 'Artificial' DNA". Nature News. Huffington Post. Retrieved 8 May 2014.
  23. Hirao I, Ohtsuki T, Fujiwara T, Mitsui T, Yokogawa T, Okuni T, et al. (February 2002). "An unnatural base pair for incorporating amino acid analogs into proteins". Nature Biotechnology. 20 (2): 177–182. doi:10.1038/nbt0202-177. PMID 11821864. S2CID 22055476.
  24. Hirao I, Kimoto M, Mitsui T, Fujiwara T, Kawai R, Sato A, et al. (September 2006). "An unnatural hydrophobic base pair system: site-specific incorporation of nucleotide analogs into DNA and RNA". Nature Methods. 3 (9): 729–735. doi:10.1038/nmeth915. PMID 16929319. S2CID 6494156.
  25. Kimoto M, Kawai R, Mitsui T, Yokoyama S, Hirao I (February 2009). "An unnatural base pair system for efficient PCR amplification and functionalization of DNA molecules". Nucleic Acids Research. 37 (2): e14. doi:10.1093/nar/gkn956. PMC 2632903. PMID 19073696.
  26. Yamashige R, Kimoto M, Takezawa Y, Sato A, Mitsui T, Yokoyama S, Hirao I (March 2012). "Highly specific unnatural base pair systems as a third base pair for PCR amplification". Nucleic Acids Research. 40 (6): 2793–2806. doi:10.1093/nar/gkr1068. PMC 3315302. PMID 22121213.
  27. Kimoto M, Yamashige R, Matsunaga K, Yokoyama S, Hirao I (May 2013). "Generation of high-affinity DNA aptamers using an expanded genetic alphabet". Nature Biotechnology. 31 (5): 453–457. doi:10.1038/nbt.2556. PMID 23563318. S2CID 23329867.
  28. Malyshev DA, Dhami K, Lavergne T, Chen T, Dai N, Foster JM, et al. (May 2014). "A semi-synthetic organism with an expanded genetic alphabet". Nature. 509 (7500): 385–388. Bibcode:2014Natur.509..385M. doi:10.1038/nature13314. PMC 4058825. PMID 24805238.
  29. Sample I (May 7, 2014). "First life forms to pass on artificial DNA engineered by US scientists". The Guardian. Retrieved 8 May 2014.
  30. "Scientists create first living organism containing artificial DNA". The Wall Street Journal. Fox News. May 8, 2014. Retrieved 8 May 2014.
  31. Pollack A (May 7, 2014). "Scientists Add Letters to DNA's Alphabet, Raising Hope and Fear". New York Times. Retrieved 8 May 2014.
  32. Nikolova EN, Kim E, Wise AA, O'Brien PJ, Andricioaei I, Al-Hashimi HM (February 2011). "Transient Hoogsteen base pairs in canonical duplex DNA". Nature. 470 (7335): 498–502. Bibcode:2011Natur.470..498N. doi:10.1038/nature09775. PMC 3074620. PMID 21270796.
  33. Murphy FV, Ramakrishnan V (December 2004). "Structure of a purine-purine wobble base pair in the decoding center of the ribosome". Nature Structural & Molecular Biology. 11 (12): 1251–1252. doi:10.1038/nsmb866. PMID 15558050. S2CID 27022506.
  34. Vargas-Rodriguez O, Musier-Forsyth K (June 2014). "Structural biology: wobble puts RNA on target". Nature. 510 (7506): 480–481. doi:10.1038/nature13502. PMID 24919145. S2CID 205239383.
  35. Garg A, Heinemann U (February 2018). "A novel form of RNA double helix based on G·U and C·A+ wobble base pairing". RNA. 24 (2): 209–218. doi:10.1261/rna.064048.117. PMC 5769748. PMID 29122970.
  36. Aishima J, Gitti RK, Noah JE, Gan HH, Schlick T, Wolberger C (December 2002). "A Hoogsteen base pair embedded in undistorted B-DNA". Nucleic Acids Research. 30 (23): 5244–5252. doi:10.1093/nar/gkf661. PMC 137974. PMID 12466549.
  37. Leontis NB, Westhof E (June 2003). "Analysis of RNA motifs". Current Opinion in Structural Biology. 13 (3): 300–308. doi:10.1016/S0959-440X(03)00076-9. PMID 12831880.
