DNA computing is an emerging branch of unconventional computing that uses DNA, biochemistry, and molecular biology hardware instead of traditional electronic computing. Research and development in this area concerns the theory, experiments, and applications of DNA computing. Although the field started with Len Adleman's demonstration of a computing application in 1994, it has since expanded into several other avenues, such as the development of storage technologies, [1] [2] [3] nanoscale imaging modalities, [4] [5] [6] and synthetic controllers and reaction networks. [7] [8] [9] [10]
Leonard Adleman of the University of Southern California initially developed this field in 1994. [11] Adleman demonstrated a proof-of-concept use of DNA as a form of computation which solved the seven-point Hamiltonian path problem. Since the initial Adleman experiments, advances have occurred and various Turing machines have been proven to be constructible. [12] [13]
Since then the field has expanded into several avenues. In 1995, Eric Baum proposed the idea of DNA-based memory, [14] conjecturing that a vast amount of data could be stored in a tiny amount of DNA because of its ultra-high density. This expanded the horizon of DNA computing into the realm of memory technology, although in vitro demonstrations followed only almost a decade later.
The field of DNA computing can be categorized as a sub-field of the broader DNA nanoscience field started by Ned Seeman about a decade before Len Adleman's demonstration. [15] Seeman's original idea in the 1980s was to build arbitrary structures using bottom-up DNA self-assembly for applications in crystallography. However, it morphed into the field of structural DNA self-assembly, [16] [17] [18] which as of 2020 is extremely sophisticated. Self-assembled structures ranging from a few nanometers tall up to several tens of micrometers in size were demonstrated in 2018.
In 1994, Prof. Seeman's group demonstrated early DNA lattice structures using a small set of DNA components. While the demonstration by Adleman showed the possibility of DNA-based computers, the design was impractical for larger problems because, as the number of nodes in a graph grows, the number of DNA components required in Adleman's implementation grows exponentially. Therefore, computer scientists and biochemists started exploring tile assembly, where the goal was to use a small set of DNA strands as tiles to perform arbitrary computations upon growth. Other avenues that were theoretically explored in the late 1990s include DNA-based security and cryptography, [19] the computational capacity of DNA systems, [20] DNA memories and disks, [21] and DNA-based robotics. [22]
Before 2002, Lila Kari showed that the DNA operations performed by genetic recombination in some organisms are Turing complete. [23]
In 2003, John Reif's group first demonstrated the idea of a DNA-based walker that traversed along a track similar to a line follower robot. They used molecular biology as a source of energy for the walker. Since this first demonstration, a wide variety of DNA-based walkers have been demonstrated.
In 1994 Leonard Adleman presented the first prototype of a DNA computer. The TT-100 was a test tube filled with 100 microliters of a DNA solution. With it, he solved an instance of the directed Hamiltonian path problem. [24] In Adleman's experiment, the Hamiltonian path problem was presented notationally as the "travelling salesman problem". For this purpose, different DNA fragments were created, each representing a city that had to be visited. Each fragment was capable of linking with the other fragments created. These DNA fragments were produced and mixed in a test tube. Within seconds, the small fragments formed bigger ones, representing the different travel routes. Through a chemical reaction, the DNA fragments representing the longer routes were eliminated. What remained was the solution to the problem, but overall the experiment lasted a week. [25] However, current technical limitations prevent the evaluation of the results. The experiment is therefore not suitable for practical application, but it is nevertheless a proof of concept.
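The generate-and-filter procedure described above can be sketched in software. The following is only an illustrative simulation, not Adleman's protocol: the five-node graph, the function name `adleman_search`, and the number of simulated strands are all assumptions made for the example. Random walks along edges stand in for fragments linking up in the test tube, and the final checks stand in for the chemical filtering steps.

```python
import random

# Hypothetical 5-node directed graph (not Adleman's original 7-node instance).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2), (1, 3), (2, 4), (3, 1)]


def adleman_search(n_nodes, edges, start, end, n_strands=20000, seed=0):
    """Simulate generate-and-filter search for Hamiltonian paths."""
    rng = random.Random(seed)
    successors = {v: [w for (u, w) in edges if u == v] for v in range(n_nodes)}
    found = set()
    for _ in range(n_strands):
        # "Ligation": grow a random walk along edges, one node at a time.
        path = [rng.randrange(n_nodes)]
        while len(path) < n_nodes and successors[path[-1]]:
            path.append(rng.choice(successors[path[-1]]))
        # "Filtering": keep only walks with the right endpoints and length
        # that visit every node exactly once.
        if (path[0] == start and path[-1] == end
                and len(path) == n_nodes and len(set(path)) == n_nodes):
            found.add(tuple(path))
    return found


if __name__ == "__main__":
    print(adleman_search(5, EDGES, start=0, end=4))
```

In the sketch, as in the experiment, generating candidates is cheap and massively redundant, while the work lies in discarding everything that is not a valid path.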
The first results for these problems were obtained by Leonard Adleman.
In 2002, J. Macdonald, D. Stefanović and M. Stojanović created a DNA computer able to play tic-tac-toe against a human player. [26] The calculator consists of nine bins corresponding to the nine squares of the game. Each bin contains a substrate and various combinations of DNA enzymes. The substrate itself is composed of a DNA strand with a fluorescent chemical group grafted onto one end and a repressor group onto the other. Fluorescence is only active if the molecules of the substrate are cut in half. The DNA enzymes simulate logical functions. For example, such a DNA enzyme will unfold only if two specific types of DNA strand are introduced, reproducing the logic function AND.
By default, the computer is considered to have played first in the central square. The human player starts with eight different types of DNA strands corresponding to the eight remaining boxes that may be played. To play box number i, the human player pours into all bins the strands corresponding to input #i. These strands bind to certain DNA enzymes present in the bins, resulting, in one of these bins, in the deformation of the DNA enzymes, which then bind to the substrate and cut it. The corresponding bin becomes fluorescent, indicating which box is being played by the DNA computer. The DNA enzymes are divided among the bins in such a way as to ensure that the best the human player can achieve is a draw, as in real tic-tac-toe.
Kevin Cherry and Lulu Qian at Caltech developed a DNA-based artificial neural network that can recognize 100-bit hand-written digits. They achieved this by computing an appropriate set of weights on a computer in advance; the weights are represented by varying concentrations of weight molecules, which are later added to the test tube that holds the input DNA strands. [27] [28]
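The role of precomputed weights can be illustrated with a minimal software sketch. It assumes a winner-take-all classifier, in which each candidate digit has one weight per input bit and the class with the largest weighted sum wins; the function name `winner_take_all` and the tiny 4-bit patterns are hypothetical and chosen only to keep the example readable. In the molecular system the weights would correspond to concentrations rather than Python numbers.

```python
def winner_take_all(pattern, weights):
    """Return the label whose weight vector best matches the input bits.

    pattern: sequence of 0/1 input bits.
    weights: dict mapping a label to a sequence of non-negative weights,
             one per input bit (stand-ins for weight-molecule concentrations).
    """
    scores = {
        label: sum(w * x for w, x in zip(ws, pattern))
        for label, ws in weights.items()
    }
    # The class with the largest weighted sum "wins".
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical 4-bit "memories" for two classes.
    weights = {"A": [1, 1, 0, 0], "B": [0, 0, 1, 1]}
    print(winner_take_all([1, 1, 0, 1], weights))
```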
One of the challenges of DNA computing is its speed. While DNA as a substrate is biologically compatible, i.e. it can be used in places where silicon technology cannot, its computation speed is still very slow. For example, the square-root circuit used as a benchmark in the field took over 100 hours to complete. [29] Newer approaches with external enzyme sources report faster and more compact circuits, [30] and Chatterjee et al. demonstrated that computation can be sped up through localized DNA circuits, [31] a concept being further explored by other groups. [32] This idea, originally proposed in the field of computer architecture, has been adopted in this field as well. In computer architecture, it is well known that if instructions are executed in sequence, having them loaded in the cache leads to faster performance, an effect known as the principle of locality. With instructions in fast cache memory, there is no need to swap them in and out of main memory, which can be slow. Similarly, in localized DNA computing, the DNA strands responsible for computation are fixed on a breadboard-like substrate, ensuring physical proximity of the computing gates. Such localized DNA computing techniques have been shown to potentially reduce the computation time by orders of magnitude.
Subsequent research on DNA computing has produced reversible DNA computing, bringing the technology one step closer to the silicon-based computing used in (for example) PCs. In particular, John Reif and his group at Duke University have proposed two different techniques to reuse the computing DNA complexes. The first design uses dsDNA gates, [33] while the second design uses DNA hairpin complexes. [34] While both designs face some issues (such as reaction leaks), this appears to represent a significant breakthrough in the field of DNA computing. Some other groups have also attempted to address the gate reusability problem. [35] [36]
The paper "Synthesis Strategy of Reversible Circuits on DNA Computers" [37] proposes implementing reversible gates and circuits on DNA computers using strand displacement reactions (SDRs), combining DNA computing and reversible computing techniques. The paper also proposes a universal reversible gate library (URGL) for synthesizing n-bit reversible circuits on DNA computers, with an average length and cost of the constructed circuits better than those of previous methods.
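For readers unfamiliar with reversible logic, the following sketch shows what makes a gate reversible. It uses the Toffoli (controlled-controlled-NOT) gate as a generic textbook example; it is not taken from the cited paper, and nothing here is meant to represent the contents of the URGL. A gate is reversible when its truth table is a bijection, so the inputs can always be recovered from the outputs.

```python
from itertools import product


def toffoli(a, b, c):
    """Flip the target bit c only when both control bits a and b are 1."""
    return a, b, c ^ (a & b)


# All eight 3-bit input states and their images under the gate.
STATES = list(product((0, 1), repeat=3))
IMAGES = [toffoli(*s) for s in STATES]

if __name__ == "__main__":
    # Reversible: every output state occurs exactly once.
    print(sorted(IMAGES) == STATES)
    # Self-inverse: applying the gate twice restores the input.
    print(all(toffoli(*toffoli(*s)) == s for s in STATES))
```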
There are multiple methods for building a computing device based on DNA, each with its own advantages and disadvantages. Most of these build the basic logic gates (AND, OR, NOT) associated with digital logic from a DNA basis. Some of the different bases include DNAzymes, deoxyoligonucleotides, enzymes, and toehold exchange.
The most fundamental operation in DNA computing and molecular programming is the strand displacement mechanism. Currently, there are two ways to perform strand displacement: toehold-mediated strand displacement (TMSD) and polymerase-based strand displacement (PSD).
Besides simple strand displacement schemes, DNA computers have also been constructed using the concept of toehold exchange. [28] In this system, an input DNA strand binds to a sticky end, or toehold, on another DNA molecule, which allows it to displace another strand segment from the molecule. This allows the creation of modular logic components such as AND, OR, and NOT gates and signal amplifiers, which can be linked into arbitrarily large computers. This class of DNA computers does not require enzymes or any chemical capability of the DNA. [38]
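The way toehold-based components chain together can be sketched at the level of named domains. This is a deliberately abstract model under stated assumptions: the `Gate` class, the `run_cascade` function, and the domain names are hypothetical; complementarity is reduced to name equality; and concentrations, kinetics, and consumption of the input strand are ignored. Each gate exposes a toehold, and an input strand carrying the matching toehold and branch-migration domain releases the gate's output strand, which can in turn act as the input to the next gate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    toehold: str    # exposed single-stranded domain on the gate complex
    branch: str     # domain currently covered by the output strand
    output: tuple   # (toehold, branch) strand released on displacement


def run_cascade(gates, inputs):
    """Apply displacement reactions until no further strand is released."""
    pool = set(inputs)          # free strands, each a (toehold, branch) pair
    remaining = list(gates)
    changed = True
    while changed:
        changed = False
        for gate in list(remaining):
            # An input must match both the toehold and the branch domain.
            if (gate.toehold, gate.branch) in pool:
                pool.add(gate.output)
                remaining.remove(gate)   # each gate complex fires once
                changed = True
    return pool


if __name__ == "__main__":
    gates = [Gate("t1", "a", ("t2", "b")), Gate("t2", "b", ("t3", "c"))]
    print(run_cascade(gates, [("t1", "a")]))
```

A strand with the right branch domain but the wrong toehold releases nothing in this model, which mirrors the role of the toehold as the addressing mechanism.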
The full stack for DNA computing looks very similar to a traditional computer architecture. At the highest level, a C-like general purpose programming language is expressed using a set of chemical reaction networks (CRNs). This intermediate representation gets translated to domain-level DNA design and then implemented using a set of DNA strands. In 2010, Erik Winfree's group showed that DNA can be used as a substrate to implement arbitrary chemical reactions. This opened the way to design and synthesis of biochemical controllers since the expressive power of CRNs is equivalent to a Turing machine. [7] [8] [9] [10] Such controllers can potentially be used in vivo for applications such as preventing hormonal imbalance.
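To make the intermediate representation concrete, here is a minimal sketch of how a chemical reaction network can be written down and simulated. The function name `simulate_crn`, the single reaction A + B → C, the rate constant, and the forward-Euler integration of mass-action kinetics are all illustrative assumptions; they are not the tooling used by any of the cited groups.

```python
def simulate_crn(reactions, conc, dt=0.01, steps=5000):
    """Integrate mass-action kinetics with forward Euler steps.

    reactions: list of (reactants, products, rate_constant) tuples,
               where reactants and products are lists of species names.
    conc:      dict mapping every species name to its initial concentration.
    """
    conc = dict(conc)
    for _ in range(steps):
        delta = {s: 0.0 for s in conc}
        for reactants, products, k in reactions:
            # Mass action: rate = k times the product of reactant concentrations.
            rate = k
            for s in reactants:
                rate *= conc[s]
            for s in reactants:
                delta[s] -= rate * dt
            for s in products:
                delta[s] += rate * dt
        for s in conc:
            conc[s] += delta[s]
    return conc


if __name__ == "__main__":
    # Hypothetical network: A + B -> C with rate constant 1.0 (arbitrary units).
    final = simulate_crn([(["A", "B"], ["C"], 1.0)],
                         {"A": 1.0, "B": 0.5, "C": 0.0})
    print(final)
```

In this example B is the limiting species, so the concentration of C approaches the initial concentration of B while the excess A remains.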
Catalytic DNA (deoxyribozyme or DNAzyme) catalyzes a reaction when interacting with the appropriate input, such as a matching oligonucleotide. These DNAzymes are used to build logic gates analogous to digital logic in silicon; however, DNAzymes are limited to 1-, 2-, and 3-input gates, with no current implementation for evaluating statements in series.
The DNAzyme logic gate changes its structure when it binds to a matching oligonucleotide and the fluorogenic substrate it is bonded to is cleaved free. While other materials can be used, most models use a fluorescence-based substrate because it is very easy to detect, even at the single molecule limit. [39] The amount of fluorescence can then be measured to tell whether or not a reaction took place. The DNAzyme that changes is then "used", and cannot initiate any more reactions. Because of this, these reactions take place in a device such as a continuous stirred-tank reactor, where old product is removed and new molecules added.
Two commonly used DNAzymes are named E6 and 8-17. These are popular because they allow cleaving of a substrate in any arbitrary location. [40] Stojanovic and MacDonald have used the E6 DNAzymes to build the MAYA I [41] and MAYA II [42] machines, respectively; Stojanovic has also demonstrated logic gates using the 8-17 DNAzyme. [43] While these DNAzymes have been demonstrated to be useful for constructing logic gates, they are limited by the need for a metal cofactor to function, such as Zn2+ or Mn2+, and thus are not useful in vivo. [39] [44]
A design called a stem loop, consisting of a single strand of DNA which has a loop at one end, is a dynamic structure that opens and closes when a piece of DNA bonds to the loop part. This effect has been exploited to create several logic gates. These logic gates have been used to create the computers MAYA I and MAYA II, which can play tic-tac-toe to some extent. [45]
Enzyme-based DNA computers usually take the form of a simple Turing machine; there is analogous hardware, in the form of an enzyme, and software, in the form of DNA. [46]
Benenson, Shapiro and colleagues have demonstrated a DNA computer using the FokI enzyme [47] and expanded on their work by going on to show automata that diagnose and react to prostate cancer, indicated by underexpression of the genes PPAP2B and GSTP1 and overexpression of PIM1 and HPN. [48] Their automata evaluated the expression of each gene, one gene at a time, and on positive diagnosis released a single-stranded DNA molecule (ssDNA) that is an antisense for MDM2. MDM2 is a repressor of protein 53 (p53), which itself is a tumor suppressor. [49] On negative diagnosis, the automaton was designed to release a suppressor of the positive-diagnosis drug instead of doing nothing. A limitation of this implementation is that two separate automata are required, one to administer each drug. The entire process of evaluation until drug release took around an hour to complete. This method also requires transition molecules as well as the FokI enzyme to be present. The requirement for the FokI enzyme limits application in vivo, at least for use in "cells of higher organisms". [50] It should also be pointed out that the 'software' molecules can be reused in this case.
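The diagnostic logic described above amounts to a small finite automaton that checks one marker per step. The sketch below is a simplified, deterministic model of that logic; the function name `diagnose`, the `"under"`/`"over"` encoding, and the returned strings are hypothetical, and the molecular automaton itself works with transition molecules and enzyme cleavage rather than dictionary lookups.

```python
# Marker genes and the expression change that counts as a positive sign,
# as listed in the text above.
MARKERS = [
    ("PPAP2B", "under"),
    ("GSTP1", "under"),
    ("PIM1", "over"),
    ("HPN", "over"),
]


def diagnose(expression):
    """Evaluate the markers one gene at a time and choose which molecule to release.

    expression: dict mapping a gene name to "under", "over", or "normal".
    """
    state = "yes"
    for gene, expected in MARKERS:
        # Once any marker fails, the automaton stays in the "no" state.
        if state == "yes" and expression.get(gene) != expected:
            state = "no"
    if state == "yes":
        return "release antisense ssDNA against MDM2"
    return "release suppressor of the drug"


if __name__ == "__main__":
    sample = {"PPAP2B": "under", "GSTP1": "under", "PIM1": "over", "HPN": "over"}
    print(diagnose(sample))
```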
DNA nanotechnology has been applied to the related field of DNA computing. DNA tiles can be designed to contain multiple sticky ends with sequences chosen so that they act as Wang tiles. A DX array has been demonstrated whose assembly encodes an XOR operation; this allows the DNA array to implement a cellular automaton which generates a fractal called the Sierpinski gasket. This shows that computation can be incorporated into the assembly of DNA arrays, increasing its scope beyond simple periodic arrays. [51]
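The connection between XOR and the Sierpinski gasket can be reproduced in a few lines. The sketch below is a software analogy of the assembly rule, not a model of the DX tiles themselves: each new cell is the XOR of two neighbouring cells in the previous row, and the function name `xor_rows` and the choice of a single seed bit are assumptions for illustration.

```python
def xor_rows(width, n_rows):
    """Grow rows where each cell is the XOR of two cells in the row before it."""
    row = [0] * width
    row[0] = 1                      # single seed "tile"
    rows = [row]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        # The left edge is copied; every other cell is the XOR of the cell
        # above-left and the cell directly above.
        new = [prev[0]] + [prev[i - 1] ^ prev[i] for i in range(1, width)]
        rows.append(new)
    return rows


if __name__ == "__main__":
    for r in xor_rows(16, 16):
        print("".join("#" if bit else "." for bit in r))
```

Printing 16 rows shows the nested triangular pattern of the gasket; the same bits are the entries of Pascal's triangle modulo 2.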
DNA computing is a form of parallel computing in that it takes advantage of the many different molecules of DNA to try many different possibilities at once. [52] For certain specialized problems, DNA computers are faster and smaller than any other computer built so far. Furthermore, particular mathematical computations have been demonstrated to work on a DNA computer.
DNA computing does not provide any new capabilities from the standpoint of computability theory, the study of which problems are computationally solvable using different models of computation. For example, if the space required for the solution of a problem grows exponentially with the size of the problem (EXPSPACE problems) on von Neumann machines, it still grows exponentially with the size of the problem on DNA machines. For very large EXPSPACE problems, the amount of DNA required is too large to be practical.
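A rough back-of-the-envelope calculation shows how quickly the required amount of DNA becomes impractical. The sketch assumes a crude brute-force encoding with one strand per ordering of the nodes, 20 nucleotides per node, and an average nucleotide mass of about 330 g/mol; these figures and the function name `brute_force_dna_grams` are illustrative assumptions, not values taken from the text above.

```python
from math import factorial

AVOGADRO = 6.022e23          # molecules per mole
NT_PER_NODE = 20             # assumed nucleotides used to encode one node
GRAMS_PER_MOL_NT = 330.0     # assumed average molar mass of one nucleotide
EARTH_MASS_GRAMS = 5.97e27   # for comparison


def brute_force_dna_grams(n_nodes):
    """Mass of DNA if every ordering of the nodes is represented by one strand."""
    strands = factorial(n_nodes)
    nt_per_strand = NT_PER_NODE * n_nodes
    return strands * nt_per_strand * GRAMS_PER_MOL_NT / AVOGADRO


if __name__ == "__main__":
    for n in (7, 20, 50):
        grams = brute_force_dna_grams(n)
        print(n, grams, grams > EARTH_MASS_GRAMS)
```

Under these assumptions a seven-node instance needs a negligible amount of material, while a fifty-node instance would need more DNA than the mass of the Earth, even with only a single copy of each strand.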
A partnership between IBM and Caltech was established in 2009 aiming at "DNA chips" production. [53] A Caltech group is working on the manufacturing of these nucleic-acid-based integrated circuits. One of these chips can compute whole square roots. [54] A compiler has been written [55] in Perl.
The slow processing speed of a DNA computer (the response time is measured in minutes, hours or days, rather than milliseconds) is compensated by its potential to perform a large number of computations in parallel. This allows the system to take a similar amount of time for a complex calculation as for a simple one, because millions or billions of molecules interact with each other simultaneously. However, it is much harder to analyze the answers given by a DNA computer than those given by a digital one.
Deoxyribonucleic acid is a polymer composed of two polynucleotide chains that coil around each other to form a double helix. The polymer carries genetic instructions for the development, functioning, growth and reproduction of all known organisms and many viruses. DNA and ribonucleic acid (RNA) are nucleic acids. Alongside proteins, lipids and complex carbohydrates (polysaccharides), nucleic acids are one of the four major types of macromolecules that are essential for all known forms of life.
A quantum computer is a computer that exploits quantum mechanical phenomena. On small scales, physical matter exhibits properties of both particles and waves, and quantum computing leverages this behavior using specialized hardware. Classical physics cannot explain the operation of these quantum devices, and a scalable quantum computer could perform some calculations exponentially faster than any modern "classical" computer. In particular, a large-scale quantum computer could break widely used encryption schemes and aid physicists in performing physical simulations; however, the current state of the art is largely experimental and impractical, with several obstacles to useful applications.
Leonard Adleman is an American computer scientist. He is one of the creators of the RSA encryption algorithm, for which he received the 2002 Turing Award. He is also known for the creation of the field of DNA computing.
Deoxyribozymes, also called DNA enzymes, DNAzymes, or catalytic DNA, are DNA oligonucleotides that are capable of performing a specific chemical reaction, often but not always catalytic. This is similar to the action of other biological enzymes, such as proteins or ribozymes. However, in contrast to the abundance of protein enzymes in biological systems and the discovery of biological ribozymes in the 1980s, there is little evidence for naturally occurring deoxyribozymes. Deoxyribozymes should not be confused with DNA aptamers, which are oligonucleotides that selectively bind a target ligand but do not catalyze a subsequent chemical reaction.
Optical computing or photonic computing uses light waves produced by lasers or incoherent sources for data processing, data storage or data communication for computing. For decades, photons have shown promise to enable a higher bandwidth than the electrons used in conventional computers.
Unconventional computing is computing by any of a wide range of new or unusual methods.
A molecular logic gate is a molecule that performs a logical operation based on one or more physical or chemical inputs and a single output. The field has advanced from simple logic systems based on a single chemical or physical input to molecules capable of combinatorial and sequential operations such as arithmetic operations. Molecular logic gates work with input signals based on chemical processes and with output signals based on spectroscopic phenomena.
A chemical computer, also called a reaction-diffusion computer, Belousov–Zhabotinsky (BZ) computer, or gooware computer, is an unconventional computer based on a semi-solid chemical "soup" where data are represented by varying concentrations of chemicals. The computations are performed by naturally occurring chemical reactions.
Type II topoisomerases are topoisomerases that cut both strands of the DNA helix simultaneously in order to manage DNA tangles and supercoils. They use the hydrolysis of ATP, unlike Type I topoisomerase. In this process, these enzymes change the linking number of circular DNA by ±2. Topoisomerases are ubiquitous enzymes, found in all living organisms.
Biological computers use biologically derived molecules — such as DNA and/or proteins — to perform digital or real computations.
The adaptor hypothesis is a theoretical scheme in molecular biology to explain how information encoded in the nucleic acid sequences of messenger RNA (mRNA) is used to specify the amino acids that make up proteins during the process of translation. It was formulated by Francis Crick in 1955 in an informal publication of the RNA Tie Club, and later elaborated in 1957 along with the central dogma of molecular biology and the sequence hypothesis. It was formally published as an article "On protein synthesis" in 1958. The name "adaptor hypothesis" was given by Sydney Brenner.
Nucleic acid design is the process of generating a set of nucleic acid base sequences that will associate into a desired conformation. Nucleic acid design is central to the fields of DNA nanotechnology and DNA computing. It is necessary because there are many possible sequences of nucleic acid strands that will fold into a given secondary structure, but many of these sequences will have undesired additional interactions which must be avoided. In addition, there are many tertiary structure considerations which affect the choice of a secondary structure for a given design.
Molecular models of DNA structures are representations of the molecular geometry and topology of deoxyribonucleic acid (DNA) molecules using one of several means, with the aim of simplifying and presenting the essential physical and chemical properties of DNA molecular structures either in vivo or in vitro. These representations include closely packed spheres made of plastic, metal wires for skeletal models, graphic computations and animations by computers, and artistic renderings. Computer molecular models also allow animations and molecular dynamics simulations that are very important for understanding how DNA functions in vivo.
DNA nanotechnology is the design and manufacture of artificial nucleic acid structures for technological uses. In this field, nucleic acids are used as non-biological engineering materials for nanotechnology rather than as the carriers of genetic information in living cells. Researchers in the field have created static structures such as two- and three-dimensional crystal lattices, nanotubes, polyhedra, and arbitrary shapes, and functional devices such as molecular machines and DNA computers. The field is beginning to be used as a tool to solve basic science problems in structural biology and biophysics, including applications in X-ray crystallography and nuclear magnetic resonance spectroscopy of proteins to determine structures. Potential applications in molecular scale electronics and nanomedicine are also being investigated.
Natural computing, also called natural computation, is a terminology introduced to encompass three classes of methods: 1) those that take inspiration from nature for the development of novel problem-solving techniques; 2) those that are based on the use of computers to synthesize natural phenomena; and 3) those that employ natural materials to compute. The main fields of research that compose these three branches are artificial neural networks, evolutionary algorithms, swarm intelligence, artificial immune systems, fractal geometry, artificial life, DNA computing, and quantum computing, among others.
Nucleic acid secondary structure is the basepairing interactions within a single nucleic acid polymer or between two polymers. It can be represented as a list of bases which are paired in a nucleic acid molecule. The secondary structures of biological DNAs and RNAs tend to be different: biological DNA mostly exists as fully base paired double helices, while biological RNA is single stranded and often forms complex and intricate base-pairing interactions due to its increased ability to form hydrogen bonds stemming from the extra hydroxyl group in the ribose sugar.
Linear optical quantum computing or linear optics quantum computation (LOQC), also photonic quantum computing (PQC), is a paradigm of quantum computation, allowing (under certain conditions, described below) universal quantum computation. LOQC uses photons as information carriers, mainly uses linear optical elements, or optical instruments (including reciprocal mirrors and waveplates) to process quantum information, and uses photon detectors and quantum memories to detect and store quantum information.
In quantum computing, quantum supremacy or quantum advantage is the goal of demonstrating that a programmable quantum computer can solve a problem that no classical computer can solve in any feasible amount of time, irrespective of the usefulness of the problem. The term was coined by John Preskill in 2012, but the concept dates to Yuri Manin's 1980 and Richard Feynman's 1981 proposals of quantum computing.
Toehold-mediated strand displacement (TMSD) is an enzyme-free molecular tool for exchanging one strand of DNA or RNA (the output) with another strand (the input). It is based on the hybridization of two complementary strands of DNA or RNA via Watson-Crick base pairing (A-T/U and C-G) and makes use of a process called branch migration. Although branch migration has been known to the scientific community since the 1970s, TMSD was not introduced to the field of DNA nanotechnology until 2000, when Yurke et al. first took advantage of it. They used the technique to open and close a set of DNA tweezers made of two DNA helices, using an auxiliary strand of DNA as fuel. Since its first use, the technique has been modified for the construction of autonomous molecular motors, catalytic amplifiers, reprogrammable DNA nanostructures and molecular logic gates. It has also been used in conjunction with RNA for the production of kinetically controlled ribosensors. TMSD starts with a double-stranded DNA complex composed of the original strand and the protector strand. The original strand has an overhanging region, the so-called “toehold”, which is complementary to a third strand of DNA referred to as the “invading strand”. The invading strand is a sequence of single-stranded DNA (ssDNA) which is complementary to the original strand. The toehold region initiates the process of TMSD by allowing the complementary invading strand to hybridize with the original strand, creating a DNA complex composed of three strands of DNA. This initial endothermic step is rate limiting and can be tuned by varying the strength of the toehold region (its length and sequence composition, e.g. G-C or A-T rich strands). The ability to tune the rate of strand displacement over a range of 6 orders of magnitude forms the backbone of this technique and allows kinetic control of DNA or RNA devices.
After the invading strand has bound to the original strand, branch migration of the invading domain allows the displacement of the initially hybridized strand (the protector strand). The protector strand can possess its own unique toehold and can therefore turn into an invading strand itself, starting a strand-displacement cascade. The whole process is energetically favored, and although a reverse reaction can occur, its rate is up to 6 orders of magnitude slower. Additional control over the system of toehold-mediated strand displacement can be introduced by toehold sequestering.
Lulu Qian is a Chinese-American biochemist who is a professor at the California Institute of Technology. Her research uses DNA-like molecules to build artificial machines.