Darwin (programming language)

Darwin
Paradigm	imperative, structured, object-oriented
Designed by	Gaston Gonnet
First appeared	1991
Typing discipline	Dynamic, Strong
Filename extensions	.drw
Influenced by	Maple

Last updated January 03, 2024

Darwin is a closed-source^[1] programming language developed by Gaston Gonnet and colleagues at ETH Zurich.^[2]^[3] It is used to develop the OMA orthology inference software,^[4] which was also initially developed by Gonnet.^[5] The language backend consists of two parts: the kernel, which performs simple mathematical calculations, transports and stores data, and interprets the user's commands; and the library, a set of programs that perform more complicated calculations.^[6] The language targets the biosciences, so the library includes routines for computing pairwise alignments, phylogenetic trees, and multiple sequence alignments, and for making secondary structure predictions.

Example Code

One would write the Hello World program as:

printf('Hello, world!\n');

The following procedure calculates the factorial of a number:^[6]

factorial := proc(n)
  if (n = 0) then
    return(1);
  else
    return(n * factorial(n-1));
  fi;
end:
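For readers unfamiliar with Darwin's Maple-like syntax, the same recursion can be sketched in Python. This is only an illustrative translation, not Darwin code; like the procedure above, it assumes a non-negative integer argument and does not guard against negative input.

```python
def factorial(n):
    # Base case, corresponding to the `if (n = 0)` branch of the Darwin procedure.
    if n == 0:
        return 1
    # Recursive case, corresponding to `return(n * factorial(n-1))`.
    return n * factorial(n - 1)
```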

Related Research Articles

Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, chemistry, physics, computer science, computer programming, information engineering, mathematics and statistics to analyze and interpret biological data. The subsequent process of analyzing and interpreting data is referred to as computational biology.

ML is a functional programming language. It is known for its use of the polymorphic Hindley–Milner type system, which automatically assigns the data types of most expressions without requiring explicit type annotations, and ensures type safety; there is a formal proof that a well-typed ML program does not cause runtime type errors. ML provides pattern matching for function arguments, garbage collection, imperative programming, call-by-value and currying. While a general-purpose programming language, ML is used heavily in programming language research and is one of the few languages to be completely specified and verified using formal semantics. Its types and pattern matching make it well-suited and commonly used to operate on other formal languages, such as in compiler writing, automated theorem proving, and formal verification.

In computer science, denotational semantics is an approach of formalizing the meanings of programming languages by constructing mathematical objects that describe the meanings of expressions from the languages. Other approaches providing formal semantics of programming languages include axiomatic semantics and operational semantics.

Maple is a symbolic and numeric computing environment as well as a multi-paradigm programming language. It covers several areas of technical computing, such as symbolic mathematics, numerical analysis, data processing, visualization, and others. A toolbox, MapleSim, adds functionality for multidomain physical modeling and code generation.

In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns. Sequence alignments are also used for non-biological sequences such as calculating the distance cost between strings in a natural language, or to display financial data.

Structural alignment attempts to establish homology between two or more polymer structures based on their shape and three-dimensional conformation. This process is usually applied to protein tertiary structures but can also be used for large RNA molecules. In contrast to simple structural superposition, where at least some equivalent residues of the two structures are known, structural alignment requires no a priori knowledge of equivalent positions. Structural alignment is a valuable tool for the comparison of proteins with low sequence similarity, where evolutionary relationships between proteins cannot be easily detected by standard sequence alignment techniques. Structural alignment can therefore be used to imply evolutionary relationships between proteins that share very little common sequence. However, caution should be used in using the results as evidence for shared evolutionary ancestry because of the possible confounding effects of convergent evolution by which multiple unrelated amino acid sequences converge on a common tertiary structure.

Smith–Waterman algorithm: an algorithm for determining similar regions between two molecular sequences

The Smith–Waterman algorithm performs local sequence alignment; that is, it determines similar regions between two strings of nucleic acid or protein sequences. Instead of looking at the entire sequence, the Smith–Waterman algorithm compares segments of all possible lengths and optimizes the similarity measure.
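A minimal Python sketch of the idea follows. It is not Darwin's library routine; the function name `smith_waterman`, the scoring values (match 2, mismatch -1) and the linear gap penalty (-1) are assumptions chosen for illustration. Each cell of the score matrix holds the best score of a local alignment ending at that pair of positions, with scores floored at zero so that an alignment can start anywhere.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment.

    The scoring parameters are illustrative assumptions, with a linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j]: best score of a local alignment ending at a[i-1] and b[j-1].
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            # Flooring at 0 lets an alignment restart instead of carrying a negative score.
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        current = score[i][j]
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if current == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif current == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `smith_waterman("TTACGTTT", "GGACGGG")` picks out the shared segment `ACG` and ignores the unrelated flanking residues, which is the difference from a global alignment of the full sequences.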

Internal Coordinate Mechanics (ICM) is a software program and algorithm to predict low-energy conformations of molecules by sampling the space of internal coordinates defining molecular geometry. In ICM each molecule is constructed as a tree from an entry atom where each next atom is built iteratively from the preceding three atoms via three internal variables. The rings are kept rigid or imposed via additional restraints. ICM is used for modelling peptides and interactions with substrates and coenzymes.

Multiple sequence alignment (MSA) may refer to the process or the result of sequence alignment of three or more biological sequences, generally protein, DNA, or RNA. In many cases, the input set of query sequences is assumed to have an evolutionary relationship by which they share a linkage and are descended from a common ancestor. From the resulting MSA, sequence homology can be inferred and phylogenetic analysis can be conducted to assess the sequences' shared evolutionary origins. Visual depictions of the alignment illustrate mutation events such as point mutations that appear as differing characters in a single alignment column, and insertion or deletion mutations that appear as hyphens in one or more of the sequences in the alignment. Multiple sequence alignment is often used to assess sequence conservation of protein domains, tertiary and secondary structures, and even individual amino acids or nucleotides.

Steven Albert Benner is an American chemist. He has been a professor at Harvard University, ETH Zurich, and most recently at the University of Florida, where he was the V.T. & Louise Jackson Distinguished Professor of Chemistry. In 2005, he founded The Westheimer Institute of Science and Technology (TWIST) and the Foundation For Applied Molecular Evolution. Benner has also founded the companies EraGen Biosciences and Firebird BioMolecular Sciences LLC.

Nucleic acid structure prediction is a computational method to determine secondary and tertiary nucleic acid structure from its sequence. Secondary structure can be predicted from one or several nucleic acid sequences. Tertiary structure can be predicted from the sequence, or by comparative modeling.

In molecular biology and genetics, DNA annotation or genome annotation is the process of describing the structure and function of the components of a genome, by analyzing and interpreting them in order to extract their biological significance and understand the biological processes in which they participate. Among other things, it identifies the locations of genes and all the coding regions in a genome and determines what those genes do.

Gaston H. Gonnet is a Uruguayan Canadian computer scientist and entrepreneur. He is best known for his contributions to the Maple computer algebra system and the creation of a digital version of the Oxford English Dictionary.

PhylomeDB is a public biological database for complete catalogs of gene phylogenies (phylomes). It allows users to interactively explore the evolutionary history of genes through the visualization of phylogenetic trees and multiple sequence alignments. Moreover, phylomeDB provides genome-wide orthology and paralogy predictions which are based on the analysis of the phylogenetic trees. The automated pipeline used to reconstruct trees aims at providing a high-quality phylogenetic analysis of different genomes, including Maximum Likelihood tree inference, alignment trimming and evolutionary model testing.

Red is a programming language designed to overcome the limitations of the programming language Rebol. Red was introduced in 2011 by Nenad Rakočević, and is both an imperative and functional programming language. Its syntax and general usage overlap with those of the interpreted Rebol language.

Chris Sander is a computational biologist based at the Dana–Farber Cancer Institute and Harvard Medical School. Previously he was chair of the Computational Biology Programme at the Memorial Sloan–Kettering Cancer Center in New York City. In 2015, he moved his lab to the Dana–Farber Cancer Institute and the Cell Biology Department at Harvard Medical School.

David Sankoff is a Canadian mathematician, bioinformatician, computer scientist and linguist. He holds the Canada Research Chair in Mathematical Genomics in the Mathematics and Statistics Department at the University of Ottawa, and is cross-appointed to the Biology Department and the School of Information Technology and Engineering. He was founding editor of the scientific journal Language Variation and Change (Cambridge) and serves on the editorial boards of a number of bioinformatics, computational biology and linguistics journals. Sankoff is best known for his pioneering contributions in computational linguistics and computational genomics. He is considered to be one of the founders of bioinformatics. In particular, he had a key role in introducing dynamic programming for sequence alignment and other problems in computational biology. In Pavel Pevzner's words, "[Michael Waterman] and David Sankoff are responsible for transforming bioinformatics from a 'stamp collection' of ill-defined problems into a rigorous discipline with important biological applications."

In molecular phylogenetics, relationships among individuals are determined using character traits, such as DNA, RNA or protein, which may be obtained using a variety of sequencing technologies. High-throughput next-generation sequencing has become a popular technique in transcriptomics, which represent a snapshot of gene expression. In eukaryotes, making phylogenetic inferences using RNA is complicated by alternative splicing, which produces multiple transcripts from a single gene. As such, a variety of approaches may be used to improve phylogenetic inference using transcriptomic data obtained from RNA-Seq and processed using computational phylogenetics.

Christophe Dessimoz is a Swiss National Science Foundation (SNSF) Professor at the University of Lausanne, Associate Professor at University College London and a group leader at the Swiss Institute of Bioinformatics. He was awarded the Overton Prize in 2019 for his contributions to computational biology. Starting in April 2022, he will be joint executive director of the SIB Swiss Institute of Bioinformatics, along with Ron Appel.

References

[1] Gonnet, G. H.; Hallett, M. T.; Korostensky, C.; Bernardin, L. (2000). "Darwin v2.0: an interpreted computer language for the biosciences". Bioinformatics. 16 (2): 101–103. doi:10.1093/bioinformatics/16.2.101. hdl:20.500.11850/422531. PMID 10842729. S2CID 1531041.
[2] "Personal page of Gaston Gonnet". Retrieved 2017-11-10.
[3] Haigh, Thomas (2005), Gaston Gonnet Oral history interview, 16–18 March, 2005, Zurich, Switzerland, Philadelphia, PA: Society for Industrial and Applied Mathematics
[4] "OMA Standalone". Retrieved 2017-11-10.
[5] "OMA: web-based database interface for orthology prediction". Retrieved 2017-11-10.
[6] "The Darwin Manual". Retrieved 2017-11-10.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.


