Computational biology

This timeline displays the year-by-year progress of the Human Genome Project in the context of genetics since 1865. Starting in 1990, by 1999, chromosome 22 became the first human chromosome to be completely sequenced.

Computational biology refers to the use of data analysis, mathematical modeling and computational simulations to understand biological systems and relationships. [1] An intersection of computer science, biology, and big data, the field also has foundations in applied mathematics, chemistry, and genetics. [2] It differs from biological computing, a subfield of computer science and engineering which uses bioengineering to build computers.


History

Bioinformatics, the analysis of informatics processes in biological systems, began in the early 1970s. At this time, research in artificial intelligence was using network models of the human brain in order to generate new algorithms. This use of biological data pushed biological researchers to use computers to evaluate and compare large data sets in their own field. [3]

By 1982, researchers shared information via punch cards. The amount of data grew exponentially by the end of the 1980s, requiring new computational methods for quickly interpreting relevant information. [3]

Perhaps the best-known example of computational biology, the Human Genome Project, officially began in 1990. [4] By 2003, the project had mapped around 85% of the human genome, satisfying its initial goals. [5] Work continued, however, and by 2021 the "complete genome" level was reached, with only 0.3% of bases covered by potential issues. [6] [7] The missing Y chromosome was added in January 2022.

Since the late 1990s, computational biology has become an important part of biology, leading to numerous subfields. [8] Today, the International Society for Computational Biology recognizes 21 different 'Communities of Special Interest', each representing a slice of the larger field. [9] In addition to helping sequence the human genome, computational biology has helped create accurate models of the human brain, map the 3D structure of genomes, and model biological systems. [3]

Applications

Anatomy

Computational anatomy is the study of anatomical shape and form at the visible or gross anatomical scale of morphology. It involves the development of computational, mathematical, and data-analytical methods for modeling and simulating biological structures. It focuses on the anatomical structures being imaged, rather than the medical imaging devices. Due to the availability of dense 3D measurements via technologies such as magnetic resonance imaging, computational anatomy has emerged as a subfield of medical imaging and bioengineering for extracting anatomical coordinate systems at the morphome scale in 3D.

The original formulation of computational anatomy is as a generative model of shape and form from exemplars acted upon via transformations. [10] The diffeomorphism group is used to study different coordinate systems via coordinate transformations as generated via the Lagrangian and Eulerian velocities of flow from one anatomical configuration into another. It is related to shape statistics and morphometrics, with the distinction that diffeomorphisms are used to map coordinate systems, the study of which is known as diffeomorphometry.

Data and modeling

Mathematical biology is the use of mathematical models of living organisms to examine the systems that govern structure, development, and behavior in biological systems. This entails a more theoretical approach to problems, in contrast to the more empirical approach of experimental biology. [11] Mathematical biology draws on discrete mathematics, topology (also useful for computational modeling), Bayesian statistics, linear algebra, and Boolean algebra. [12]

These mathematical approaches have enabled the creation of databases and other methods for storing, retrieving, and analyzing biological data, a field known as bioinformatics. Usually, this process involves genetics and analyzing genes.

Gathering and analyzing large datasets has made room for growing research fields such as data mining [12] and computational biomodeling, which refers to building computer models and visual simulations of biological systems. This allows researchers to predict how such systems will react to different environments, which is useful for determining whether a system can "maintain their state and functions against external and internal perturbations". [13] While current techniques focus on small biological systems, researchers are working on approaches that will allow larger networks to be analyzed and modeled. A majority of researchers believe this will be essential in developing modern medical approaches to creating new drugs and gene therapy. [13] A useful modeling approach is to use Petri nets via tools such as esyN. [14]
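As a minimal sketch of the Petri-net style of biomodeling (independent of any particular tool such as esyN; the place and transition names here are purely illustrative), a token-passing simulation of a toy enzymatic reaction can be written in a few lines of Python:

```python
# Minimal Petri net sketch: places hold token counts, and a transition
# fires when all of its input places hold enough tokens.

def fire(marking, inputs, outputs):
    """Fire one transition: consume tokens from inputs, add to outputs.
    Returns the new marking, or None if the transition is not enabled."""
    if any(marking.get(p, 0) < n for p, n in inputs.items()):
        return None  # not enabled
    new = dict(marking)
    for p, n in inputs.items():
        new[p] -= n
    for p, n in outputs.items():
        new[p] = new.get(p, 0) + n
    return new

# Toy enzymatic reaction: substrate + enzyme -> product + enzyme
marking = {"substrate": 3, "enzyme": 1, "product": 0}
reaction = ({"substrate": 1, "enzyme": 1}, {"product": 1, "enzyme": 1})

# Fire the transition repeatedly until it is no longer enabled.
while (next_marking := fire(marking, *reaction)) is not None:
    marking = next_marking

print(marking)  # {'substrate': 0, 'enzyme': 1, 'product': 3}
```

Real tools add stochastic or timed firing semantics on top of this basic token game; the sketch only shows the core state-update rule.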

Along similar lines, until recent decades theoretical ecology has largely dealt with analytic models that were detached from the statistical models used by empirical ecologists. However, computational methods have aided in developing ecological theory via simulation of ecological systems, in addition to increasing application of methods from computational statistics in ecological analyses.

Systems biology

Systems biology consists of computing the interactions between various biological systems ranging from the cellular level to entire populations with the goal of discovering emergent properties. This process usually involves networking cell signaling and metabolic pathways. Systems biology often uses computational techniques from biological modeling and graph theory to study these complex interactions at cellular levels. [12]

Evolutionary biology

Computational biology has assisted evolutionary biology by:

Genomics

A partially sequenced genome

Computational genomics is the study of the genomes of cells and organisms. The Human Genome Project is one example of computational genomics. This project looks to sequence the entire human genome into a set of data. Once fully implemented, this could allow doctors to analyze the genome of an individual patient. [16] This opens the possibility of personalized medicine, prescribing treatments based on an individual's pre-existing genetic patterns. Researchers are also looking to sequence the genomes of animals, plants, bacteria, and all other types of life. [17]

One of the main ways that genomes are compared is by sequence homology. Homology is the study of biological structures and nucleotide sequences in different organisms that come from a common ancestor. Research suggests that between 80 and 90% of genes in newly sequenced prokaryotic genomes can be identified this way. [17]

Sequence alignment is another process for comparing and detecting similarities between biological sequences or genes. Sequence alignment is useful in a number of bioinformatics applications, such as computing the longest common subsequence of two genes or comparing variants of certain diseases. [18]
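For instance, the longest common subsequence mentioned above can be computed with standard dynamic programming; the short sequences below are made-up toy examples, not real gene data:

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences,
    computed with the classic O(len(a) * len(b)) dynamic program."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n]

# Toy nucleotide strings: the LCS here is "ACGG", of length 4.
print(lcs_length("ACCGGTA", "ACTGG"))  # 4
```

Practical alignment tools use refinements of this idea (scoring matrices, gap penalties, heuristics for speed), but the recurrence above is the common core.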

A largely unexplored area of computational genomics is the analysis of intergenic regions, which comprise roughly 97% of the human genome. [17] Researchers are working to understand the functions of non-coding regions of the human genome through the development of computational and statistical methods and via large consortia projects such as ENCODE and the Roadmap Epigenomics Project.

Understanding how individual genes contribute to the biology of an organism at the molecular, cellular, and organism levels is known as gene ontology. The Gene Ontology Consortium's mission is to develop an up-to-date, comprehensive, computational model of biological systems, from the molecular level to larger pathways, cellular, and organism-level systems. The Gene Ontology resource provides a computational representation of current scientific knowledge about the functions of genes (or, more properly, the protein and non-coding RNA molecules produced by genes) from many different organisms, from humans to bacteria. [19]

3D genomics is a subfield of computational biology that focuses on the organization and interaction of genes within a eukaryotic cell. One method used to gather 3D genomic data is Genome Architecture Mapping (GAM). GAM measures 3D distances of chromatin and DNA in the genome by combining cryosectioning, the process of cutting a thin slice from the nucleus to examine the DNA, with laser microdissection. A nuclear profile is simply this slice taken from the nucleus. Each nuclear profile contains genomic windows, which are certain sequences of nucleotides, the base units of DNA. GAM captures a genome network of complex, multi-enhancer chromatin contacts throughout a cell. [20]

Neuroscience

Computational neuroscience is the study of brain function in terms of the information processing properties of the nervous system. A subset of neuroscience, it looks to model the brain to examine specific aspects of the neurological system. [21] Models of the brain include:

It is the work of computational neuroscientists to improve the algorithms and data structures currently used to increase the speed of such calculations.
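As a toy illustration of the kind of single-neuron model such simulations build on, a leaky integrate-and-fire neuron (a standard textbook model; the parameter values below are illustrative, not drawn from the sources above) can be simulated in a few lines:

```python
# Leaky integrate-and-fire neuron: the membrane potential integrates an
# input current, leaks back toward rest, and emits a spike on crossing
# a threshold, after which it resets. All parameter values are toy choices.

def simulate_lif(current, steps, dt=1.0, tau=10.0, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0):
    """Return the list of time steps at which the neuron spikes."""
    v = v_rest
    spikes = []
    for t in range(steps):
        # Euler step of: tau * dv/dt = -(v - v_rest) + current
        v += dt * (-(v - v_rest) + current) / tau
        if v >= v_thresh:
            spikes.append(t)
            v = v_reset  # membrane resets after a spike
    return spikes

spikes = simulate_lif(current=1.5, steps=100)
print(spikes[:3])  # regular spiking: [10, 21, 32]
```

A subthreshold current (here, anything whose steady-state potential stays below the threshold) produces no spikes at all, which is the qualitative behavior the model is meant to capture.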

Computational neuropsychiatry is an emerging field that uses mathematical and computer-assisted modeling of brain mechanisms involved in mental disorders. Several initiatives have demonstrated that computational modeling makes an important contribution to understanding neuronal circuits that could generate mental functions and dysfunctions. [23] [24] [25]

Pharmacology

Computational pharmacology is "the study of the effects of genomic data to find links between specific genotypes and diseases and then screening drug data". [26] The pharmaceutical industry requires a shift in methods to analyze drug data. Pharmacologists were able to use Microsoft Excel to compare chemical and genomic data related to the effectiveness of drugs. However, the industry has reached what is referred to as the Excel barricade, which arises from the limited number of cells accessible in a spreadsheet. This development led to the need for computational pharmacology. Scientists and researchers develop computational methods to analyze these massive data sets, allowing efficient comparison between notable data points and enabling more accurate drugs to be developed. [27]

Analysts project that if major medications fail due to expiring patents, computational biology will be necessary to replace current drugs on the market. Doctoral students in computational biology are being encouraged to pursue careers in industry rather than take postdoctoral positions. This is a direct result of major pharmaceutical companies needing more qualified analysts of the large data sets required for producing new drugs. [27]

Similarly, computational oncology aims to determine future mutations in cancer through algorithmic approaches. Research in this field has led to the use of high-throughput measurements that collect millions of data points using robotics and other sensing devices. This data is collected from DNA, RNA, and other biological structures. Areas of focus include determining the characteristics of tumors, analyzing molecules that are deterministic in causing cancer, and understanding how the human genome relates to the causation of tumors and cancer. [28] [29]

Techniques

Computational biologists use a wide range of software and algorithms to carry out their research.

Unsupervised learning

Unsupervised learning is a type of algorithm that finds patterns in unlabeled data. One example is k-means clustering, which aims to partition n data points into k clusters such that each data point belongs to the cluster with the nearest mean. A variant, the k-medoids algorithm, selects an actual data point from the set as each cluster center (the medoid), rather than using the cluster's average.

A heat-map of the Jaccard distances of nuclear profiles

The algorithm follows these steps:

  1. Randomly select k distinct data points as the initial cluster centers.
  2. Measure the distance between each point and each of the k cluster centers.
  3. Assign each point to the nearest cluster center.
  4. Recompute the center of each cluster as its medoid.
  5. Repeat steps 2–4 until the clusters no longer change.
  6. Assess the quality of the clustering by adding up the variation within each cluster.
  7. Repeat the process with different values of k.
  8. Pick the best value of k by finding the "elbow" in the plot of total within-cluster variation versus k.
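The clustering steps can be sketched as a toy k-medoids implementation for a fixed k (omitting the elbow search over different k values); the one-dimensional points and the distance function are illustrative:

```python
import random

def k_medoids(points, k, dist, seed=0):
    """Toy k-medoids: random init, assign to nearest medoid, update each
    medoid to the member minimizing total in-cluster distance, repeat."""
    rng = random.Random(seed)
    medoids = rng.sample(points, k)
    while True:
        # Assign each point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for p in points:
            nearest = min(medoids, key=lambda m: dist(p, m))
            clusters[nearest].append(p)
        # Pick, per cluster, the member with the lowest total distance.
        new_medoids = [
            min(members, key=lambda c: sum(dist(c, p) for p in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            return clusters  # converged: clusters no longer change
        medoids = new_medoids

# Two well-separated 1D groups; absolute difference as the distance.
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
clusters = k_medoids(points, 2, dist=lambda a, b: abs(a - b))
print(sorted(map(sorted, clusters.values())))
# [[1.0, 1.5, 2.0], [8.0, 8.5, 9.0]]
```

Because each cluster center is itself a data point, the same code works with any distance function over the data, including the Jaccard distance discussed below.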

One biological example is the 3D mapping of a genome. Data on the HIST1 region of mouse chromosome 13 is gathered from the Gene Expression Omnibus. [30] This data records which nuclear profiles show up in certain genomic regions. With this information, the Jaccard distance can be used to find a normalized distance between all the loci.
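The Jaccard distance itself is a one-line set computation. A minimal sketch, where each genomic window is represented by the set of nuclear profiles it was detected in (the NP names and windows below are made up for illustration):

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |intersection| / |union|."""
    if not a and not b:
        return 0.0  # two empty sets are conventionally at distance 0
    return 1 - len(a & b) / len(a | b)

# Hypothetical detection profiles: which nuclear profiles (NP1..NP4)
# contained each genomic window.
window_a = {"NP1", "NP2", "NP3"}
window_b = {"NP2", "NP3", "NP4"}
print(jaccard_distance(window_a, window_b))  # 1 - 2/4 = 0.5
```

Computing this distance for every pair of windows yields a matrix like the heat-map shown above, which can then be fed to a clustering algorithm.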

Graph analytics

Graph analytics, or network analysis, is the study of graphs that represent connections between different objects. Graphs can represent all kinds of networks in biology, such as protein-protein interaction networks, regulatory networks, and metabolic and biochemical networks. One way to analyze these networks is to compute centrality, which ranks nodes by their importance or connectedness in the graph. For example, given data on the activity of genes over a period of time, degree centrality can reveal which genes are most active throughout the network and which interact with others most often, helping to clarify the roles certain genes play in the network.

There are many ways to calculate centrality in graphs, each of which can give different kinds of information. Centrality analysis in biology can be applied in many different circumstances, including gene regulatory, protein interaction, and metabolic networks. [31]
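Degree centrality, the simplest of these measures, can be sketched directly from an edge list; the gene names below are illustrative placeholders, not real interaction data:

```python
from collections import defaultdict

def degree_centrality(edges):
    """Degree of each node in an undirected graph given as edge pairs."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)

# Hypothetical gene-interaction edges.
edges = [("geneA", "geneB"), ("geneA", "geneC"),
         ("geneA", "geneD"), ("geneB", "geneC")]
ranking = degree_centrality(edges)
print(max(ranking, key=ranking.get))  # geneA is the hub, with degree 3
```

Other centrality measures (betweenness, closeness, eigenvector) follow the same pattern of assigning each node a score, but weigh paths through the network rather than raw edge counts.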

Supervised learning

Supervised learning is a type of algorithm that learns from labeled data and learns how to assign labels to future data that is unlabeled. In biology, supervised learning is helpful when we have data that we know how to categorize and would like to categorize more data into those categories.

Diagram showing a simple random forest

A common supervised learning algorithm is the random forest, which uses numerous decision trees to train a model to classify a dataset. Forming the basis of the random forest, a decision tree is a structure which aims to classify, or label, some set of data using certain known features of that data. A practical biological example would be taking an individual's genetic data and predicting whether that individual is predisposed to develop a certain disease or cancer. At each internal node the algorithm checks the dataset for exactly one feature, a specific gene in the previous example, and then branches left or right based on the result. At each leaf node, the decision tree assigns a class label to the dataset. So in practice, the algorithm walks a specific root-to-leaf path through the decision tree based on the input dataset, which results in the classification of that dataset. Commonly, decision trees have target variables that take on discrete values, like yes/no, in which case the tree is referred to as a classification tree; if the target variable is continuous, it is called a regression tree. Before use, a decision tree must be trained on a training set to identify which features are the best predictors of the target variable.
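The root-to-leaf walk and the forest's majority vote described above can be sketched in plain Python. The trees here are hard-coded rather than trained, and all gene names and labels are illustrative:

```python
# A hand-rolled decision-tree forest: each internal node (a dict) tests one
# boolean feature and branches; leaves (strings) assign a class label.

def predict_tree(node, sample):
    """Walk one root-to-leaf path through a tree for a given sample."""
    while isinstance(node, dict):
        branch = "yes" if sample[node["feature"]] else "no"
        node = node[branch]
    return node  # leaf label

# Three tiny trees, hard-coded for illustration (a real forest would be
# trained on labeled data, each tree on a random subset of it).
forest = [
    {"feature": "geneA", "yes": "predisposed", "no": "not predisposed"},
    {"feature": "geneB", "yes": "predisposed",
     "no": {"feature": "geneA", "yes": "predisposed", "no": "not predisposed"}},
    {"feature": "geneC", "yes": "not predisposed", "no": "predisposed"},
]

def predict_forest(forest, sample):
    """Random-forest-style majority vote over the individual trees."""
    votes = [predict_tree(tree, sample) for tree in forest]
    return max(set(votes), key=votes.count)

sample = {"geneA": True, "geneB": False, "geneC": True}
print(predict_forest(forest, sample))  # predisposed (2 of 3 trees agree)
```

The point of the ensemble is that individual trees can disagree, as the third tree does here, while the majority vote remains stable.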

Open source software

Open source software provides a platform for computational biology where everyone can access and benefit from software developed in research. PLOS cites[citation needed] four main reasons for the use of open source software:

Research

There are several large conferences that are concerned with computational biology. Some notable examples are Intelligent Systems for Molecular Biology, European Conference on Computational Biology and Research in Computational Molecular Biology.

There are also numerous journals dedicated to computational biology. Some notable examples include the Journal of Computational Biology and PLOS Computational Biology, a peer-reviewed open access journal that has published many notable research projects in the field of computational biology. They provide reviews on software, tutorials for open source software, and information on upcoming computational biology conferences.[citation needed]

Computational biology, bioinformatics and mathematical biology are all interdisciplinary approaches to the life sciences that draw from quantitative disciplines such as mathematics and information science. The NIH describes computational/mathematical biology as the use of computational/mathematical approaches to address theoretical and experimental questions in biology and, by contrast, bioinformatics as the application of information science to understand complex life-sciences data. [1]

Specifically, the NIH defines

Computational biology: The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems. [1]

Bioinformatics: Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data. [1]

While each field is distinct, there may be significant overlap at their interface, [1] so much so that many use the terms bioinformatics and computational biology interchangeably.

The terms computational biology and evolutionary computation have similar names, but they are not to be confused. Unlike computational biology, evolutionary computation is not concerned with modeling and analyzing biological data. Instead, it creates algorithms based on the ideas of evolution across species. Sometimes referred to as genetic algorithms, research in this field can be applied to computational biology. While evolutionary computation is not inherently part of computational biology, computational evolutionary biology is a subfield of it. [33]

See also

Related Research Articles

Bioinformatics: Computational analysis of large, complex sets of biological data

Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, chemistry, physics, computer science, computer programming, information engineering, mathematics and statistics to analyze and interpret biological data. The subsequent process of analyzing and interpreting data is referred to as computational biology.

Systems biology: Computational and mathematical modeling of complex biological systems

Systems biology is the computational and mathematical analysis and modeling of complex biological systems. It is a biology-based interdisciplinary field of study that focuses on complex interactions within biological systems, using a holistic approach to biological research.

In computational biology, gene prediction or gene finding refers to the process of identifying the regions of genomic DNA that encode genes. This includes protein-coding genes as well as RNA genes, but may also include prediction of other functional elements such as regulatory regions. Gene finding is one of the first and most important steps in understanding the genome of a species once it has been sequenced.

Comparative genomics

Comparative genomics is a field of biological research in which the genomic features of different organisms are compared. The genomic features may include the DNA sequence, genes, gene order, regulatory sequences, and other genomic structural landmarks. In this branch of genomics, whole or large parts of genomes resulting from genome projects are compared to study basic biological similarities and differences as well as evolutionary relationships between organisms. The major principle of comparative genomics is that common features of two organisms will often be encoded within the DNA that is evolutionarily conserved between them. Therefore, comparative genomic approaches start with making some form of alignment of genome sequences and looking for orthologous sequences in the aligned genomes and checking to what extent those sequences are conserved. Based on these, genome and molecular evolution are inferred and this may in turn be put in the context of, for example, phenotypic evolution or population genetics.

Modelling biological systems is a significant task of systems biology and mathematical biology. Computational systems biology aims to develop and use efficient algorithms, data structures, visualization and communication tools with the goal of computer modelling of biological systems. It involves the use of computer simulations of biological systems, including cellular subsystems, to both analyze and visualize the complex connections of these cellular processes.

Computational genomics refers to the use of computational and statistical analysis to decipher biology from genome sequences and related data, including both DNA and RNA sequence as well as other "post-genomic" data. In combination with computational and statistical approaches to understanding gene function and statistical association analysis, the field is also often referred to as computational and statistical genetics/genomics. As such, computational genomics may be regarded as a subset of bioinformatics and computational biology, but with a focus on using whole genomes to understand the principles of how the DNA of a species controls its biology at the molecular level and beyond. With the current abundance of massive biological datasets, computational studies have become one of the most important means to biological discovery.

Metabolic network modelling: Form of biological modelling

Metabolic network modelling, also known as metabolic network reconstruction or metabolic pathway analysis, allows for an in-depth insight into the molecular mechanisms of a particular organism. In particular, these models correlate the genome with molecular physiology. A reconstruction breaks down metabolic pathways into their respective reactions and enzymes, and analyzes them within the perspective of the entire network. In simplified terms, a reconstruction collects all of the relevant metabolic information of an organism and compiles it in a mathematical model. Validation and analysis of reconstructions can allow identification of key features of metabolism such as growth yield, resource distribution, network robustness, and gene essentiality. This knowledge can then be applied to create novel biotechnology.

David Haussler: American bioinformatician

David Haussler is an American bioinformatician known for leading the team that assembled the first human genome sequence in the race to complete the Human Genome Project, and subsequently for comparative genome analysis that deepens understanding of the molecular function and evolution of the genome.

Biological network inference

Biological network inference is the process of making inferences and predictions about biological networks. By using these networks to analyze patterns in biological systems, such as food-webs, we can visualize the nature and strength of these interactions between species, DNA, proteins, and more.

Physiomics is a systematic study of physiome in biology. Physiomics employs bioinformatics to construct networks of physiological features that are associated with genes, proteins and their networks. A few of the methods for determining individual relationships between the DNA sequence and physiological function include metabolic pathway engineering and RNAi analysis. The relationships derived from methods such as these are organized and processed computationally to form distinct networks. Computer models use these experimentally determined networks to develop further predictions of gene function.

Protein function prediction methods are techniques that bioinformatics researchers use to assign biological or biochemical roles to proteins. These proteins are usually ones that are poorly studied or predicted based on genomic sequence data. These predictions are often driven by data-intensive computational procedures. Information may come from nucleic acid sequence homology, gene expression profiles, protein domain structures, text mining of publications, phylogenetic profiles, phenotypic profiles, and protein-protein interaction. Protein function is a broad term: the roles of proteins range from catalysis of biochemical reactions to transport to signal transduction, and a single protein may play a role in multiple processes or cellular pathways.

De novo transcriptome assembly is the de novo sequence assembly method of creating a transcriptome without the aid of a reference genome.

Translational bioinformatics (TBI) is a field that emerged in the 2010s to study health informatics, focused on the convergence of molecular bioinformatics, biostatistics, statistical genetics and clinical informatics. Its focus is on applying informatics methodology to the increasing amount of biomedical and genomic data to formulate knowledge and medical tools, which can be utilized by scientists, clinicians, and patients. Furthermore, it involves applying biomedical research to improve human health through the use of computer-based information systems. TBI employs data mining and analysis of biomedical informatics in order to generate clinical knowledge for application. Clinical knowledge includes finding similarities in patient populations, interpreting biological information to suggest therapy treatments, and predicting health outcomes.

Ron Shamir: Israeli professor of computer science (born 1953)

Ron Shamir is an Israeli professor of computer science known for his work in graph theory and in computational biology. He holds the Raymond and Beverly Sackler Chair in Bioinformatics, and is the founder and former head of the Edmond J. Safra Center for Bioinformatics at Tel Aviv University.

Gary Stormo: American geneticist (born 1950)

Gary Stormo is an American geneticist and currently Joseph Erlanger Professor in the Department of Genetics and the Center for Genome Sciences and Systems Biology at Washington University School of Medicine in St Louis. He is considered one of the pioneers of bioinformatics and genomics. His research combines experimental and computational approaches in order to identify and predict regulatory sequences in DNA and RNA, and their contributions to the regulatory networks that control gene expression.

In bioinformatics, a Gene Disease Database is a systematized collection of data, typically structured to model aspects of reality, in a way to comprehend the underlying mechanisms of complex diseases, by understanding multiple composite interactions between phenotype-genotype relationships and gene-disease mechanisms. Gene Disease Databases integrate human gene-disease associations from various expert curated databases and text mining derived associations including Mendelian, complex and environmental diseases.

Machine learning in bioinformatics is the application of machine learning algorithms to bioinformatics, including genomics, proteomics, microarrays, systems biology, evolution, and text mining.

Christos A. Ouzounis is a computational biologist, a director of research at the CERTH, and Professor of Bioinformatics at Aristotle University in Thessaloniki.

Genome informatics

Genome informatics is the scientific study of information processing in genomes.

References

  1. "NIH working definition of bioinformatics and computational biology" (PDF). Biomedical Information Science and Technology Initiative. 17 July 2000. Archived from the original (PDF) on 5 September 2012. Retrieved 18 August 2012.
  2. "About the CCMB". Center for Computational Molecular Biology. Retrieved 18 August 2012.
  3. Hogeweg, Paulien (7 March 2011). "The Roots of Bioinformatics in Theoretical Biology". PLOS Computational Biology. 7 (3): e1002021. Bibcode:2011PLSCB...7E2021H. doi: 10.1371/journal.pcbi.1002021 . PMC   3068925 . PMID   21483479.
  4. "The Human Genome Project". The Human Genome Project. 22 December 2020. Retrieved 13 April 2022.
  5. "Human Genome Project FAQ". National Human Genome Research Institute. February 24, 2020. Archived from the original on Apr 23, 2022. Retrieved 2022-04-20.
  6. "T2T-CHM13v1.1 - Genome - Assembly". NCBI. Archived from the original on Jun 29, 2023. Retrieved 2022-04-20.
  7. "Genome List - Genome". NCBI. Retrieved 2022-04-20.
  8. Bourne, Philip (2012). "Rise and Demise of Bioinformatics? Promise and Progress". PLOS Computational Biology. 8 (4): e1002487. Bibcode:2012PLSCB...8E2487O. doi: 10.1371/journal.pcbi.1002487 . PMC   3343106 . PMID   22570600.
  9. "COSI Information". www.iscb.org. Archived from the original on 2022-04-21. Retrieved 2022-04-21.
  10. Grenander, Ulf; Miller, Michael I. (1998-12-01). "Computational Anatomy: An Emerging Discipline". Q. Appl. Math. 56 (4): 617–694. doi: 10.1090/qam/1668732 .
  11. "Mathematical Biology | Faculty of Science". www.ualberta.ca. Retrieved 2022-04-18.
  12. "The Sub-fields of Computational Biology". Ninh Laboratory of Computational Biology. 2013-02-18. Retrieved 2022-04-18.
  13. Kitano, Hiroaki (14 November 2002). "Computational systems biology". Nature. 420 (6912): 206–10. Bibcode:2002Natur.420..206K. doi:10.1038/nature01254. PMID   12432404. S2CID   4401115. ProQuest   204483859.
  14. Favrin, Bean (2 September 2014). "esyN: Network Building, Sharing and Publishing". PLOS ONE. 9 (9): e106035. Bibcode:2014PLoSO...9j6035B. doi: 10.1371/journal.pone.0106035 . PMC   4152123 . PMID   25181461.
  15. Antonio Carvajal-Rodríguez (2012). "Simulation of Genes and Genomes Forward in Time". Current Genomics. 11 (1): 58–61. doi:10.2174/138920210790218007. PMC   2851118 . PMID   20808525.
  16. "Genome Sequencing to the Rest of Us". Scientific American.
  17. Koonin, Eugene (6 March 2001). "Computational Genomics". Curr. Biol. 11 (5): 155–158. Bibcode:2001CBio...11.R155K. doi: 10.1016/S0960-9822(01)00081-1 . PMID   11267880. S2CID   17202180.
  18. "Sequence Alignment - an overview | ScienceDirect Topics". www.sciencedirect.com. Retrieved 2022-04-18.
  19. "Gene Ontology Resource". Gene Ontology Resource. Retrieved 2022-04-18.
  20. Beagrie, Robert A.; Scialdone, Antonio; Schueler, Markus; Kraemer, Dorothee C. A.; Chotalia, Mita; Xie, Sheila Q.; Barbieri, Mariano; de Santiago, Inês; Lavitas, Liron-Mark; Branco, Miguel R.; Fraser, James (March 2017). "Complex multi-enhancer contacts captured by genome architecture mapping". Nature. 543 (7646): 519–524. Bibcode:2017Natur.543..519B. doi:10.1038/nature21411. ISSN   1476-4687. PMC   5366070 . PMID   28273065.
  21. "Computational Neuroscience | Neuroscience". www.bu.edu.
  22. Sejnowski, Terrence; Christof Koch; Patricia S. Churchland (9 September 1988). "Computational Neuroscience". Science. 241 (4871): 1299–306. Bibcode:1988Sci...241.1299S. doi:10.1126/science.3045969. PMID   3045969.
  23. Dauvermann, Maria R.; Whalley, Heather C.; Schmidt, Andrã©; Lee, Graham L.; Romaniuk, Liana; Roberts, Neil; Johnstone, Eve C.; Lawrie, Stephen M.; Moorhead, Thomas W. J. (2014). "Computational Neuropsychiatry – Schizophrenia as a Cognitive Brain Network Disorder". Frontiers in Psychiatry. 5: 30. doi: 10.3389/fpsyt.2014.00030 . PMC   3971172 . PMID   24723894.
  24. Tretter, F.; Albus, M. (December 2007). "'Computational Neuropsychiatry' of Working Memory Disorders in Schizophrenia: The Network Connectivity in Prefrontal Cortex - Data and Models". Pharmacopsychiatry. 40 (S 1): S2–S16. doi:10.1055/S-2007-993139. S2CID   18574327.
  25. Marin-Sanguino, A.; Mendoza, E. (2008). "Hybrid Modeling in Computational Neuropsychiatry". Pharmacopsychiatry. 41: S85–S88. doi:10.1055/s-2008-1081464. PMID   18756425. S2CID   22996341.
  26. Price, Michael (2012-04-13). "Computational Biologists: The Next Pharma Scientists?".
  27. Jessen, Walter (2012-04-15). "Pharma's shifting strategy means more jobs for computational biologists".
  28. Barbolosi, Dominique; Ciccolini, Joseph; Lacarelle, Bruno; Barlesi, Fabrice; Andre, Nicolas (2016). "Computational oncology--mathematical modelling of drug regimens for precision medicine". Nature Reviews Clinical Oncology. 13 (4): 242–254. doi:10.1038/nrclinonc.2015.204. PMID   26598946. S2CID   22492353.
  29. Yakhini, Zohar (2011). "Cancer Computational Biology". BMC Bioinformatics. 12: 120. doi: 10.1186/1471-2105-12-120 . PMC   3111371 . PMID   21521513.
  30. "GEO Accession viewer".
  31. Koschützki, Dirk; Schreiber, Falk (2008-05-15). "Centrality Analysis Methods for Biological Networks and Their Application to Gene Regulatory Networks". Gene Regulation and Systems Biology. 2: 193–201. doi:10.4137/grsb.s702. ISSN   1177-6250. PMC   2733090 . PMID   19787083.
  32. Prlić, Andreas; Lapp, Hilmar (2012). "The PLOS Computational Biology Software Section". PLOS Computational Biology. 8 (11): e1002799. Bibcode:2012PLSCB...8E2799P. doi: 10.1371/journal.pcbi.1002799 . PMC   3510099 .
  33. Foster, James (June 2001). "Evolutionary Computation". Nature Reviews Genetics. 2 (6): 428–436. doi:10.1038/35076523. PMID   11389459. S2CID   205017006.