Genome informatics

A section of DNA; the sequence of the plate-like units (nucleotides) in the center carries information.

Genome informatics (also genoinformatics or genetic information processing) [1] is the scientific study of information processing in genomes.

Introduction

Information processing and information flow occur in the course of an organism's development and throughout its lifespan. [2] The essence of computation is information processing, and the essence of biological information processing is control of the molecular events inside a cell. [3] Genome informatics introduces computational techniques and applies them to derive information from genome sequences. [4] It includes methods to analyze DNA sequence information and to predict protein sequence and structure. [4] Methods for studying large genomic data sets include variant calling, transcriptomic analysis, and variant interpretation. [5] The field deals with microbial genomics and metagenomics, sequencing algorithms, variant discovery and genome assembly, evolution, complex traits and phylogenetics, personal and medical genomics, transcriptomics, and genome structure and function. [6] Genoinformatics also refers to genome and chromosome dynamics, quantitative biology and modeling, and molecular and cellular pathologies. [7] Genome informatics also includes the field of genome design. Much remains to be developed in genome informatics, such as identifying potential diseases, searching for treatments, and explaining illnesses that have no apparent cause. Genome informatics has several main applications.
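
As an illustration of analyzing DNA sequence information and predicting a protein sequence from it, the following is a minimal Python sketch. It is not taken from the cited sources: the function names `gc_content` and `translate` and the toy sequence are assumptions made for this example. It reads only one frame of one strand and uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each position ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def gc_content(dna: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)


def translate(dna: str) -> str:
    """Translate a coding DNA sequence in frame 0 until the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # "X" stands in for codons containing unknown bases (an assumption here).
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


example = "ATGGCCTAA"  # hypothetical toy sequence: Met, Ala, stop
print(translate(example))
print(round(gc_content(example), 3))
```

Building the codon table from two short strings keeps the sketch compact. Predicting protein structure, which the paragraph above also mentions, is a much harder problem and is not attempted here.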

Applications

Biomolecular systems that can process information are sought for computational applications, because of their potential for parallelism and miniaturization and because their biocompatibility also makes them suitable for future biomedical applications. DNA has been used to design machines, motors, finite automata, logic gates, reaction networks and logic programs, amongst many other structures and dynamic behaviours. [10]

Related Research Articles

Bioinformatics: Computational analysis of large, complex sets of biological data

Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, chemistry, physics, computer science, computer programming, information engineering, mathematics and statistics to analyze and interpret biological data. The subsequent process of analyzing and interpreting data is referred to as computational biology.

Genomics: Discipline in genetics

Genomics is an interdisciplinary field of biology focusing on the structure, function, evolution, mapping, and editing of genomes. A genome is an organism's complete set of DNA, including all of its genes as well as its hierarchical, three-dimensional structural configuration. In contrast to genetics, which refers to the study of individual genes and their roles in inheritance, genomics aims at the collective characterization and quantification of all of an organism's genes, their interrelations and influence on the organism. Genes may direct the production of proteins with the assistance of enzymes and messenger molecules. In turn, proteins make up body structures such as organs and tissues as well as control chemical reactions and carry signals between cells. Genomics also involves the sequencing and analysis of genomes through the use of high-throughput DNA sequencing and bioinformatics to assemble and analyze the function and structure of entire genomes. Advances in genomics have triggered a revolution in discovery-based research and systems biology to facilitate understanding of even the most complex biological systems such as the brain.

Computational biology: Branch of biology

Computational biology refers to the use of data analysis, mathematical modeling and computational simulations to understand biological systems and relationships. An intersection of computer science, biology, and big data, the field also has foundations in applied mathematics, chemistry, and genetics. It differs from biological computing, a subfield of computer science and engineering which uses bioengineering to build computers.

Systems biology: Computational and mathematical modeling of complex biological systems

Systems biology is the computational and mathematical analysis and modeling of complex biological systems. It is a biology-based interdisciplinary field of study that focuses on complex interactions within biological systems, using a holistic approach to biological research.

Omics: Suffix in biology

The branches of science known informally as omics are various disciplines in biology whose names end in the suffix -omics, such as genomics, proteomics, metabolomics, metagenomics, phenomics and transcriptomics. Omics aims at the collective characterization and quantification of pools of biological molecules that translate into the structure, function, and dynamics of an organism or organisms.

The transcriptome is the set of all RNA transcripts, including coding and non-coding, in an individual or a population of cells. The term can also sometimes be used to refer to all RNAs, or just mRNA, depending on the particular experiment. The term transcriptome is a portmanteau of the words transcript and genome; it is associated with the process of transcript production during the biological process of transcription.

Computational genomics refers to the use of computational and statistical analysis to decipher biology from genome sequences and related data, including both DNA and RNA sequence as well as other "post-genomic" data. Because it combines these with computational and statistical approaches to understanding the function of genes and with statistical association analysis, the field is also often referred to as computational and statistical genetics/genomics. As such, computational genomics may be regarded as a subset of bioinformatics and computational biology, but with a focus on using whole genomes to understand the principles of how the DNA of a species controls its biology at the molecular level and beyond. With the current abundance of massive biological datasets, computational studies have become one of the most important means of biological discovery.

Mark Bender Gerstein is an American scientist working in bioinformatics and data science. As of 2009, he is co-director of the Yale Computational Biology and Bioinformatics program.

David Haussler: American bioinformatician

David Haussler is an American bioinformatician known for his work leading the team that assembled the first human genome sequence in the race to complete the Human Genome Project and subsequently for comparative genome analysis that deepens understanding of the molecular function and evolution of the genome.

RNA-Seq: Lab technique in cellular biology

RNA-Seq is a technique that uses next-generation sequencing (NGS) to reveal the presence and quantity of RNA molecules in a biological sample, providing a snapshot of gene expression in the sample, also known as the transcriptome.

Proteogenomics

Proteogenomics is a field of biological research that utilizes a combination of proteomics, genomics, and transcriptomics to aid in the discovery and identification of peptides. Proteogenomics is used to identify new peptides by comparing MS/MS spectra against a protein database that has been derived from genomic and transcriptomic information. Proteogenomics often refers to studies that use proteomic information, often derived from mass spectrometry, to improve gene annotations. The utilization of both proteomics and genomics data alongside advances in the availability and power of spectrographic and chromatographic technology led to the emergence of proteogenomics as its own field in 2004.

DNA annotation: The process of describing the structure and function of a genome

In molecular biology and genetics, DNA annotation or genome annotation is the process of describing the structure and function of the components of a genome, by analyzing and interpreting them in order to extract their biological significance and understand the biological processes in which they participate. Among other things, it identifies the locations of genes and all the coding regions in a genome and determines what those genes do.

In silico PCR

In silico PCR refers to computational tools used to calculate theoretical polymerase chain reaction (PCR) results using a given set of primers (probes) to amplify DNA sequences from a sequenced genome or transcriptome.
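
The idea can be sketched in a few lines of Python under strong simplifying assumptions: primers must match exactly, only one strand of a single linear template is searched, and only the first hit is reported. The function names and sequences below are hypothetical, and real tools typically account for more, such as mismatches.

```python
from typing import Optional

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the predicted amplicon, or None if either primer fails to bind.

    The forward primer is searched for directly on the given strand. The
    reverse primer anneals to the opposite strand, so its reverse complement
    is searched for downstream of the forward primer site.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical toy template and primer pair.
template = "TTGACCTGAAAACCCGGGTTTTCAGTCATT"
amplicon = in_silico_pcr(template, forward="GACCTG", reverse="TGACTG")
print(amplicon, len(amplicon) if amplicon else 0)
```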

Translational bioinformatics (TBI) is a field that emerged in the 2010s to study health informatics, focused on the convergence of molecular bioinformatics, biostatistics, statistical genetics and clinical informatics. Its focus is on applying informatics methodology to the increasing amount of biomedical and genomic data to formulate knowledge and medical tools that can be used by scientists, clinicians, and patients. Furthermore, it involves applying biomedical research to improve human health through the use of computer-based information systems. TBI employs data mining and the analysis of biomedical informatics in order to generate clinical knowledge for application. Clinical knowledge includes finding similarities in patient populations and interpreting biological information to suggest therapy treatments and predict health outcomes.

Gary Stormo: American geneticist (born 1950)

Gary Stormo is an American geneticist and currently Joseph Erlanger Professor in the Department of Genetics and the Center for Genome Sciences and Systems Biology at Washington University School of Medicine in St Louis. He is considered one of the pioneers of bioinformatics and genomics. His research combines experimental and computational approaches in order to identify and predict regulatory sequences in DNA and RNA, and their contributions to the regulatory networks that control gene expression.

In bioinformatics, a Gene Disease Database is a systematized collection of data, typically structured to model aspects of reality, that helps explain the underlying mechanisms of complex diseases by capturing the multiple composite interactions between phenotype-genotype relationships and gene-disease mechanisms. Gene Disease Databases integrate human gene-disease associations from various expert-curated databases and associations derived from text mining, including Mendelian, complex and environmental diseases.

Centre for Genomic Regulation

The Centre for Genomic Regulation is a biomedical and genomics research centre based in Barcelona. Most of its facilities and laboratories are located in the Barcelona Biomedical Research Park, in front of Somorrostro beach.

Transcriptomics technologies are the techniques used to study an organism's transcriptome, the sum of all of its RNA transcripts. The information content of an organism is recorded in the DNA of its genome and expressed through transcription. Here, mRNA serves as a transient intermediary molecule in the information network, whilst non-coding RNAs perform additional diverse functions. A transcriptome captures a snapshot in time of the total transcripts present in a cell. Transcriptomics technologies provide a broad account of which cellular processes are active and which are dormant. A major challenge in molecular biology is to understand how a single genome gives rise to a variety of cells. Another is how gene expression is regulated.

Debasis Dash is an Indian computational biologist and chief scientist at the Institute of Genomics and Integrative Biology (IGIB). He is known for his research on proteomics, big data and artificial intelligence. His studies have been documented in a number of articles; ResearchGate, an online repository of scientific articles, has listed 120 of them. The Department of Biotechnology of the Government of India awarded him the National Bioscience Award for Career Development, one of the highest Indian science awards, for his contributions to biosciences, in 2014. He was appointed director of the Institute of Life Sciences, Bhubaneswar, on 18 May 2023.

References

  1. Patel, A. (2001). "Why genetic information processing could have a quantum basis". Journal of Biosciences. 26 (2): 145–151. arXiv:quant-ph/0105001. Bibcode:2001quant.ph..5001P. doi:10.1007/BF02703638. ISSN 0250-5991. PMID 11426050. S2CID 12348859.
  2. Bajic, Vladimir B; Wee, Tan Tin (2005). "Information Processing and Living Systems". Series on Advances in Bioinformatics and Computational Biology. 2. doi:10.1142/p391. ISBN 978-1-86094-563-2. ISSN 1751-6404.
  3. Wills, Peter R. (2016-03-13). "DNA as information". Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. 374 (2063): 20150417. Bibcode:2016RSPTA.37450417W. doi:10.1098/rsta.2015.0417. PMID 26857666.
  4. "Genome informatics - Latest research and news | Nature". www.nature.com. Retrieved 2020-04-20.
  5. "Genome Informatics (Virtual Conference)". Wellcome Genome Campus Advanced Courses and Scientific Conferences.
  6. "Genome Informatics | CSHL". meetings.cshl.edu. Retrieved 2020-04-20.
  7. "ePole of GenoInformatics". www.ijm.fr (in French). Retrieved 2020-04-21.
  8. Lambert, Christophe G.; Baker, Darrol J.; Patrinos, George P. (2 August 2018). Human Genome Informatics: Translating Genes into Health. London, United Kingdom. ISBN 978-0-12-813431-3. OCLC 1047959760.
  9. Bajic, Vladimir B; Wee, Tan Tin (2005). "Information Processing and Living Systems". Series on Advances in Bioinformatics and Computational Biology. 2. doi:10.1142/p391. ISBN 978-1-86094-563-2. ISSN 1751-6404.
  10. Santini, Cristina Costa; Bath, Jonathan; Turberfield, Andrew J.; Tyrrell, Andy M. (2012-04-23). "A DNA Network as an Information Processing System". International Journal of Molecular Sciences. 13 (4): 5125–5137. doi:10.3390/ijms13045125. ISSN 1422-0067. PMC 3344270. PMID 22606034.