Nutritious Rice for the World

Developer(s) Samudrala Computational Biology Group, [1] University of Washington
Initial release 2008-05-12
Platform Cross-platform
Type Distributed computing
Website protinfo.compbio.washington.edu/rice

Nutritious Rice for the World [2] is a World Community Grid research project in the field of agronomy led by the Samudrala Computational Biology Research Group [1] at the University of Washington. It was launched on May 12, 2008. [3] [4] The objective of the project is to predict the structures of the proteins of major strains of rice. The intent is to help farmers breed better rice strains with higher crop yields, greater disease and pest resistance, and a full range of bioavailable nutrients that can benefit people around the world, especially in regions where malnutrition is a critical concern.

Determining the structure of proteins is an extremely difficult and expensive process. Although a protein's structure can be predicted computationally from its corresponding DNA sequence, rice contains thousands of distinct proteins. This presents a computational challenge that a single computer cannot solve within a reasonable timeframe. [5]

Once the entire rice genome had been sequenced, the effort shifted to identifying genes that are involved in increased yield, disease resistance, and nutritional value. This problem is made more difficult because very few cereal plants have been sequenced, and therefore many of the rice genes do not resemble any gene of known function. The Computational Biology Research Group at the University of Washington developed the Protinfo [6] software, which can predict protein structures at a fraction of the cost and time.

Protinfo is being used to create three-dimensional models of the tens of thousands of rice proteins. These models are then used to predict the function of each protein and to understand the role of the gene that encodes it. The models, and any analysis resulting from examining them, will be housed at the Bioverse database and webserver, which is a comprehensive framework to relate molecules such as proteins and DNA to an organism's pathways and systems.

Volunteers' computers on World Community Grid will run the Protinfo software to create models of all proteins encoded by the rice genome whose structure can be predicted reliably. These models will be analyzed to choose the best ones. From the resulting structures, prediction tools will determine the function of each protein and the role of the gene that encodes it. Using the power of Protinfo, World Community Grid will initially examine over 10,000 genes, and produce 100,000 models per gene.
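The workflow described above, in which many models are generated per gene on volunteer machines and the best ones are then chosen, can be illustrated with a minimal sketch. Everything named here is hypothetical: the `WorkUnit` type, the batch size of 2,500, the gene identifiers, and the convention that a lower score means a better model are assumptions made for illustration, not details of Protinfo or World Community Grid.

```python
import heapq
from typing import Iterator, NamedTuple


class WorkUnit(NamedTuple):
    gene_id: str      # hypothetical identifier for one rice gene
    first_model: int  # index of the first model in this batch
    count: int        # number of models to generate in this batch


def make_work_units(gene_ids: list[str], models_per_gene: int,
                    batch_size: int) -> Iterator[WorkUnit]:
    """Split each gene's model quota into batches one volunteer machine could run."""
    for gene_id in gene_ids:
        for start in range(0, models_per_gene, batch_size):
            yield WorkUnit(gene_id, start, min(batch_size, models_per_gene - start))


def select_best(scored_models: list[tuple[float, int]],
                keep: int) -> list[tuple[float, int]]:
    """Return the `keep` lowest-scoring (score, model_index) pairs.

    Assumes a lower score means a better model; the real selection
    criteria are not described in the text.
    """
    return heapq.nsmallest(keep, scored_models)


# Two placeholder genes, 100,000 models each (the per-gene figure from the text).
units = list(make_work_units(["gene_a", "gene_b"],
                             models_per_gene=100_000, batch_size=2_500))
print(len(units), "work units covering", sum(u.count for u in units), "models")
```

Splitting the quota into independent batches is what makes the task suit volunteer computing: each batch can run on a different machine, and only the scored results need to be gathered for the selection step.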

Eventually, the structures of 30,000 to 60,000 proteins will be studied. [7] Generating one billion models on the 320-CPU cluster at the Computational Biology Research Group was anticipated to take about 30 years; using World Community Grid, working at 167 TFLOPS, it took only about two years. [8] The distributed computing phase was suspended in April 2010 while in-house analysis of the results continues. Distributed computing will resume when funding is secured for further phases.
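The figures quoted in the text can be checked with a short back-of-the-envelope calculation. The sketch below uses only numbers stated in the article (10,000 genes, 100,000 models per gene, a 320-CPU cluster, 30 years versus two years); the assumptions are that the cluster would have been fully utilized, that the work scales perfectly across CPUs, and that a year is 365.25 days.

```python
# Figures taken from the article text.
GENES = 10_000
MODELS_PER_GENE = 100_000
CLUSTER_CPUS = 320
CLUSTER_YEARS = 30   # anticipated time on the in-house cluster
GRID_YEARS = 2       # approximate time on World Community Grid

# Assumption: a year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

total_models = GENES * MODELS_PER_GENE
cpu_years = CLUSTER_CPUS * CLUSTER_YEARS
# Assumption: perfect scaling, so CPU time divides evenly over the models.
cpu_seconds_per_model = cpu_years * SECONDS_PER_YEAR / total_models
speedup = CLUSTER_YEARS / GRID_YEARS
# Number of cluster-class CPUs that would finish in GRID_YEARS.
equivalent_cpus = CLUSTER_CPUS * speedup

print(f"total models:          {total_models:,}")
print(f"cluster CPU-years:     {cpu_years:,}")
print(f"CPU-seconds per model: {cpu_seconds_per_model:.0f}")
print(f"speedup on the grid:   {speedup:.0f}x")
print(f"equivalent CPUs:       {equivalent_cpus:,.0f}")
```

Under these assumptions, 10,000 genes at 100,000 models each gives the one billion models mentioned in the text, the cluster estimate corresponds to 9,600 CPU-years or roughly five CPU-minutes per model, and the grid delivered about a 15-fold speedup over the cluster.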

The resulting knowledge base will hopefully lead to the development of improved hybrids of rice strains with higher yield, greater disease and pest resistance, and a full range of bioavailable nutrients. This knowledge can also be extended to other food crops such as wheat and maize. [9]

System Requirements

The project has minimum system requirements [10] that a computer performing calculations for the project must meet.

References

  1. Samudrala Computational Biology Research Group. Archived June 14, 2008, at the Wayback Machine.
  2. Nutritious Rice for the World web site
  3. Nutritious Rice for the World [dead link]
  4. Research Projects Page on World Community Grid
  5. Project Page on World Community Grid Website
  6. Protinfo
  7. "Using Your Computer to Grow More Nutritious Rice for a Hungry World". NSF.
  8. "IBM World Community Grid 'Supercomputer' to Tackle Rice Crisis".
  9. About The Project page on WCG
  10. Nutritious Rice for the World FAQ