Developer(s) | University of Washington, Center for Game Science, [1] Department of Biochemistry [2] |
---|---|
Initial release | May 8, 2008 |
Preview release | |
Operating system | Cross-platform: Windows, macOS, Linux |
Size | ≈434 MB |
Available in | 13 languages |
List of languages | Czech, Dutch, English, French, German, Hebrew, Indonesian, Italian, Polish, Romanian, Russian, Spanish, Scientist |
Type | Puzzle video game, protein folding |
License | Proprietary freeware for academic and non-profit use |
Website | fold |
Foldit is an online puzzle video game about protein folding. It is part of an experimental research project developed by the University of Washington, Center for Game Science, in collaboration with the UW Department of Biochemistry. The objective of Foldit is to fold the structures of selected proteins as perfectly as possible, using tools provided in the game. The highest scoring solutions are analyzed by researchers, who determine whether or not there is a native structural configuration (native state) that can be applied to relevant proteins in the real world. Scientists can then use these solutions to target and eradicate diseases and create biological innovations. A 2010 paper in the science journal Nature credited Foldit's 57,000 players with providing useful results that matched or outperformed algorithmically computed solutions. [3] [4]
Prof. David Baker, a protein research scientist at the University of Washington, founded the Foldit project. Seth Cooper was the lead game designer. Before starting the project, Baker and his laboratory coworkers relied on another research project named Rosetta [5] to predict the native structures of various proteins using special computer protein structure prediction algorithms. Rosetta was eventually extended to use the power of distributed computing: The Rosetta@home program was made available for public download, and displayed its protein-folding progress as a screensaver. Its results were sent to a central server for verification. [6]
Some Rosetta@home users became frustrated when they saw ways to solve protein structures, but could not interact with the program. Hoping that humans could improve the computers' attempts to solve protein structures, Baker approached David Salesin and Zoran Popović, computer science professors at the same university, to help conceptualize and build an interactive program, a video game, that would appeal to the public and help efforts to find native protein structures. [7] [8] [9]
Many of the same people who created Rosetta@home worked on Foldit. The public beta version was released in May 2008, [10] and the game has since registered 240,000 players. [11]
Since 2008, Foldit has participated in Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments, submitting its best solutions to targets based on unknown protein structures. CASP is an international program to assess methods of protein structure prediction and identify those that are most productive.
Protein structure prediction is important in several fields of science, including bioinformatics, molecular biology, and medicine. Identifying natural proteins' structural configurations enables scientists to understand them better. This can lead to creating novel proteins by design, advances in treating disease, and solutions for other real-world problems such as invasive species, waste, and pollution.
The process by which living beings create the primary structure of proteins, protein biosynthesis, is reasonably well understood, as is the means by which proteins are encoded as DNA. However, determining how a given protein's primary structure becomes a functioning three-dimensional structure, how the molecule folds, is more difficult. The general process is understood, but predicting a protein's eventual, functioning structure is computationally demanding. [12] [13]
Like Rosetta@home, Foldit is a means to discover native protein structures faster through distributed computing. However, Foldit places greater emphasis on community collaboration through its forums, where users can work together on particular folds. [3] Its crowdsourced approach also gives the individual user a more active role. [6] By combining virtual interaction with gamification, Foldit is intended to advance protein folding research.
Foldit attempts to apply the human brain's three-dimensional pattern matching and spatial reasoning abilities to help solve the problem of protein structure prediction. As of 2016, puzzles are based on well-understood proteins. By analysing how humans intuitively approach these puzzles, researchers hope to improve the algorithms used by protein-folding software. [14]
Foldit includes a series of tutorials where users manipulate simple protein-like structures and a periodically updated set of puzzles based on real proteins. It shows a graphical representation of each protein which users can manipulate using a set of tools.
Foldit's developers wanted to attract as many people as possible to the cause of protein folding. So, rather than only building a useful science tool, they used gamification (the inclusion of gaming elements) to make Foldit appealing and engaging to the general public.
As a protein structure is modified, a score is calculated based on how well-folded the protein is, and a list of high scores for each puzzle is maintained. Foldit users may create and join groups, and members of groups can share puzzle solutions. Groups have been found to be useful in training new players. A separate list of group high scores is maintained, as well as two leaderboards for groups and individuals. [15]
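The bookkeeping described above, with per-puzzle high scores for individuals and for groups, can be sketched as follows. This is only an illustration, not Foldit's actual implementation: the class `PuzzleLeaderboard`, its methods, the player and group names, and the score values are all hypothetical, and the sketch assumes only that a higher score means a better fold.

```python
class PuzzleLeaderboard:
    """Hypothetical high-score tracker for a single puzzle."""

    def __init__(self):
        self.player_best = {}  # player name -> best score so far
        self.group_best = {}   # group name -> best score by any member

    def submit(self, player, score, group=None):
        """Record a score, keeping only the best per player and per group."""
        if score > self.player_best.get(player, float("-inf")):
            self.player_best[player] = score
        if group is not None and score > self.group_best.get(group, float("-inf")):
            self.group_best[group] = score

    @staticmethod
    def _ranked(best, n):
        # Sort by score, highest first, and keep the top n entries.
        return sorted(best.items(), key=lambda item: item[1], reverse=True)[:n]

    def top_players(self, n=10):
        return self._ranked(self.player_best, n)

    def top_groups(self, n=10):
        return self._ranked(self.group_best, n)


# Made-up submissions for illustration only.
board = PuzzleLeaderboard()
board.submit("alice", 9100.5, group="team_a")
board.submit("bob", 9250.0, group="team_a")
board.submit("carol", 8800.0, group="team_b")
board.submit("alice", 9000.0, group="team_a")  # lower than her best, so ignored

players = board.top_players()
groups = board.top_groups()
```

Keeping the individual and group tables separate mirrors the two leaderboards mentioned in the text: a group's entry reflects the best solution found by any of its members.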
Results from Foldit have been included in a number of scientific publications.
Foldit players have been cited collectively as "Foldit players" or "Players, F." in some cases. Individual players have also been listed as authors on at least one paper, and on four related Protein Data Bank depositions.
Foldit's toolbox is mainly for the design of protein molecules. The game's creator announced the plan to add, by 2013, the chemical building blocks of organic subcomponents to enable players to design small molecules. [25] The small molecule design system termed Drugit was tested on the Von Hippel-Lindau tumor suppressor (VHL). Results of the VHL experiment were presented in a March 2023 preprint paper [26] and at an August 2023 American Chemical Society conference session. [27]
Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, chemistry, physics, computer science, computer programming, information engineering, mathematics and statistics to analyze and interpret biological data. The subsequent process of analyzing and interpreting data is referred to as computational biology.
Protein folding is the physical process by which a protein, after synthesis by a ribosome as a linear chain of amino acids, changes from an unstable random coil into a more ordered three-dimensional structure. This structure permits the protein to become biologically functional.
Protein structure prediction is the inference of the three-dimensional structure of a protein from its amino acid sequence—that is, the prediction of its secondary and tertiary structure from primary structure. Structure prediction is different from the inverse problem of protein design. Protein structure prediction is one of the most important goals pursued by computational biology; it is important in medicine and biotechnology.
Top7 is an artificial protein, classified as a de novo protein. This means that the protein itself was designed to have a specific structure and functional properties.
The Human Proteome Folding Project (HPF) is a collaborative effort between New York University, the Institute for Systems Biology (ISB) and the University of Washington, using the Rosetta software developed by the Rosetta Commons. The project is managed by the Bonneau lab.
Protein design is the rational design of new protein molecules to achieve novel activity, behavior, or purpose, and to advance basic understanding of protein function. Proteins can be designed from scratch or by making calculated variants of a known protein structure and its sequence. Rational protein design approaches predict protein sequences that will fold to specific structures. These predicted sequences can then be validated experimentally through methods such as peptide synthesis, site-directed mutagenesis, or artificial gene synthesis.
In molecular biology, protein threading, also known as fold recognition, is a method of protein modeling which is used to model those proteins which have the same fold as proteins of known structures, but do not have homologous proteins with known structure. It differs from the homology modeling method of structure prediction as it is used for proteins which do not have their homologous protein structures deposited in the Protein Data Bank (PDB), whereas homology modeling is used for those proteins which do. Threading works by using statistical knowledge of the relationship between the structures deposited in the PDB and the sequence of the protein which one wishes to model.
Rosetta@home is a volunteer computing project researching protein structure prediction on the Berkeley Open Infrastructure for Network Computing (BOINC) platform, run by the Baker lab. Rosetta@home aims to predict protein–protein docking and design new proteins with the help of about fifty-five thousand active volunteered computers processing at over 487,946 GigaFLOPS on average as of September 19, 2020. Foldit, a Rosetta@home videogame, aims to reach these goals with a crowdsourcing approach. Though much of the project is oriented toward basic research to improve the accuracy and robustness of proteomics methods, Rosetta@home also does applied research on malaria, Alzheimer's disease, and other pathologies.
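As a rough worked example, the two figures quoted above imply an average throughput per volunteered computer. The calculation below treats "about fifty-five thousand" as exactly 55,000 and "over 487,946 GigaFLOPS" as exactly that value, so the result is only an order-of-magnitude estimate; the variable names are illustrative.

```python
# Figures quoted in the text (as of September 19, 2020), taken at face value.
active_computers = 55_000      # "about fifty-five thousand"
total_gigaflops = 487_946      # "over 487,946 GigaFLOPS"

# Average throughput contributed by one volunteered computer.
per_computer_gigaflops = total_gigaflops / active_computers

print(f"{per_computer_gigaflops:.2f} GigaFLOPS per computer on average")
```

Under these assumptions each computer contributes on the order of 9 GigaFLOPS.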
The TIM barrel, also known as an alpha/beta barrel, is a conserved protein fold consisting of eight alpha helices (α-helices) and eight parallel beta strands (β-strands) that alternate along the peptide backbone. The structure is named after triose-phosphate isomerase, a conserved metabolic enzyme. TIM barrels are ubiquitous, with approximately 10% of all enzymes adopting this fold. Further, five of seven enzyme commission (EC) enzyme classes include TIM barrel proteins. The TIM barrel fold is evolutionarily ancient, with many of its members possessing little similarity today, instead falling within the twilight zone of sequence similarity.
David Baker is an American biochemist and computational biologist who has pioneered methods to predict and design the three-dimensional structures of proteins. He is the Henrietta and Aubrey Davis Endowed Professor in Biochemistry and an adjunct professor of genome sciences, bioengineering, chemical engineering, computer science, and physics at the University of Washington. He serves as the director of the Rosetta Commons, a consortium of labs and researchers that develop biomolecular structure prediction and design software. The problem of protein structure prediction, to which Baker has contributed significantly, has now been largely solved by DeepMind using artificial intelligence. Baker is a Howard Hughes Medical Institute investigator and a member of the United States National Academy of Sciences. He is also the director of the University of Washington's Institute for Protein Design.
In computational biology, de novo protein structure prediction refers to an algorithmic process by which protein tertiary structure is predicted from its amino acid primary sequence. The problem itself has occupied leading scientists for decades while still remaining unsolved. According to Science, the problem remains one of the top 125 outstanding issues in modern science. At present, some of the most successful methods have a reasonable probability of predicting the folds of small, single-domain proteins within 1.5 angstroms over the entire structure.
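An accuracy figure such as "within 1.5 angstroms" is usually read as a root-mean-square deviation (RMSD) between predicted and reference atom positions; the text does not name the metric, so that reading is an assumption here. The sketch below computes RMSD for two structures that are assumed to be already superimposed. Real comparisons first find the optimal superposition, which is omitted. The coordinates are invented for illustration.

```python
import math


def rmsd(predicted, reference):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (px, py, pz), (rx, ry, rz) in zip(predicted, reference):
        squared_sum += (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
    return math.sqrt(squared_sum / len(predicted))


# Hypothetical, pre-aligned atom coordinates in angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
predicted = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0), (7.6, -1.0, 0.0)]

deviation = rmsd(predicted, reference)   # each atom is off by exactly 1 angstrom
within_threshold = deviation <= 1.5      # threshold taken from the text
```

Because every atom in this toy example is displaced by one angstrom, the RMSD is one angstrom and the prediction falls inside the quoted 1.5 angstrom threshold.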
JADE1 is a protein that in humans is encoded by the JADE1 gene.
Eterna is a browser-based "game with a purpose", developed by scientists at Carnegie Mellon University and Stanford University, that engages users to solve puzzles related to the folding of RNA molecules. The project is supported by the Bill and Melinda Gates Foundation, Stanford University, and the National Institutes of Health. Prior funders include the National Science Foundation.
CS-ROSETTA is a framework for structure calculation of biological macromolecules on the basis of conformational information from NMR, which is built on top of the biomolecular modeling and design software called ROSETTA. The name CS-ROSETTA for this branch of ROSETTA stems from its origin in combining NMR chemical shift (CS) data with ROSETTA structure prediction protocols. The software package was later extended to include additional NMR conformational parameters, such as Residual Dipolar Couplings (RDC), NOE distance restraints, pseudocontact chemical shifts (PCS) and restraints derived from homologous proteins. This software can be used together with other molecular modeling protocols, such as docking to model protein oligomers. In addition, CS-ROSETTA can be combined with chemical shift resonance assignment algorithms to create a fully automated NMR structure determination pipeline. The CS-ROSETTA software is freely available for academic use and can be licensed for commercial use. A software manual and tutorials are provided on the supporting website https://csrosetta.chemistry.ucsc.edu/.
CS23D is a web server to generate 3D structural models from NMR chemical shifts. CS23D combines maximal fragment assembly with chemical shift threading, de novo structure generation, chemical shift-based torsion angle prediction, and chemical shift refinement. CS23D makes use of RefDB and ShiftX.
I-TASSER is a bioinformatics method for predicting three-dimensional structure models of protein molecules from amino acid sequences. It detects structure templates from the Protein Data Bank by a technique called fold recognition. The full-length structure models are constructed by reassembling structural fragments from threading templates using replica exchange Monte Carlo simulations. I-TASSER is one of the most successful protein structure prediction methods in the community-wide CASP experiments.
A neutral network is a set of genes all related by point mutations that have equivalent function or fitness. Each node represents a gene sequence and each line represents the mutation connecting two sequences. Neutral networks can be thought of as high, flat plateaus in a fitness landscape. During neutral evolution, genes can randomly move through neutral networks and traverse regions of sequence space which may have consequences for robustness and evolvability.
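The definition above can be made concrete with a toy model. The sketch below starts from one sequence and follows single point mutations, keeping only neighbors with the same fitness, which yields the nodes and edges of that sequence's neutral network. The two-letter alphabet, the `toy_fitness` rule, and the function names are all hypothetical simplifications; real gene sequences and fitness measurements are far richer.

```python
def toy_fitness(seq):
    """Hypothetical rule: a sequence is 'functional' (1) if it has at least two 'A's."""
    return 1 if seq.count("A") >= 2 else 0


def neutral_network(start, alphabet, fitness):
    """Collect sequences reachable from start by fitness-preserving point mutations.

    Returns (nodes, edges): nodes is a set of sequences, edges is a set of
    frozensets, each holding two sequences that differ by one point mutation.
    """
    target = fitness(start)
    nodes = {start}
    edges = set()
    frontier = [start]
    while frontier:
        seq = frontier.pop()
        for i in range(len(seq)):
            for letter in alphabet:
                if letter == seq[i]:
                    continue  # not a mutation
                neighbor = seq[:i] + letter + seq[i + 1:]
                if fitness(neighbor) != target:
                    continue  # mutation changes fitness, so it leaves the network
                edges.add(frozenset((seq, neighbor)))
                if neighbor not in nodes:
                    nodes.add(neighbor)
                    frontier.append(neighbor)
    return nodes, edges


nodes, edges = neutral_network("AAA", "AB", toy_fitness)
```

In this toy case the network contains four sequences (`AAA`, `AAB`, `ABA`, `BAA`) joined by three edges, all through `AAA`: moving between, say, `AAB` and `ABA` takes two mutations, so the neutral path between them runs through `AAA`.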
AlphaFold is an artificial intelligence (AI) program developed by DeepMind, a subsidiary of Alphabet, which performs predictions of protein structure. The program is designed as a deep learning system.
John Michael Jumper is an American senior research scientist at DeepMind Technologies. Jumper and his colleagues created AlphaFold, an artificial intelligence (AI) model to predict protein structures from their amino acid sequence with high accuracy. Jumper has stated that the AlphaFold team plans to release 100 million protein structures. The scientific journal Nature included Jumper as one of the ten "people who mattered" in science in their annual listing of Nature's 10 in 2021.