Several computational methods use protein sequence and/or protein structure to predict protein aggregation. The table below shows the main features of software for predicting protein aggregation.
Method | Last Update | Access (Web server/downloadable) | Principle | Input (sequence / 3D structure) | Input (additional parameters) | Output |
---|---|---|---|---|---|---|
Amyloidogenic Pattern [1] | 2004 | Web Server - AMYLPRED2 | Secondary structure-related: submissions are scanned for the amyloidogenic pattern {P}-{PKRHW}-[VLSCWFNQE]-[ILTYWFNE]-[FIY]-{PKRH} at identity level, using a simple custom script. | sequence | - | Amyloidogenic regions |
Tango [2] [3] [4] | 2004 | Web Server - TANGO | Phenomenological: based on physico-chemical principles of secondary structure formation, extended by the assumption that the core regions of an aggregate are fully buried. | sequence | pH/ionic strength | Overall aggregation and amyloidogenic regions |
Average Packing Density [5] | 2006 | Web Server - AMYLPRED2 | Secondary structure-related: relates the average packing density of residues to the formation of amyloid fibrils. | sequence | - | Amyloidogenic regions |
Beta-strand contiguity [6] | 2007 | Web Server - AMYLPRED2 | Phenomenological: predicts a β-strand propensity score to locate β-strand-forming segments in the amyloid fibril. | sequence | - | Beta-strand formation |
Hexapeptide Conformational Energy / Pre-amyl [7] | 2007 | Web Server - AMYLPRED2 | Secondary structure-related: hexapeptides of a submitted protein are threaded onto over 2500 templates of the microcrystalline structure of NNQQNY; energy values below -27.00 are considered hits. | sequence | - | Amyloidogenic regions and energy |
AGGRESCAN [8] | 2007 | Web Servers - AMYLPRED2 & AGGRESCAN | Phenomenological: prediction of aggregation-prone segments in protein sequences, based on an aggregation propensity scale for natural amino acids derived from in vivo experiments. | sequence | - | Overall aggregation and amyloidogenic regions |
Salsa [9] | 2007 | Web Server - AMYPdb [10] | Phenomenological: prediction of the aggregation propensities of single or multiple sequences based on physicochemical properties. | sequence | hot spot length | Amyloidogenic regions |
Pafig [11] | 2009 | Web Server - AMYLPRED2 | Phenomenological: identification of hexapeptides associated with amyloid fibrillar aggregates. | sequence | - | Amyloidogenic regions |
Net-CSSP [12] [13] [14] [15] | 2020 | Web Server - Net-CSSP | Secondary structure-related: quantification of the influence of tertiary interactions on secondary structural preference. | sequence/pdb | single/dual network-threshold | Amyloidogenic propensity regions |
Betascan [16] | 2009 | Web Server - Betascan; Download - Betascan | Secondary structure-related: predicts the probability that particular portions of a protein will form amyloid. | sequence | length | Amyloidogenic regions |
FoldAmyloid [17] | 2010 | Web Server - FoldAmyloid | Secondary structure-related: prediction of amyloid regions using the expected probability of hydrogen bond formation and the packing densities of residues. | sequence | scale, threshold, averaging frame | Amyloidogenic regions |
Waltz [18] [19] | 2010 | Web Server - Waltz | Secondary structure-related: application of position-specific scoring matrices (PSSM) obtained from amyloidogenic peptides. | sequence | pH, specificity, sensitivity | Amyloidogenic regions |
Zipper DB [20] [21] [22] [23] | 2010 | Web Server - Zipper DB | Secondary structure-related: structure-based prediction of fibrillation propensities, using the crystal structure of the fibril-forming peptide NNQQNY from the Sup35 prion protein of Saccharomyces cerevisiae. | sequence | - | Amyloidogenic regions, energy, and beta-sheet conformation |
STITCHER [24] | 2012 | Web Server - Stitcher (currently offline) | Secondary structure-related | sequence | - | Amyloidogenic regions |
MetAmyl [25] [26] [27] [28] | 2013 | Web Server - MetAmyl | Consensus method: amyloidogenic patterns, average packing density, beta-strand contiguity, Pafig, Net-CSSP, STITCHER | sequence | threshold | Overall generic and amyloidogenic regions based on the consensus |
AmylPred2 [29] | 2013 | Web Server - AMYLPRED2 | Consensus method: amyloidogenic patterns, average packing density, beta-strand contiguity, Pafig, Net-CSSP, STITCHER | sequence | - | Overall generic and amyloidogenic regions based on the consensus |
PASTA 2.0 [30] | 2014 | Web Server - PASTA 2.0 | Secondary structure-related: predicts the most aggregation-prone portions and the corresponding β-strand intermolecular pairing for multiple input sequences. | sequence | top pairings and energies, mutations and protein-protein | Amyloidogenic regions, energy, and beta-sheet orientation in aggregates |
FISH Amyloid [31] | 2014 | Web Server - Comprec (currently offline) | Secondary structure-related | sequence | threshold | Amyloidogenic regions |
GAP [32] [33] [34] [35] | 2014 | Web Server - GAP | Secondary structure-related: identification of amyloid-forming peptides and amorphous peptides using a dataset of 139 amyloids and 168 amorphous peptides. | sequence | - | Overall aggregation and amyloidogenic regions |
APPNN [36] | 2015 | Download - CRAN | Phenomenological: amyloidogenicity propensity predictor based on a machine learning approach using recursive feature selection and feed-forward neural networks, taking advantage of newly published sequences with experimental in vitro evidence of amyloid formation. | sequence | - | Amyloidogenic regions |
ArchCandy [37] | 2015 | Download - BiSMM | Secondary structure-related: based on the assumption that protein sequences able to form β-arcades are amyloidogenic. | sequence | - | Amyloidogenic regions |
Amyload [38] | 2015 | Web Server - Comprec (currently offline) | Consensus method | sequence | - | Overall generic and amyloidogenic regions |
SolubiS [39] [40] | 2016 | Web Server - SolubiS | 3D structure | pdb file | chain, threshold, gatekeeper | Aggregation propensity and stability vs mutations |
CamSol Structurally Corrected [41] [42] | 2017 | Web Server - Chemistry of Health | 3D structure | pdb file | pH, patch radius | Exposed aggregation-prone patches and mutated variants design |
CamSol intrinsic [43] [44] | 2017 | Web Server - Chemistry of Health | Phenomenological: sequence-based method of predicting protein solubility and generic aggregation propensity. | sequence | pH | Overall intrinsic solubility score and solubility profile |
AmyloGram [45] | 2017 | Web Server - AmyloGram | Phenomenological: predicts amyloid proteins using n-gram encoding and random forests. | sequence | - | Overall aggregation and amyloidogenic regions |
BetaSerpentine [46] | 2017 | Web Server - BetaSerpentine-1.0 | Sequence-related: reconstruction of amyloid structures containing adjacent β-arches. | sequence | - | Amyloidogenic regions |
AggScore [47] | 2018 | AggScore is available through Schrödinger's BioLuminate Suite as of software release 2018-1. | Secondary structure-related: uses the distribution of hydrophobic and electrostatic patches on the protein surface, factoring the intensity and relative orientation of the respective patches into an aggregation propensity function trained on a benchmark set of 31 adnectin proteins. | sequence | - | Amyloidogenic regions |
AggreRATE-Pred [48] | 2018 | Web Server - AggreRATE-Pred | Secondary structure-related: predicts changes in aggregation rate upon point mutations. | sequence/pdb | mutations | |
AGGRESCAN 3D 2.0 [49] [50] [51] [52] [53] | 2019 | Web Server - Aggrescan3D | 3D structure | pdb file | dynamic mode, mutations, patch radius, stability, enhance solubility | Dynamic exposed aggregation-prone patches and mutated variants design |
Budapest amyloid predictor [54] | 2021 | Web Server - Budapest amyloid predictor | Hexapeptide | sequence | - | Amyloidogenicity of hexapeptide |
ANuPP [55] | 2021 | Web Server - ANuPP | Hexapeptide and sequence: identification of amyloid-fibril-forming peptides and regions in protein sequences. | sequence | - | Amyloidogenic hexapeptides and aggregation-prone regions |
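
The amyloidogenic pattern listed in the first row of the table is written in PROSITE-style syntax, where braces exclude residues and square brackets list the allowed residues at a position. As a minimal sketch, and not the AMYLPRED2 script itself, the scan can be approximated with a regular expression in Python. The function name `find_amyloidogenic_hits` and the 1-based, inclusive coordinates are assumptions made for illustration.

```python
import re

# Regex translation of {P}-{PKRHW}-[VLSCWFNQE]-[ILTYWFNE]-[FIY]-{PKRH}.
# {..} becomes a negated class, [..] stays a positive class.
# The lookahead wrapper lets overlapping hexapeptides be reported.
AMYLOID_PATTERN = re.compile(
    r"(?=([^P][^PKRHW][VLSCWFNQE][ILTYWFNE][FIY][^PKRH]))"
)


def find_amyloidogenic_hits(sequence):
    """Return (start, end, hexapeptide) for every match.

    Positions are 1-based and inclusive (an illustrative convention).
    The input is assumed to be a plain one-letter amino acid string.
    """
    seq = sequence.upper()
    return [
        (m.start() + 1, m.start() + 6, m.group(1))
        for m in AMYLOID_PATTERN.finditer(seq)
    ]


if __name__ == "__main__":
    # Toy sequence, chosen only to exercise the pattern.
    for start, end, hexapeptide in find_amyloidogenic_hits("GSAAVIFAGS"):
        print(start, end, hexapeptide)
```

Because the negated classes accept any character outside the excluded set, a real pipeline would probably validate the sequence alphabet before scanning.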
The beta sheet (β-sheet) is a common motif of regular protein secondary structure. Beta sheets consist of beta strands (β-strands) connected laterally by at least two or three backbone hydrogen bonds, forming a generally twisted, pleated sheet. A β-strand is a stretch of polypeptide chain, typically 3 to 10 amino acids long, with its backbone in an extended conformation. The supramolecular association of β-sheets has been implicated in the formation of the fibrils and protein aggregates observed in amyloidosis, Alzheimer's disease and other proteinopathies.
Amyloids are aggregates of proteins characterised by a fibrillar morphology of typically 7–13 nm in diameter, a β-sheet secondary structure and the ability to be stained by particular dyes, such as Congo red. In the human body, amyloids have been linked to the development of various diseases. Pathogenic amyloids form when previously healthy proteins lose their normal structure and physiological functions (misfolding) and form fibrous deposits within and around cells. These protein misfolding and deposition processes disrupt the healthy function of tissues and organs.
Amylin, or islet amyloid polypeptide (IAPP), is a 37-residue peptide hormone. It is co-secreted with insulin from the pancreatic β-cells in the ratio of approximately 100:1 (insulin:amylin). Amylin plays a role in glycemic regulation by slowing gastric emptying and promoting satiety, thereby preventing post-prandial spikes in blood glucose levels.
Amyloid beta (Aβ) denotes peptides of 36–43 amino acids that are the main component of the amyloid plaques found in the brains of people with Alzheimer's disease. The peptides derive from the amyloid-beta precursor protein (APP), which is cleaved by beta secretase and gamma secretase to yield Aβ in a cholesterol-dependent process involving substrate presentation. Aβ molecules can aggregate to form flexible soluble oligomers, which may exist in several forms. It is now believed that certain misfolded oligomers can induce other Aβ molecules to take the misfolded oligomeric form as well, leading to a chain reaction akin to a prion infection. The oligomers are toxic to nerve cells. The other protein implicated in Alzheimer's disease, tau protein, also forms such prion-like misfolded oligomers, and there is some evidence that misfolded Aβ can induce tau to misfold.
In academia, computational immunology is a field of science that encompasses high-throughput genomic and bioinformatics approaches to immunology. The field's main aim is to convert immunological data into computational problems, solve these problems using mathematical and computational approaches and then convert these results into immunologically meaningful interpretations.
Alpha sheet is an atypical secondary structure in proteins, first proposed by Linus Pauling and Robert Corey in 1951. The hydrogen bonding pattern in an alpha sheet is similar to that of a beta sheet, but the orientation of the carbonyl and amino groups in the peptide bond units is distinctive; in a single strand, all the carbonyl groups are oriented in the same direction on one side of the pleat, and all the amino groups are oriented in the same direction on the opposite side of the sheet. Thus the alpha sheet accumulates an inherent separation of electrostatic charge, with one edge of the sheet exposing negatively charged carbonyl groups and the opposite edge exposing positively charged amino groups. Unlike the alpha helix and beta sheet, the alpha sheet configuration does not require all component amino acid residues to lie within a single region of dihedral angles; instead, the alpha sheet contains residues of alternating dihedrals in the traditional right-handed (αR) and left-handed (αL) helical regions of Ramachandran space. Although the alpha sheet is only rarely observed in natural protein structures, it has been speculated to play a role in amyloid disease and it was found to be a stable form for amyloidogenic proteins in molecular dynamics simulations. Alpha sheets have also been observed in X-ray crystallography structures of designed peptides.
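The alternation between αR and αL dihedrals described above can be illustrated with a short Python sketch. It labels each residue by its (phi, psi) angles and reports the longest stretch in which consecutive residues switch between the two helical regions. The rectangular region boundaries and the function names are rough assumptions for illustration only, not a standard definition of the Ramachandran regions or a published alpha-sheet detection method.

```python
def helical_region(phi, psi):
    """Classify a (phi, psi) pair in degrees as 'aR', 'aL', or None.

    The rectangular boundaries are illustrative assumptions.
    """
    if -160 <= phi <= -20 and -120 <= psi <= 50:
        return "aR"  # right-handed helical region
    if 20 <= phi <= 160 and -50 <= psi <= 120:
        return "aL"  # left-handed helical region
    return None


def longest_alternating_run(dihedrals):
    """Length of the longest run of residues alternating between aR and aL."""
    best = 0
    run = 0
    prev = None
    for phi, psi in dihedrals:
        region = helical_region(phi, psi)
        if region is None:
            run = 0   # residue outside both helical regions breaks the run
        elif prev is not None and region != prev:
            run += 1  # alternation continues
        else:
            run = 1   # first helical residue, or same region repeated
        prev = region
        best = max(best, run)
    return best


if __name__ == "__main__":
    # Hypothetical backbone angles, chosen only to exercise the function.
    strand = [(-60, -45), (60, 45), (-60, -45), (60, 45)]
    print(longest_alternating_run(strand))
```

A regular alpha helix, in which every residue sits in the αR region, never produces a run longer than one in this sketch, which is the distinction the paragraph draws.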
Structural and physical properties of DNA provide important constraints on the binding sites formed on surfaces of DNA-binding proteins. Characteristics of such binding sites may be used for predicting DNA-binding sites from the structural and even sequence properties of unbound proteins. This approach has been successfully implemented for predicting the protein–protein interface. Here, this approach is adopted for predicting DNA-binding sites in DNA-binding proteins. The first attempt to use sequence and evolutionary features to predict DNA-binding sites in proteins was made by Ahmad et al. (2004) and Ahmad and Sarai (2005). Some methods use structural information to predict DNA-binding sites and therefore require a three-dimensional structure of the protein, while others use only sequence information and do not require a protein structure to make a prediction.
Anders Krogh is a bioinformatician at the University of Copenhagen, where he leads the university's bioinformatics center. He is known for his pioneering work on the use of hidden Markov models in bioinformatics, and is co-author of a widely used textbook in bioinformatics. He also co-authored one of the early textbooks on neural networks. His current research interests include promoter analysis, non-coding RNA, gene prediction and protein structure prediction.
Computational Resources for Drug Discovery (CRDD) is one of the important in silico modules of Open Source for Drug Discovery (OSDD). The CRDD web portal provides computer resources related to drug discovery on a single platform. It provides computational resources for researchers in computer-aided drug design, a discussion forum, and resources to maintain a wiki related to drug discovery, predict inhibitors, and predict the ADME-Tox properties of molecules. One of the major objectives of CRDD is to promote open source software in the field of chemoinformatics and pharmacoinformatics.
David Tudor Jones is a Professor of Bioinformatics and Head of the Bioinformatics Group at University College London. He is also the director of the Bloomsbury Centre for Bioinformatics, a joint research centre between UCL and Birkbeck, University of London, which also provides bioinformatics training and support services to biomedical researchers. As of 2013, he was a member of the editorial boards of PLoS ONE, BioData Mining, Advanced Bioinformatics, Chemical Biology & Drug Design, and Proteins: Structure, Function, and Bioinformatics.
Molecular recognition features (MoRFs) are small intrinsically disordered regions in proteins that undergo a disorder-to-order transition upon binding to their partners. MoRFs are implicated in protein-protein interactions, which serve as the initial step in molecular recognition. MoRFs are disordered prior to binding to their partners, whereas they form a common 3D structure after interacting with their partners. As MoRF regions tend to resemble disordered proteins with some characteristics of ordered proteins, they can be classified as existing in an extended semi-disordered state.
p3 peptide, also known as amyloid β-peptide (Aβ)17–40/42, is the peptide resulting from α- and γ-secretase cleavage of the amyloid precursor protein (APP). It is known to be the major constituent of diffuse plaques observed in Alzheimer's disease (AD) brains and of pre-amyloid plaques in people affected by Down syndrome. However, the role of p3 peptide in these diseases is not yet well understood.
DIMPL is a bioinformatic pipeline that enables the extraction and selection of bacterial GC-rich intergenic regions (IGRs) that are enriched for structured non-coding RNAs (ncRNAs). The method of enriching bacterial IGRs for ncRNA motif discovery was first reported in the study "Genome-wide discovery of structured noncoding RNAs in bacteria".