PEAKS

PEAKS
Developer(s): Bioinformatics Solutions Inc
Stable release: PEAKS 8.5 / October 12, 2017
Operating system: Windows
Type: Mass spectrometry protein identification, de novo sequencing, database search and quantification
License: Proprietary commercial software
Website: http://www.bioinfor.com/

PEAKS is a proteomics software program for tandem mass spectrometry designed for peptide sequencing, protein identification and quantification.

Description

PEAKS is commonly used for peptide identification (Protein ID) through database searching assisted by de novo peptide sequencing.[1] PEAKS also integrates PTM and mutation characterization through automatic peptide-sequence-tag-based searching (SPIDER)[2] and PTM identification.[3]

PEAKS reports, among other information, a complete sequence for each peptide, confidence scores on individual amino acid assignments, and simple reports for high-throughput analysis.

The software can compare the results of multiple search engines. PEAKS inChorus automatically cross-checks results against other protein identification search engines such as Sequest, OMSSA, X!Tandem and Mascot. This approach helps guard against false-positive peptide assignments.
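The cross-checking idea can be illustrated with a minimal sketch. This is not the inChorus implementation, which is not described here; the function name `consensus_peptides`, the input layout and the `min_votes` threshold are assumptions made for illustration, and the engine names and peptides are made-up data.

```python
from collections import Counter

def consensus_peptides(engine_results, min_votes=2):
    """Keep, per spectrum, the peptide reported by at least `min_votes` engines.

    engine_results maps an engine name to {spectrum_id: peptide}.
    (Hypothetical layout, chosen only for this illustration.)
    """
    votes = {}
    for assignments in engine_results.values():
        for spectrum_id, peptide in assignments.items():
            votes.setdefault(spectrum_id, Counter())[peptide] += 1

    consensus = {}
    for spectrum_id, counter in votes.items():
        peptide, count = counter.most_common(1)[0]
        if count >= min_votes:
            consensus[spectrum_id] = peptide
    return consensus

# Illustrative results from three hypothetical engines
results = {
    "engine_a": {"scan1": "PEPTIDEK", "scan2": "SAMPLER"},
    "engine_b": {"scan1": "PEPTIDEK", "scan2": "SIMPLER"},
    "engine_c": {"scan1": "PEPTIDEK"},
}
agreed = consensus_peptides(results)
```

Here `scan1` is kept because all three engines agree, while `scan2` is dropped because no peptide reaches two votes. A real comparison would also have to reconcile the different scoring scales of the engines, which this sketch ignores.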

PEAKS Q is an add-on tool for protein quantification, supporting label-based (ICAT, iTRAQ, SILAC, TMT, 18O, etc.) and label-free techniques.

SPIDER is a sequence-tag-based search tool within PEAKS that handles possible overlaps between de novo sequencing errors and homology mutations. It reconstructs the real peptide sequence automatically by combining the de novo sequence tag with the homolog.[2]

Figure: SPIDER sequence tag homology search tool.

A collection of algorithms used within the PEAKS software has been adapted and configured into a specialized project, PEAKS AB, which has been reported as the first method for automatic monoclonal antibody sequencing.[4]

Notes

  1. Zhang, J.; Xin, L.; Shan, B.; Chen, W.; Xie, M.; Yuen, D.; Zhang, W.; Zhang, Z.; Lajoie, G.; Ma, B. (2011). "PEAKS DB: De Novo Sequencing Assisted Database Search for Sensitive and Accurate Peptide Identification". Molecular & Cellular Proteomics. 11 (4): M111.010587. doi:10.1074/mcp.M111.010587. PMC 3322562. PMID 22186715.
  2. Ma, B.; Johnson (2011). "De Novo Sequencing and Homology Searching". Molecular & Cellular Proteomics. doi:10.1074/mcp.O111.014902.
  3. Han, X.; He, L.; Xin, L.; Shan, B.; Ma, B. (2011). "PEAKS PTM: Mass Spectrometry Based Identification of Peptides with Unspecified Modifications". Journal of Proteome Research. 10 (7): 2930–2936. doi:10.1021/pr200153k. PMID 21609001.
  4. Tran, Ngoc Hieu; Rahman, M. Ziaur; He, Lin; Xin, Lei; Shan, Baozhen; Li, Ming (26 August 2016). "Complete De Novo Assembly of Monoclonal Antibody Sequences". Scientific Reports. 6: 31730. doi:10.1038/srep31730. ISSN 2045-2322. PMC 4999880. PMID 27562653.

Related Research Articles

Proteomics: Large-scale study of proteins

Proteomics is the large-scale study of proteins. Proteins are vital parts of living organisms, with many functions. The proteome is the entire set of proteins produced or modified by an organism or system; it varies with time and with the distinct requirements, or stresses, that a cell or organism undergoes. Proteomics enables the identification of ever-increasing numbers of proteins. It is an interdisciplinary domain that has benefitted greatly from the genetic information of various genome projects, including the Human Genome Project. It covers the exploration of proteomes from the overall level of protein composition, structure, and activity, and is an important component of functional genomics.

Tandem mass spectrometry

Tandem mass spectrometry, also known as MS/MS or MS2, is a technique in instrumental analysis where two or more mass analyzers are coupled together using an additional reaction step to increase their abilities to analyse chemical samples. A common use of tandem MS is the analysis of biomolecules, such as proteins and peptides.

Protein sequencing

Protein sequencing is the practical process of determining the amino acid sequence of all or part of a protein or peptide. This may serve to identify the protein or characterize its post-translational modifications. Typically, partial sequencing of a protein provides sufficient information to identify it with reference to databases of protein sequences derived from the conceptual translation of genes.

Peptide mass fingerprinting

Peptide mass fingerprinting (PMF) is an analytical technique for protein identification in which the unknown protein of interest is first cleaved into smaller peptides, whose absolute masses can be accurately measured with a mass spectrometer such as MALDI-TOF or ESI-TOF. The method was developed in 1993 by several groups independently. The peptide masses are compared to either a database containing known protein sequences or even the genome. This is achieved by using computer programs that translate the known genome of the organism into proteins, then theoretically cut the proteins into peptides, and calculate the absolute masses of the peptides from each protein. They then compare the masses of the peptides of the unknown protein to the theoretical peptide masses of each protein encoded in the genome. The results are statistically analyzed to find the best match.
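The matching procedure described above can be sketched in a few lines. This is a simplified illustration, not any particular search engine's algorithm: it assumes trypsin-like cleavage (after K or R, except before P), no missed cleavages, neutral monoisotopic masses, and a plain count of matching peptides instead of a statistical score. The function names, the two-protein database and the simulated observed masses are all hypothetical.

```python
# Monoisotopic amino acid residue masses in Da
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def count_matches(observed_masses, protein_sequence, tol=0.01):
    """Count observed masses within `tol` Da of any theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein_sequence)]
    return sum(
        any(abs(obs - theo) <= tol for theo in theoretical)
        for obs in observed_masses
    )

def best_match(observed_masses, database, tol=0.01):
    """Return the database entry with the most matching peptide masses."""
    return max(database, key=lambda name: count_matches(observed_masses, database[name], tol))

# Hypothetical two-protein database and simulated measurements
database = {"protA": "AGKPLRSTK", "protB": "GGGKAAAR"}
observed = [peptide_mass("STK") + 0.002, peptide_mass("AGKPLR") - 0.003]
best = best_match(observed, database)
```

In this toy case both simulated masses fall within tolerance of peptides from `protA` and none match `protB`. Real tools replace the simple count with a statistical score, as the paragraph above notes.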

Immunoproteomics

Immunoproteomics is the study of large sets of proteins (proteomics) involved in the immune response.

The Trans-Proteomic Pipeline (TPP) is an open-source data analysis software for proteomics developed at the Institute for Systems Biology (ISB) by the Ruedi Aebersold group under the Seattle Proteome Center. The TPP includes PeptideProphet, ProteinProphet, ASAPRatio, XPRESS and Libra.

A peptide sequence tag is a piece of information about a peptide obtained by tandem mass spectrometry that can be used to identify this peptide in a protein database.

Mascot is a software search engine that uses mass spectrometry data to identify proteins from peptide sequence databases. Mascot is widely used by research facilities around the world. Mascot uses a probabilistic scoring algorithm for protein identification that was adapted from the MOWSE algorithm. Mascot is freely available to use on the website of Matrix Science. A license is required for in-house use where more features can be incorporated.

Protein mass spectrometry

Protein mass spectrometry refers to the application of mass spectrometry to the study of proteins. Mass spectrometry is an important method for the accurate mass determination and characterization of proteins, and a variety of methods and instrumentations have been developed for its many uses. Its applications include the identification of proteins and their post-translational modifications, the elucidation of protein complexes, their subunits and functional interactions, as well as the global measurement of proteins in proteomics. It can also be used to localize proteins to the various organelles, and determine the interactions between different proteins as well as with membrane lipids.

Shotgun proteomics refers to the use of bottom-up proteomics techniques to identify proteins in complex mixtures using high performance liquid chromatography combined with mass spectrometry. The name is derived from shotgun sequencing of DNA, which is itself named after the rapidly expanding, quasi-random firing pattern of a shotgun. In the most common method of shotgun proteomics, the proteins in the mixture are digested and the resulting peptides are separated by liquid chromatography. Tandem mass spectrometry is then used to identify the peptides.

Bottom-up proteomics

Bottom-up proteomics is a common method to identify proteins and characterize their amino acid sequences and post-translational modifications by proteolytic digestion of proteins prior to analysis by mass spectrometry. The major alternative workflow used in proteomics is called top-down proteomics where intact proteins are purified prior to digestion and/or fragmentation either within the mass spectrometer or by 2D electrophoresis. Essentially, bottom-up proteomics is a relatively simple and reliable means of determining the protein make-up of a given sample of cells, tissues, etc.

Quantitative proteomics

Quantitative proteomics is an analytical chemistry technique for determining the amount of proteins in a sample. The methods for protein identification are identical to those used in general proteomics, but include quantification as an additional dimension. Rather than just providing lists of proteins identified in a certain sample, quantitative proteomics yields information about the physiological differences between two biological samples. For example, this approach can be used to compare samples from healthy and diseased patients. Quantitative proteomics is mainly performed by two-dimensional gel electrophoresis (2-DE) or mass spectrometry (MS). However, a recently developed method, quantitative dot blot (QDB) analysis, can measure both the absolute and relative quantity of an individual protein in a sample in a high-throughput format, opening a new direction for proteomic research. In contrast to 2-DE, which requires MS for the downstream protein identification, MS technology can both identify and quantify the changes.

PLEKHA6

Pleckstrin homology domain-containing family A member 6 is a protein that in humans is encoded by the PLEKHA6 gene.

Isobaric labeling

Isobaric labeling is a mass spectrometry strategy used in quantitative proteomics. Peptides or proteins are labeled with various chemical groups that have identical mass (isobaric) but differ in the distribution of heavy isotopes around their structure. These tags, commonly referred to as tandem mass tags, are designed so that the mass tag is cleaved at a specific linker region upon high-energy CID (HCD) during tandem mass spectrometry, yielding reporter ions of different masses. The most common isobaric tags are amine-reactive tags. However, tags that react with cysteine residues and carbonyl groups have also been described. These amine-reactive groups go through N-hydroxysuccinimide (NHS) reactions, which are based around three types of functional groups. Isobaric labeling methods include tandem mass tags (TMT), isobaric tags for relative and absolute quantification (iTRAQ), mass differential tags for absolute and relative quantification, and dimethyl labeling. TMT and iTRAQ are the most common and most developed of these methods. Tandem mass tags have a mass reporter region, a cleavable linker region, a mass normalization region, and a protein reactive group, and all have the same total mass.
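Because each sample contributes its own reporter ion, relative quantification reduces to comparing reporter intensities across channels. The sketch below assumes that reporter intensities have already been extracted per spectrum; the function name `channel_ratios`, the channel labels and the intensity values are illustrative assumptions, and real workflows add corrections (for example for isotopic impurities) that are omitted here.

```python
from collections import defaultdict

def channel_ratios(spectra, reference):
    """Sum reporter-ion intensities per channel, then divide by the reference channel.

    spectra is a list of {channel_label: intensity} dicts, one per MS/MS spectrum.
    (Hypothetical input layout, chosen only for this illustration.)
    """
    totals = defaultdict(float)
    for reporter_intensities in spectra:
        for channel, intensity in reporter_intensities.items():
            totals[channel] += intensity
    if totals[reference] <= 0:
        raise ValueError("reference channel has no signal")
    return {channel: total / totals[reference] for channel, total in totals.items()}

# Made-up reporter intensities for two spectra of the same protein
spectra = [
    {"126": 1000.0, "127": 2100.0, "128": 500.0},
    {"126": 3000.0, "127": 5900.0, "128": 1500.0},
]
ratios = channel_ratios(spectra, reference="126")
```

Summing intensities before dividing lets stronger spectra weigh more than weak ones; averaging per-spectrum ratios instead is an equally plausible design choice.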

Proteogenomics

Proteogenomics is a field of biological research that utilizes a combination of proteomics, genomics, and transcriptomics to aid in the discovery and identification of peptides. Proteogenomics is used to identify new peptides by comparing MS/MS spectra against a protein database that has been derived from genomic and transcriptomic information. Proteogenomics often refers to studies that use proteomic information, often derived from mass spectrometry, to improve gene annotations. Genomics deals with the genetic code of entire organisms, while transcriptomics deals with the study of RNA sequencing and transcripts. Proteomics utilizes tandem mass spectrometry and liquid chromatography to identify and study the functions of proteins. Proteomics is being utilized to discover all the proteins expressed within an organism, known as its proteome. The issue with proteomics is that it relies on the assumption that current gene models are correct and that the correct protein sequences can be found using a reference protein sequence database; however, this is not always the case, as some peptides cannot be located in the database. In addition, novel protein sequences can occur through mutations. These issues can be addressed by using proteomic, genomic, and transcriptomic data together. The utilization of both proteomics and genomics led to proteogenomics, which became its own field in 2004.

In bio-informatics, a peptide-mass fingerprint or peptide-mass map is a mass spectrum of a mixture of peptides that comes from a digested protein being analyzed. The mass spectrum serves as a fingerprint in the sense that it is a pattern that can serve to identify the protein. The method for forming a peptide-mass fingerprint, developed in 1993, consists of isolating a protein, breaking it down into individual peptides, and determining the masses of the peptides through some form of mass spectrometry. Once formed, a peptide-mass fingerprint can be used to search in databases for related protein or even genomic sequences, making it a powerful tool for annotation of protein-coding genes.

Single-cell analysis: Testing biochemical processes and reactions in an individual cell

In the field of cellular biology, single-cell analysis is the study of genomics, transcriptomics, proteomics, metabolomics and cell–cell interactions at the single cell level. Due to the heterogeneity seen in both eukaryotic and prokaryotic cell populations, analyzing a single cell makes it possible to discover mechanisms not seen when studying a bulk population of cells. Technologies such as fluorescence-activated cell sorting (FACS) allow the precise isolation of selected single cells from complex samples, while high-throughput single-cell partitioning technologies enable the simultaneous molecular analysis of hundreds or thousands of single unsorted cells; this is particularly useful for the analysis of transcriptome variation in genotypically identical cells, allowing the definition of otherwise undetectable cell subtypes. The development of new technologies is increasing our ability to analyze the genome and transcriptome of single cells, as well as to quantify their proteome and metabolome. Mass spectrometry techniques have become important analytical tools for proteomic and metabolomic analysis of single cells. Recent advances have enabled the quantification of thousands of proteins across hundreds of single cells, making new types of analysis possible. In situ sequencing and fluorescence in situ hybridization (FISH) do not require that cells be isolated and are increasingly being used for analysis of tissues.

In mass spectrometry, de novo peptide sequencing is the method in which a peptide amino acid sequence is determined from tandem mass spectrometry.
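The core observation behind de novo sequencing is that consecutive fragment ions of the same series differ by the mass of one amino acid residue. The following sketch reads a sequence from an idealized, complete and noise-free b-ion ladder; it is an illustration under those assumptions and not the algorithm used by PEAKS or any other tool. The residue table is a subset chosen for brevity (leucine and isoleucine have the same mass, so only L is listed), and the function names are hypothetical.

```python
# Monoisotopic residue masses in Da (subset, for brevity)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "F": 147.06841,
    "R": 156.10111,
}
PROTON = 1.00728

def residue_for(delta, tol=0.01):
    """Return the residue whose mass is closest to `delta`, or None if outside `tol`."""
    best = min(RESIDUE_MASS, key=lambda aa: abs(RESIDUE_MASS[aa] - delta))
    return best if abs(RESIDUE_MASS[best] - delta) <= tol else None

def read_ladder(ion_masses, tol=0.01):
    """Translate consecutive mass differences in an ion ladder into residues.

    Differences that match no residue in the table are reported as '?'.
    """
    ordered = sorted(ion_masses)
    residues = []
    for low, high in zip(ordered, ordered[1:]):
        residues.append(residue_for(high - low, tol) or "?")
    return "".join(residues)

def b_ion_ladder(peptide):
    """Idealized singly charged b-ion ladder, starting from the bare proton mass."""
    masses, total = [PROTON], PROTON
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        masses.append(total)
    return masses

ladder = b_ion_ladder("GASK")
sequence = read_ladder(ladder)
```

With a complete ladder the original sequence is recovered. Real spectra have missing peaks, noise and overlapping ion series, which is why practical tools score many candidate sequences and report a confidence for each residue.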

Degradomics: Sub-discipline of biology

Degradomics is a sub-discipline of biology encompassing all the genomic and proteomic approaches devoted to the study of proteases, their inhibitors, and their substrates on a system-wide scale. This includes the analysis of the protease and protease-substrate repertoires, also called "protease degradomes". The scope of these degradomes can range from cell, tissue, and organism-wide scales.