In bioinformatics, a peptide-mass fingerprint or peptide-mass map is a mass spectrum of a mixture of peptides that comes from a digested protein being analyzed. The mass spectrum serves as a fingerprint in the sense that it is a pattern that can be used to identify the protein. [1] The method for forming a peptide-mass fingerprint, developed in 1993, consists of isolating a protein, breaking it down into individual peptides, and determining the masses of the peptides through some form of mass spectrometry. [2] Once formed, a peptide-mass fingerprint can be used to search databases for related protein or even genomic sequences, making it a powerful tool for annotation of protein-coding genes. [3]
One major advantage of mass fingerprinting is that it is significantly faster to carry out than peptide sequencing, yet the results are equally useful. [4] Disadvantages include the need for a single protein for analysis and the requirement that the protein sequence be present, at least with significant homology, in a database. Because the masses of individual peptides are measured in forming a fingerprint, mixtures of different proteins can yield unreliable results. Therefore, sample preparation is an important step in the process. Even when reliable results are obtained, there must be a matching peptide sequence in the database being searched for the results to be useful. [5]
Before analysis by mass spectrometry, a protein must be accurately isolated and digested. If the protein is not isolated, the results will represent a mixture of two or more proteins and will therefore be unreliable for protein identification. Because of this sensitivity, sample preparation is likely the most important step in forming a peptide-mass fingerprint.
Isolation of a specific protein is most often done through a form of gel electrophoresis, in which proteins are separated by size and can subsequently be extracted for further preparation. Proteins can also be isolated by liquid chromatography, which in its size-exclusion form likewise separates them by size. [6]
Once an individual protein is isolated, it needs to be digested and fractionated for further analysis by a spectrometer. This is done by the addition of proteolytic enzymes such as trypsin and chymotrypsin. [7]
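The digestion step can also be simulated in software, which is how theoretical peptides are produced for comparison with a measured fingerprint. The following is a minimal sketch, assuming the commonly used simplified rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name `tryptic_digest`, the `missed_cleavages` parameter, and the example sequence are illustrative assumptions, not part of any particular tool.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from a simplified in-silico trypsin digest.

    Assumed rule: cleave C-terminal to K or R, except when followed by P.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing piece that does not end in K or R.
        fragments.append(sequence[start:])

    # Join neighbouring fragments to model incomplete digestion.
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


# Illustrative sequence only; any one-letter amino acid string works.
example = "MKWVTFISLLLLFSSAYSRGVFRRPDTHK"
print(tryptic_digest(example))
```

Allowing one or two missed cleavages is a common choice in practice because real digests are rarely complete.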
Another method commonly used that combines both the isolation and digestion steps is SDS-PAGE, a form of electrophoresis that separates and fractionates proteins simultaneously.
The digested protein can be analyzed with different types of mass spectrometers such as ESI-TOF or MALDI-TOF. MALDI-TOF is often the preferred instrument because it allows a high sample throughput and several proteins can be analyzed in a single experiment, if complemented by MS/MS analysis. [8]
In matrix-assisted laser desorption ionization (MALDI), a fragmented peptide sample is loaded onto a matrix and ionized through the use of a high-energy laser. The resulting ions are then separated by mass-to-charge ratio based on their time of flight (TOF) through the spectrometer. They can then be further fragmented and re-analyzed by tandem mass spectrometry, often with a quadrupole ion trap, [9] although tandem time of flight is also possible. [10]
The output of a mass spectrometer comes in the form of a peak list, which gives the masses and relative abundances of the peptide fragments present in the sample. To read such a spectrum directly, all possible major fragmentations of a protein would need to be considered, and the masses of those fragments would then be correlated with the positions of the peaks. While the peak list can be analyzed to some degree on its own, in forming a peptide-mass fingerprint it is run through a database search to find homologous peptide sequences.
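To relate a candidate peptide to a peak position, its theoretical mass can be computed from its sequence. The sketch below sums standard monoisotopic residue masses, adds one water for the free termini, and converts the result to the m/z expected for a protonated ion. It assumes unmodified peptides; the function names `peptide_mass` and `mz` are illustrative.

```python
# Monoisotopic residue masses in daltons (standard values, unmodified residues).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # mass of the charge-carrying proton


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence, charge=1):
    """Expected m/z of the peptide carrying `charge` protons."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


print(round(mz("GVFR"), 4))
```

MALDI spectra are dominated by singly charged ions, so `charge=1` is a reasonable default for that case; electrospray data would typically need higher charge states as well.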
The peak list obtained through spectrometric means is used as the query in a database search using the software MASCOT. [11] The MASCOT software uses an algorithm that looks for significant peptide sequence homology to present the most statistically likely protein in the sample, based on the results.
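The idea behind such a search can be illustrated with a much simpler scheme than the one MASCOT uses. The sketch below is not MASCOT's probabilistic scoring: it only counts how many observed peaks fall within a parts-per-million tolerance of each candidate protein's theoretical peptide masses and ranks the candidates by that count. The database entries, peak values, and the 50 ppm tolerance are made-up toy values for illustration.

```python
def count_matches(observed, theoretical, tol_ppm=50.0):
    """Count observed peaks lying within tol_ppm of any theoretical mass."""
    hits = 0
    for obs in observed:
        if any(abs(obs - theo) / theo * 1e6 <= tol_ppm for theo in theoretical):
            hits += 1
    return hits


def rank_candidates(observed, database, tol_ppm=50.0):
    """Rank database proteins by number of matched peaks, best first."""
    scores = {
        name: count_matches(observed, masses, tol_ppm)
        for name, masses in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical theoretical peptide masses for two hypothetical proteins.
toy_database = {
    "protein_A": [842.51, 1045.56, 1479.80, 2211.10],
    "protein_B": [927.49, 1045.56, 1639.94],
}
# Hypothetical observed peak list.
observed_peaks = [842.52, 1479.79, 2211.12, 1300.00]

print(rank_candidates(observed_peaks, toy_database))
```

A raw match count favors large proteins, which produce more peptides and therefore more chance matches. Probabilistic scoring schemes exist largely to correct for effects of this kind.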
In performing the search, a database must be chosen. Such databases include, among others, Swiss-Prot, often used when researching well-characterized organisms like humans, mice, and yeasts; and NCBInr, for more general, robust searches.
The use of a peptide-mass fingerprint is fairly widespread in proteomic research. Some specific examples of how it has been used in the field are as follows:
The authors of this study sought to determine which yeasts were metabolically active at lower temperatures and could therefore be used for colder industrial processes. They grew various yeasts on medium at different temperatures, then determined enzyme activity by separating proteins on a gel and fingerprinting the individual bands. Through database search they found the enzyme of interest and discovered two individual yeasts that had higher activity at lower temperatures. [12]
The authors of this study sought to determine the effect of the drug risperidone on metabolism in schizophrenia patients. After discovering that risperidone did have negative metabolic side effects, they tested membrane proteins for glucose and lipid transport in control and experimental groups by MALDI-TOF and fingerprinting. Results showed altered fingerprints and therefore altered levels of folding in the proteins. They therefore concluded that risperidone negatively affects glucose and lipid transport proteins in the cell membranes of patients. [13]
Mass spectrometry (MS) is an analytical technique that is used to measure the mass-to-charge ratio of ions. The results are presented as a mass spectrum, a plot of intensity as a function of the mass-to-charge ratio. Mass spectrometry is used in many different fields and is applied to pure samples as well as complex mixtures.
Tandem mass spectrometry, also known as MS/MS or MS2, is a technique in instrumental analysis where two or more mass analyzers are coupled together using an additional reaction step to increase their abilities to analyse chemical samples. A common use of tandem MS is the analysis of biomolecules, such as proteins and peptides.
Protein sequencing is the practical process of determining the amino acid sequence of all or part of a protein or peptide. This may serve to identify the protein or characterize its post-translational modifications. Typically, partial sequencing of a protein provides sufficient information to identify it with reference to databases of protein sequences derived from the conceptual translation of genes.
Peptide mass fingerprinting (PMF) is an analytical technique for protein identification in which the unknown protein of interest is first cleaved into smaller peptides, whose absolute masses can be accurately measured with a mass spectrometer such as MALDI-TOF or ESI-TOF. The method was developed in 1993 by several groups independently. The peptide masses are compared to either a database containing known protein sequences or even the genome. This is achieved by using computer programs that translate the known genome of the organism into proteins, then theoretically cut the proteins into peptides, and calculate the absolute masses of the peptides from each protein. They then compare the masses of the peptides of the unknown protein to the theoretical peptide masses of each protein encoded in the genome. The results are statistically analyzed to find the best match.
In mass spectrometry, matrix-assisted laser desorption/ionization (MALDI) is an ionization technique that uses a laser energy-absorbing matrix to create ions from large molecules with minimal fragmentation. It has been applied to the analysis of biomolecules and various organic molecules, which tend to be fragile and fragment when ionized by more conventional ionization methods. It is similar in character to electrospray ionization (ESI) in that both techniques are relatively soft ways of obtaining ions of large molecules in the gas phase, though MALDI typically produces far fewer multi-charged ions.
Immunoproteomics is the study of large sets of proteins (proteomics) involved in the immune response.
A peptide sequence tag is a piece of information about a peptide obtained by tandem mass spectrometry that can be used to identify this peptide in a protein database.
Surface-enhanced laser desorption/ionization (SELDI) is a soft ionization method in mass spectrometry (MS) used for the analysis of protein mixtures. It is a variation of matrix-assisted laser desorption/ionization (MALDI). In MALDI, the sample is mixed with a matrix material and applied to a metal plate before irradiation by a laser, whereas in SELDI, proteins of interest in a sample become bound to a surface before MS analysis. The sample surface is a key component in the purification, desorption, and ionization of the sample. SELDI is typically used with time-of-flight (TOF) mass spectrometers to detect proteins in tissue samples, blood, urine, or other clinical samples. However, SELDI technology can potentially be used in any application by simply modifying the sample surface.
Mascot is a software search engine that uses mass spectrometry data to identify proteins from peptide sequence databases. Mascot is widely used by research facilities around the world. Mascot uses a probabilistic scoring algorithm for protein identification that was adapted from the MOWSE algorithm. Mascot is freely available to use on the website of Matrix Science. A license is required for in-house use where more features can be incorporated.
Electron-transfer dissociation (ETD) is a method of fragmenting multiply-charged gaseous macromolecules in a mass spectrometer between the stages of tandem mass spectrometry (MS/MS). Similar to electron-capture dissociation, ETD induces fragmentation of large, multiply-charged cations by transferring electrons to them. ETD is used extensively with polymers and biological molecules such as proteins and peptides for sequence analysis. Transferring an electron causes peptide backbone cleavage into c- and z-ions while leaving labile post translational modifications (PTM) intact. The technique only works well for higher charge state peptide or polymer ions (z>2). However, relative to collision-induced dissociation (CID), ETD is advantageous for the fragmentation of longer peptides or even entire proteins. This makes the technique important for top-down proteomics. The method was developed by Hunt and coworkers at the University of Virginia.
MALDI mass spectrometry imaging (MALDI-MSI) is the use of matrix-assisted laser desorption ionization as a mass spectrometry imaging technique in which the sample, often a thin tissue section, is moved in two dimensions while the mass spectrum is recorded. Advantages such as measuring the distribution of a large number of analytes at one time without destroying the sample make it a useful method in tissue-based study.
Protein mass spectrometry refers to the application of mass spectrometry to the study of proteins. Mass spectrometry is an important method for the accurate mass determination and characterization of proteins, and a variety of methods and instrumentations have been developed for its many uses. Its applications include the identification of proteins and their post-translational modifications, the elucidation of protein complexes, their subunits and functional interactions, as well as the global measurement of proteins in proteomics. It can also be used to localize proteins to the various organelles, and determine the interactions between different proteins as well as with membrane lipids.
Time-of-flight mass spectrometry (TOFMS) is a method of mass spectrometry in which an ion's mass-to-charge ratio is determined by a time of flight measurement. Ions are accelerated by an electric field of known strength. This acceleration results in an ion having the same kinetic energy as any other ion that has the same charge. The velocity of the ion depends on the mass-to-charge ratio. The time that it subsequently takes for the ion to reach a detector at a known distance is measured. This time will depend on the velocity of the ion, and therefore is a measure of its mass-to-charge ratio. From this ratio and known experimental parameters, one can identify the ion.
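The relationship described above follows from equating the energy gained in the accelerating field, qU, with the kinetic energy, (1/2)mv², which gives a flight time t = d·sqrt(m / (2qU)) over a drift length d. The sketch below evaluates this idealized formula. It assumes a simple linear instrument and ignores initial energy spread, extraction delay, and reflectron effects; the 20 kV voltage and 1 m drift length are illustrative values, not properties of any specific instrument.

```python
import math

DALTON = 1.66053906660e-27           # kilograms per dalton
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs


def flight_time(mass_da, charge, accel_voltage, drift_length):
    """Idealized time of flight in seconds for an ion in a linear TOF analyzer.

    mass_da: ion mass in daltons; charge: number of elementary charges;
    accel_voltage: accelerating potential in volts; drift_length: metres.
    """
    mass_kg = mass_da * DALTON
    q = charge * ELEMENTARY_CHARGE
    # From q*U = (1/2)*m*v**2, solve for the velocity after acceleration.
    velocity = math.sqrt(2.0 * q * accel_voltage / mass_kg)
    return drift_length / velocity


# Assumed example conditions: 20 kV acceleration, 1 m field-free drift.
for mass in (1000.0, 4000.0):
    t = flight_time(mass, charge=1, accel_voltage=20000.0, drift_length=1.0)
    print(f"{mass:.0f} Da: {t * 1e6:.2f} microseconds")
```

Because the time scales with the square root of m/z, an ion with four times the mass at the same charge takes twice as long to reach the detector.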
Shotgun proteomics refers to the use of bottom-up proteomics techniques to identify proteins in complex mixtures using high-performance liquid chromatography combined with mass spectrometry. The name is derived from shotgun sequencing of DNA, which is itself named after the rapidly expanding, quasi-random firing pattern of a shotgun. In the most common method of shotgun proteomics, the proteins in the mixture are digested and the resulting peptides are separated by liquid chromatography. Tandem mass spectrometry is then used to identify the peptides.
Top-down proteomics is a method of protein identification that either uses an ion trapping mass spectrometer to store an isolated protein ion for mass measurement and tandem mass spectrometry (MS/MS) analysis or other protein purification methods such as two-dimensional gel electrophoresis in conjunction with MS/MS. Top-down proteomics is capable of identifying and quantitating unique proteoforms through the analysis of intact proteins. The name is derived from the similar approach to DNA sequencing. During mass spectrometry intact proteins are typically ionized by electrospray ionization and trapped in a Fourier transform ion cyclotron resonance, quadrupole ion trap or Orbitrap mass spectrometer. Fragmentation for tandem mass spectrometry is accomplished by electron-capture dissociation or electron-transfer dissociation. Effective fractionation is critical for sample handling before mass-spectrometry-based proteomics. Proteome analysis routinely involves digesting intact proteins followed by inferred protein identification using mass spectrometry (MS). Top-down MS (non-gel) proteomics interrogates protein structure through measurement of an intact mass followed by direct ion dissociation in the gas phase.
Bottom-up proteomics is a common method to identify proteins and characterize their amino acid sequences and post-translational modifications by proteolytic digestion of proteins prior to analysis by mass spectrometry. The major alternative workflow used in proteomics is called top-down proteomics where intact proteins are purified prior to digestion and/or fragmentation either within the mass spectrometer or by 2D electrophoresis. Essentially, bottom-up proteomics is a relatively simple and reliable means of determining the protein make-up of a given sample of cells, tissues, etc.
Quantitative proteomics is an analytical chemistry technique for determining the amount of proteins in a sample. The methods for protein identification are identical to those used in general proteomics, but include quantification as an additional dimension. Rather than just providing lists of proteins identified in a certain sample, quantitative proteomics yields information about the physiological differences between two biological samples. For example, this approach can be used to compare samples from healthy and diseased patients. Quantitative proteomics is mainly performed by two-dimensional gel electrophoresis (2-DE), preparative one-dimensional gel electrophoresis, or mass spectrometry (MS). However, a recently developed method, quantitative dot blot (QDB) analysis, is able to measure both the absolute and relative quantity of an individual protein in a sample in a high-throughput format, thus opening a new direction for proteomic research. In contrast to 2-DE, which requires MS for the downstream protein identification, MS technology can both identify and quantify the changes.
Stable isotope standards and capture by anti-peptide antibodies (SISCAPA) is a mass spectrometry method for measuring the amount of a protein in a biological sample.
Zooarchaeology by mass spectrometry, commonly referred to by the abbreviation ZooMS, is a scientific method that identifies animal species by means of characteristic peptide sequences in the protein collagen. ZooMS is the most common archaeological application of peptide mass fingerprinting (PMF) and can be used for species identification of bones, teeth, skin and antler. It is commonly used to identify objects that cannot be identified morphologically. In an archaeological context this usually means that the object is too fragmented or that it has been shaped into an artefact. Archaeologists use these species identifications to study, among other things, past environments, diet, and raw material selection for the production of tools.