Peptide mass fingerprinting

Figure: A typical workflow of a peptide mass fingerprinting experiment.

Peptide mass fingerprinting (PMF), also known as protein fingerprinting, is an analytical technique for protein identification in which the unknown protein of interest is first cleaved into smaller peptides, whose absolute masses can be accurately measured with a mass spectrometer such as MALDI-TOF or ESI-TOF. [1] The method was developed in 1993 by several groups independently. [2] [3] [4] [5] [6] The measured peptide masses are compared either to a database of known protein sequences or to the genome itself. The comparison is performed by computer programs that translate the known genome of the organism into proteins, theoretically cut those proteins into peptides, and calculate the absolute masses of the peptides from each protein. The programs then compare the peptide masses of the unknown protein with the theoretical peptide masses of each protein encoded in the genome, and the results are statistically analyzed to find the best match.

The advantage of this method is that only the masses of the peptides have to be known. A disadvantage is that the protein sequence has to be present in the database being searched. Additionally, most PMF algorithms assume that the peptides come from a single protein. [7] The presence of a mixture can significantly complicate the analysis and potentially compromise the results, so PMF-based protein identification typically requires an isolated protein. Mixtures of more than 2–3 proteins typically require the additional use of MS/MS-based protein identification to achieve sufficient specificity. Typical PMF samples are therefore isolated proteins from two-dimensional gel electrophoresis (2D gels) or isolated SDS-PAGE bands. Additional MS/MS analysis can be either direct, e.g., MALDI-TOF/TOF analysis, or downstream, e.g., nanoLC-ESI-MS/MS analysis of gel spot eluates. [7] [8]

Origins

Peptide mass fingerprinting was developed in response to the long, tedious process of analyzing proteins. Edman degradation, then used for protein analysis, required almost an hour to analyze one amino acid residue. [9] SDS-PAGE was used to separate proteins in very complex mixtures, in combination with electroblotting and staining. [10] Bands were then extracted from the gel and sequenced automatically. A recurring problem was that interfering proteins would co-purify with the protein of interest. The sequences of these interfering proteins were compiled into what came to be known as the Dayhoff database. [11] Ultimately, having the sequences of these known protein contaminants in databases decreased the instrument time and expense involved in protein analysis.

Sample preparation

Protein samples can be derived from SDS-PAGE [7] or reversed-phase HPLC and are then subjected to chemical modification. Disulfide bridges in the proteins are reduced, and cysteine residues are either carbamidomethylated chemically or acrylamidated during gel electrophoresis.

The proteins are then cut into several fragments using proteolytic enzymes such as trypsin, chymotrypsin or Glu-C. A typical sample:protease ratio is 50:1. Proteolysis is typically carried out overnight, and the resulting peptides are extracted with acetonitrile and dried under vacuum. The peptides are then either dissolved in a small amount of distilled water or further concentrated and purified, after which they are ready for mass spectrometric analysis.
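The cleavage step can be mimicked in software. The following is a minimal sketch, not a reference implementation: it assumes the commonly used trypsin rule (cleavage on the C-terminal side of lysine, K, or arginine, R, except when the next residue is proline, P). The function name `tryptic_digest`, the `missed_cleavages` parameter, and the example sequence are hypothetical and introduced here for illustration only.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return the peptides from an idealized tryptic digest of `sequence`.

    Assumed rule: cleave after K or R unless the next residue is P.
    `missed_cleavages` additionally allows up to that many skipped
    cleavage sites per peptide.
    """
    # Step 1: split the sequence at every allowed cleavage site.
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal remainder that does not end in K or R
        fragments.append(sequence[start:])

    # Step 2: join neighbouring fragments to model missed cleavages.
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


if __name__ == "__main__":
    # Made-up sequence; the R-P bond is not cleaved under the assumed rule.
    print(tryptic_digest("AKRPGKMDE"))
    print(tryptic_digest("AKRPGKMDE", missed_cleavages=1))
```

For the made-up sequence `AKRPGKMDE`, the sketch returns `AK`, `RPGK` and `MDE` when no missed cleavages are allowed. Other proteases such as chymotrypsin or Glu-C would need their own cleavage rules.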

Mass spectrometric analysis

The digested protein can be analyzed with different types of mass spectrometers such as ESI-TOF or MALDI-TOF. MALDI-TOF is often the preferred instrument because it allows a high sample throughput, and several proteins can be analyzed in a single experiment if it is complemented by MS/MS analysis. LC/ESI-MS and CE/ESI-MS are also well suited to peptide mass fingerprinting. [12] [13]

A small fraction of the peptide solution (usually 1 microliter or less) is pipetted onto a MALDI target, and a chemical called a matrix is added to the peptide mix. Common matrices are sinapinic acid, alpha-cyano-4-hydroxycinnamic acid, and 2,5-dihydroxybenzoic acid. The matrix molecules are required for the desorption of the peptide molecules. Matrix and peptide molecules co-crystallize on the MALDI target and are then ready to be analyzed. The predominant MALDI-MS sample preparation technique is the dried-droplet technique. [14] The target is inserted into the vacuum chamber of the mass spectrometer, and desorption and ionisation of the polypeptide fragments are initiated by a pulsed laser beam, which transfers large amounts of energy into the matrix molecules. The energy transfer is sufficient to promote the ionisation and transition of matrix molecules and peptides from the solid phase into the gas phase. The ions are accelerated in the electric field of the mass spectrometer and fly towards an ion detector, where their arrival is detected as an electric signal. Their mass-to-charge ratio is proportional to the square of their time of flight (TOF) in the drift tube and can be calculated accordingly.
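The dependence of flight time on mass-to-charge ratio follows from energy conservation. The derivation below is a sketch for an idealized linear TOF analyzer; the symbols U (accelerating voltage), L (length of the field-free drift tube), e (elementary charge), z (charge number), m (ion mass), v (ion velocity) and t (flight time) are introduced here and do not come from the text. It neglects the time spent in the acceleration region and the initial velocity spread of the ions.

```latex
\begin{align}
  z e U &= \tfrac{1}{2}\, m v^{2}
    && \text{(potential energy converted to kinetic energy)} \\
  t &= \frac{L}{v} = L \sqrt{\frac{m}{2 z e U}}
    && \text{(flight time through the drift tube)} \\
  \frac{m}{z} &= \frac{2 e U}{L^{2}}\, t^{2}
    && \text{(mass-to-charge ratio from the measured time)}
\end{align}
```

Under these assumptions the mass-to-charge ratio scales with the square of the flight time, so an ion with four times the m/z of another takes twice as long to arrive. In practice, instruments are usually calibrated with ions of known mass rather than relying on nominal values of U and L.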

Coupling ESI with capillary LC makes it possible to separate the peptides of a protein digest while obtaining their molecular masses at the same time. [15] Capillary electrophoresis coupled with ESI-MS is another option; however, it works best when analyzing small amounts of protein. [13]

Computational analysis

The mass spectrometric analysis produces a list of molecular masses, often called a peak list. The peptide masses are compared to protein databases such as Swiss-Prot, which contain protein sequence information. Software performs in silico digests of the proteins in the database with the same enzyme (e.g. trypsin) that was used in the experimental cleavage reaction. The masses of these theoretical peptides are then calculated and compared to the peak list of measured masses. The results are statistically analyzed, and possible matches are returned in a results table.
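The comparison of a peak list with theoretical peptide masses can be sketched as follows. This is an illustration under stated assumptions rather than the algorithm of any particular search engine: peptides are treated as unmodified (no carbamidomethylation), peaks are assumed to be singly protonated [M+H]+ ions as is typical for MALDI, and candidates are ranked by a raw count of matched peaks instead of a statistical score. The function names, the 50 ppm default tolerance, and the two-protein toy database are hypothetical.

```python
# Monoisotopic masses (Da) of the amino acid residues, rounded to 5 decimals.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # charge carrier for the [M+H]+ ion


def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(peak_list, peptides, tol_ppm=50.0):
    """Count observed peaks lying within tol_ppm of any theoretical peptide."""
    theoretical = [mh_plus(p) for p in peptides]
    matched = 0
    for observed in peak_list:
        if any(abs(observed - t) / t * 1e6 <= tol_ppm for t in theoretical):
            matched += 1
    return matched


def rank_proteins(peak_list, digests, tol_ppm=50.0):
    """Rank database entries by matched peaks (simple count, no statistics)."""
    scores = {name: count_matches(peak_list, peptides, tol_ppm)
              for name, peptides in digests.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical in silico digests of two made-up database entries.
    digests = {"protA": ["AK", "RPGK", "MDE"], "protB": ["GGK", "SSR"]}
    # Simulated peak list: the protA peptides with a small measurement error.
    peaks = [mh_plus(p) * (1 + 10e-6) for p in digests["protA"]]
    for name, score in rank_proteins(peaks, digests):
        print(name, score)
```

Counting matched peaks keeps the sketch short, but it favours large proteins, which produce many peptides and therefore more chance matches. This is one reason why real search tools apply a statistical analysis to the matches.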

References

  1. Clauser KR, Baker P, Burlingame AL (1999). "Role of accurate mass measurement (+/- 10 ppm) in protein identification strategies employing MS or MS/MS and database searching". Anal. Chem. 71 (14): 2871–82. doi:10.1021/ac9810516. PMID 10424174.
  2. Pappin DJ, Hojrup P, Bleasby AJ (1993). "Rapid identification of proteins by peptide-mass fingerprinting". Curr. Biol. 3 (6): 327–32. doi:10.1016/0960-9822(93)90195-T. PMID 15335725. S2CID 40203243.
  3. Henzel WJ, Billeci TM, Stults JT, Wong SC, Grimley C, Watanabe C (1993). "Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases". Proc. Natl. Acad. Sci. U.S.A. 90 (11): 5011–5. Bibcode:1993PNAS...90.5011H. doi:10.1073/pnas.90.11.5011. PMC 46643. PMID 8506346.
  4. Mann M, Højrup P, Roepstorff P (1993). "Use of mass spectrometric molecular weight information to identify proteins in sequence databases". Biological Mass Spectrometry. 22 (6): 338–45. doi:10.1002/bms.1200220605. PMID 8329463.
  5. James P, Quadroni M, Carafoli E, Gonnet G (1993). "Protein identification by mass profile fingerprinting". Biochem. Biophys. Res. Commun. 195 (1): 58–64. doi:10.1006/bbrc.1993.2009. PMID 8363627.
  6. Yates JR, Speicher S, Griffin PR, Hunkapiller T (1993). "Peptide mass maps: a highly informative approach to protein identification". Anal. Biochem. 214 (2): 397–408. doi:10.1006/abio.1993.1514. PMID 8109726.
  7. Shevchenko A, Jensen ON, Podtelejnikov AV, Sagliocco F, Wilm M, Vorm O, Mortensen P, Shevchenko A, Boucherie H, Mann M (1996). "Linking genome and proteome by mass spectrometry: large-scale identification of yeast proteins from two dimensional gels". Proc. Natl. Acad. Sci. U.S.A. 93 (25): 14440–5. Bibcode:1996PNAS...9314440S. doi:10.1073/pnas.93.25.14440. PMC 26151. PMID 8962070.
  8. Wang W, Sun J, Nimtz M, Deckwer WD, Zeng AP (2003). "Protein identification from two-dimensional gel electrophoresis analysis of Klebsiella pneumoniae by combined use of mass spectrometry data and raw genome sequences". Proteome Science. 1 (1): 6. doi:10.1186/1477-5956-1-6. PMC 317362. PMID 14653859.
  9. Henzel WJ, Watanabe C, Stults JT (2003). "Protein identification: The origins of peptide mass fingerprinting". Journal of the American Society for Mass Spectrometry. 14 (9): 931–942. doi:10.1016/S1044-0305(03)00214-9. ISSN 1044-0305. PMID 12954162.
  10. Matsudaira P (1987). "Sequence from picomole quantities of proteins electroblotted onto polyvinylidene difluoride membranes". The Journal of Biological Chemistry. 262 (21): 10035–10038. doi:10.1016/S0021-9258(18)61070-1. ISSN 0021-9258. PMID 3611052.
  11. Orcutt BC, George DG, Dayhoff MO (1983). "Protein and Nucleic Acid Sequence Database Systems". Annual Review of Biophysics and Bioengineering. 12 (1): 419–441. doi:10.1146/annurev.bb.12.060183.002223. PMID 6347043.
  12. Moore RE, Licklider L, Schumann D, Lee TD (1998). "A microscale electrospray interface incorporating a monolithic, poly(styrene-divinylbenzene) support for on-line liquid chromatography/tandem mass spectrometry analysis of peptides and proteins". Analytical Chemistry. 70 (23): 4879–4884. doi:10.1021/ac980723p. ISSN 0003-2700. PMID 9852776.
  13. Whitmore CD, Gennaro LA (2012). "Capillary electrophoresis-mass spectrometry methods for tryptic peptide mapping of therapeutic antibodies". Electrophoresis. 33 (11): 1550–1556. doi:10.1002/elps.201200066. ISSN 1522-2683. PMID 22736356. S2CID 28717319.
  14. Thiede B (2005). "Peptide mass fingerprinting". Methods. 35 (3): 237–247. doi:10.1016/j.ymeth.2004.08.015. PMID 15722220.
  15. Dass C (2007). Fundamentals of Contemporary Mass Spectrometry. Wiley. doi:10.1002/0470118490. ISBN 9780470118498.