Protein mass spectrometry refers to the application of mass spectrometry to the study of proteins. Mass spectrometry is an important method for the accurate mass determination and characterization of proteins, and a variety of methods and instrumentations have been developed for its many uses. Its applications include the identification of proteins and their post-translational modifications, the elucidation of protein complexes, their subunits and functional interactions, as well as the global measurement of proteins in proteomics. It can also be used to localize proteins to the various organelles, and determine the interactions between different proteins as well as with membrane lipids.
The two primary methods used for the ionization of proteins in mass spectrometry are electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). These ionization techniques are used in conjunction with mass analyzers, frequently arranged for tandem mass spectrometry. In general, proteins are analyzed either in a "top-down" approach, in which proteins are analyzed intact, or in a "bottom-up" approach, in which proteins are first digested into fragments. An intermediate "middle-down" approach, in which larger peptide fragments are analyzed, may also sometimes be used.
The application of mass spectrometry to the study of proteins became popular in the 1980s after the development of MALDI and ESI. These ionization techniques have played a significant role in the characterization of proteins. The term matrix-assisted laser desorption/ionization (MALDI) was coined in the late 1980s by Franz Hillenkamp and Michael Karas. Hillenkamp, Karas and their fellow researchers were able to ionize the amino acid alanine by mixing it with the amino acid tryptophan and irradiating the mixture with a pulsed 266 nm laser. Though important, the breakthrough did not come until 1987, when Koichi Tanaka used the "ultra fine metal plus liquid matrix method" to ionize biomolecules as large as the 34,472 Da protein carboxypeptidase-A.
In 1968, Malcolm Dole reported the first use of electrospray ionization with mass spectrometry. Around the same time that MALDI became popular, John Bennett Fenn was cited for the development of electrospray ionization. Koichi Tanaka received the 2002 Nobel Prize in Chemistry alongside John Fenn and Kurt Wüthrich "for the development of methods for identification and structure analyses of biological macromolecules." These ionization methods have greatly facilitated the study of proteins by mass spectrometry. Consequently, protein mass spectrometry now plays a leading role in protein characterization.
Mass spectrometry of proteins requires that the proteins in solution or solid state be turned into an ionized form in the gas phase before they are injected and accelerated in an electric or magnetic field for analysis. The two primary methods for ionization of proteins are electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). In electrospray, the ions are created from proteins in solution, which allows fragile molecules to be ionized intact, sometimes preserving non-covalent interactions. In MALDI, the proteins are embedded within a matrix, normally in a solid form, and ions are created by pulses of laser light. Electrospray produces more multiply charged ions than MALDI, allowing for measurement of high-mass proteins and better fragmentation for identification, while MALDI is fast and less likely to be affected by contaminants, buffers and additives.
Whole-protein mass analysis is primarily conducted using either time-of-flight (TOF) MS or Fourier transform ion cyclotron resonance (FT-ICR). These two types of instrument are preferable here because of their wide mass range and, in the case of FT-ICR, its high mass accuracy. Electrospray ionization of a protein often results in the generation of multiply charged species in the range 800 < m/z < 2000, and the resultant spectrum can be deconvoluted to determine the protein's average mass to within 50 ppm or better using TOF or ion-trap instruments.
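The deconvolution step mentioned above can be illustrated with a minimal Python sketch. It assumes positive-mode electrospray in which each charge comes from an added proton, and it uses only two adjacent peaks of the same protein; real deconvolution software fits the whole charge-state envelope. The function names and the 17,000 Da example mass are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz_for_charge(neutral_mass, z):
    """m/z of a protein of the given neutral mass carrying z protons."""
    return (neutral_mass + z * PROTON_MASS) / z


def charge_from_adjacent_peaks(mz_low, mz_high):
    """Charge of the peak at mz_high, given the neighbouring peak at mz_low.

    Assumes the two peaks belong to the same protein and differ by exactly
    one charge (mz_low carries z + 1 protons, mz_high carries z).
    """
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


def neutral_mass_from_peak(mz, z):
    """Neutral mass recovered from one peak once its charge is known."""
    return z * (mz - PROTON_MASS)


if __name__ == "__main__":
    true_mass = 17000.0  # hypothetical protein mass in Da
    peak_z10 = mz_for_charge(true_mass, 10)
    peak_z11 = mz_for_charge(true_mass, 11)
    z = charge_from_adjacent_peaks(peak_z11, peak_z10)
    print(z, neutral_mass_from_peak(peak_z10, z))
```

Averaging the masses recovered from several charge states would reduce the effect of measurement error on any single peak.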
Mass analysis of proteolytic peptides is a popular method of protein characterization, as cheaper instrument designs can be used. Additionally, sample preparation is easier once whole proteins have been digested into smaller peptide fragments. The most widely used instruments for peptide mass analysis are MALDI-TOF instruments, as they permit the acquisition of peptide mass fingerprints (PMFs) at a high pace (one PMF can be analyzed in approximately 10 s). Multiple-stage quadrupole time-of-flight and quadrupole ion trap instruments also find use in this application.
Tandem mass spectrometry (MS/MS) is used to measure fragmentation spectra and identify proteins at high speed and accuracy. Collision-induced dissociation is used in mainstream applications to generate a set of fragments from a specific peptide ion. The fragmentation process primarily gives rise to cleavage products that break along peptide bonds. Because of this simplicity in fragmentation, it is possible to use the observed fragment masses to match with a database of predicted masses for one of many given peptide sequences. Tandem MS of whole protein ions has been investigated recently using electron capture dissociation and has demonstrated extensive sequence information in principle but is not in common practice.
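Because cleavage occurs mainly along peptide bonds, the expected fragment masses of a candidate sequence are simple to predict. The following Python sketch computes singly protonated b- and y-ion m/z values from standard monoisotopic residue masses. It assumes unmodified residues and ignores neutral losses and higher charge states; the function names are illustrative only.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, Da
PROTON = 1.007276   # Da


def b_ions(peptide):
    """Singly charged b-ion m/z values b1..b(n-1) (N-terminal fragments)."""
    masses, running = [], 0.0
    for aa in peptide[:-1]:
        running += RESIDUE_MASS[aa]
        masses.append(running + PROTON)
    return masses


def y_ions(peptide):
    """Singly charged y-ion m/z values y1..y(n-1) (C-terminal fragments)."""
    masses, running = [], 0.0
    for aa in reversed(peptide[1:]):
        running += RESIDUE_MASS[aa]
        masses.append(running + WATER + PROTON)
    return masses


if __name__ == "__main__":
    for name, series in (("b", b_ions("PEPTIDE")), ("y", y_ions("PEPTIDE"))):
        print(name, [round(m, 4) for m in series])
```

A search engine would compare such predicted lists against the observed fragment spectrum for every candidate peptide whose precursor mass matches.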
In keeping with the performance and mass range of available mass spectrometers, two approaches are used for characterizing proteins. In the first, intact proteins are ionized by either of the two techniques described above and then introduced into a mass analyzer. This approach is referred to as the "top-down" strategy of protein analysis, as it involves starting with the whole mass and then pulling it apart. The top-down approach, however, is mostly limited to low-throughput single-protein studies because of issues involved in handling whole proteins, their heterogeneity and the complexity of their analyses.
In the second approach, referred to as "bottom-up" MS, proteins are enzymatically digested into smaller peptides using a protease such as trypsin. Subsequently, these peptides are introduced into the mass spectrometer and identified by peptide mass fingerprinting or tandem mass spectrometry. Hence, this approach uses identification at the peptide level to infer the existence of proteins pieced back together with de novo repeat detection. The smaller and more uniform fragments are easier to analyze than intact proteins and can also be determined with high accuracy; this "bottom-up" approach is therefore the preferred method in proteomics studies. A further approach that is beginning to be useful is the intermediate "middle-down" approach, in which proteolytic peptides larger than typical tryptic peptides are analyzed.
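The digestion step can be simulated in silico. The sketch below assumes the commonly used simplified trypsin rule (cleavage on the C-terminal side of lysine or arginine, except when the next residue is proline); the function name, the `missed_cleavages` parameter and the example sequence are hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return the peptides produced by an idealized trypsin digestion.

    Simplified rule: cleave after K or R unless the following residue is P.
    Peptides containing up to `missed_cleavages` uncleaved sites are included.
    """
    if not sequence:
        return []
    # Positions where the chain is cut, plus both ends of the sequence.
    sites = [0]
    for i, aa in enumerate(sequence[:-1]):
        if aa in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end < len(sites):
                peptides.append(sequence[sites[start]:sites[end]])
    return peptides


if __name__ == "__main__":
    print(tryptic_digest("AAKRPGGRCC"))
    print(tryptic_digest("AAKRPGGRCC", missed_cleavages=1))
```

Allowing one or two missed cleavages is a common setting because real digestions are rarely complete.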
Proteins of interest are usually part of a complex mixture of multiple proteins and molecules, which co-exist in the biological medium. This presents two significant problems. First, the two ionization techniques used for large molecules only work well when the mixture contains roughly equal amounts of material, while in biological samples, different proteins tend to be present in widely differing amounts. If such a mixture is ionized using electrospray or MALDI, the more abundant species have a tendency to "drown" or suppress signals from less abundant ones. Second, the mass spectrum of a complex mixture is very difficult to interpret because of the overwhelming number of mixture components. This is exacerbated by the fact that enzymatic digestion of a protein gives rise to a large number of peptide products.
In light of these problems, the methods of one- and two-dimensional gel electrophoresis and high performance liquid chromatography are widely used for separation of proteins. The first method fractionates whole proteins via two-dimensional gel electrophoresis. The first dimension of a 2D gel is isoelectric focusing (IEF), in which proteins are separated by isoelectric point (pI). The second dimension is SDS-polyacrylamide gel electrophoresis (SDS-PAGE), which separates proteins according to molecular weight. Once this step is completed, in-gel digestion occurs. In some situations, it may be necessary to combine both of these techniques. Gel spots identified on a 2D gel are usually attributable to one protein. If the identity of the protein is desired, the method of in-gel digestion is usually applied, in which the protein spot of interest is excised and digested proteolytically. The peptide masses resulting from the digestion can be determined by mass spectrometry using peptide mass fingerprinting. If this information does not allow unequivocal identification of the protein, its peptides can be subjected to tandem mass spectrometry for de novo sequencing. Small changes in mass and charge can be detected with 2D-PAGE. The disadvantages of this technique are its small dynamic range compared to other methods and the fact that some proteins remain difficult to separate because of their acidity, basicity, hydrophobicity, or size (too large or too small).
The second method, high performance liquid chromatography, is used to fractionate peptides after enzymatic digestion. Characterization of protein mixtures using HPLC/MS is also called shotgun proteomics and MuDPIT (Multi-Dimensional Protein Identification Technology). A peptide mixture that results from digestion of a protein mixture is fractionated by one or two steps of liquid chromatography. The eluent from the chromatography stage can be either introduced directly to the mass spectrometer through electrospray ionization, or laid down on a series of small spots for later mass analysis using MALDI.
There are two main ways MS is used to identify proteins. Peptide mass fingerprinting uses the masses of proteolytic peptides as input to a search of a database of predicted masses that would arise from digestion of a list of known proteins. If a protein sequence in the reference list gives rise to a significant number of predicted masses that match the experimental values, there is some evidence that this protein was present in the original sample. Because the match assumes that the measured peptides come from a single protein, the protein must first be isolated; purification steps therefore limit the throughput of the peptide mass fingerprinting approach. Peptide mass fingerprinting can be achieved with MS/MS.
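The matching idea behind peptide mass fingerprinting can be sketched in Python as follows. This is a deliberately naive illustration: it ranks candidates by a raw count of observed masses that fall within a ppm tolerance of any predicted mass, whereas real search tools use statistical scoring. The protein names and all masses below are made-up example values.

```python
def count_matches(observed, predicted, tol_ppm=50.0):
    """Number of observed masses within tol_ppm of any predicted mass."""
    matched = 0
    for obs in observed:
        if any(abs(obs - pred) / pred * 1e6 <= tol_ppm for pred in predicted):
            matched += 1
    return matched


def rank_candidates(observed, database, tol_ppm=50.0):
    """Rank database proteins by how many observed peptide masses they explain."""
    scores = {
        name: count_matches(observed, masses, tol_ppm)
        for name, masses in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical predicted tryptic peptide masses (Da) for two proteins.
    database = {
        "protein_A": [1000.50, 1500.75, 2000.10],
        "protein_B": [900.40, 1500.75, 2500.30],
    }
    observed = [1000.52, 1500.76, 2000.11]
    print(rank_candidates(observed, database))
```

In practice the predicted masses would come from an in silico digest of every sequence in the reference database.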
MS is also the preferred method for the identification of post-translational modifications in proteins as it is more advantageous than other approaches such as the antibody-based methods.
De novo peptide sequencing for mass spectrometry is typically performed without prior knowledge of the amino acid sequence. It is the process of assigning amino acids from peptide fragment masses of a protein. De novo sequencing has proven successful for confirming and expanding upon results from database searches.
As de novo sequencing is based on mass and some amino acids have identical masses (e.g. leucine and isoleucine), accurate manual sequencing can be difficult. Therefore, it may be necessary to utilize a sequence homology search application to work in tandem between a database search and de novo sequencing to address this inherent limitation.
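The mass-based ambiguity described above can be made concrete with a small Python sketch. It reads a ladder of consecutive singly charged fragment ions of one series and assigns a residue to each mass difference. It assumes a complete, noise-free ladder and unmodified residues, and the 0.01 Da tolerance is an arbitrary illustrative choice; real de novo algorithms must cope with missing and spurious peaks.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_for_delta(delta, tol=0.01):
    """All residues whose mass matches the given mass difference."""
    return sorted(aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol)


def read_ladder(ion_masses, tol=0.01):
    """Assign residues to the gaps between consecutive fragment ions.

    Ambiguous gaps are reported as e.g. "I/L"; unexplained gaps as "?".
    """
    ordered = sorted(ion_masses)
    calls = []
    for low, high in zip(ordered, ordered[1:]):
        hits = residues_for_delta(high - low, tol)
        calls.append("/".join(hits) if hits else "?")
    return calls


if __name__ == "__main__":
    # b1, b2, b3 of a peptide beginning G-A-L (or G-A-I).
    print(read_ladder([58.028736, 129.065846, 242.149906]))
```

With a looser tolerance, near-isobaric pairs such as glutamine and lysine would also become indistinguishable, which is one reason high mass accuracy matters for de novo work.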
Database searching has the advantage of quickly identifying sequences, provided they have already been documented in a database. Its inherent limitations include sequence modifications and mutations (some database searches do not adequately account for alterations to the documented sequence and can therefore miss valuable information), the unknown (if a sequence is not documented, it will not be found), false positives, and incomplete or corrupted data.
An annotated peptide spectral library can also be used as a reference for protein/peptide identification. It offers the unique strength of reduced search space and increased specificity. Its limitations are that spectra not included in the library will not be identified, that spectra collected on different types of mass spectrometers can have quite distinct features, and that reference spectra in the library may contain noise peaks, which may lead to false positive identifications. A number of different algorithmic approaches have been described to identify peptides and proteins from tandem mass spectrometry (MS/MS), peptide de novo sequencing and sequence tag-based searching.
Multiple methods allow for the quantitation of proteins by mass spectrometry (quantitative proteomics), and recent advances have enabled quantifying thousands of proteins in single cells. Typically, stable (i.e. non-radioactive) heavier isotopes of carbon (13C) or nitrogen (15N) are incorporated into one sample while the other one is labeled with the corresponding light isotopes (e.g. 12C and 14N). The two samples are mixed before the analysis. Peptides derived from the different samples can be distinguished by their mass difference. The ratio of their peak intensities corresponds to the relative abundance ratio of the peptides (and proteins). The most popular methods for isotope labeling are SILAC (stable isotope labeling by amino acids in cell culture), trypsin-catalyzed 18O labeling, ICAT (isotope coded affinity tagging), and iTRAQ (isobaric tags for relative and absolute quantitation). "Semi-quantitative" mass spectrometry can be performed without labeling of samples. Typically, this is done with MALDI analysis (in linear mode). The peak intensity, or the peak area, from individual molecules (typically proteins) is here correlated to the amount of protein in the sample. However, the individual signal depends on the primary structure of the protein, on the complexity of the sample, and on the settings of the instrument. Other types of "label-free" quantitative mass spectrometry use the spectral counts (or peptide counts) of digested proteins as a means of determining relative protein amounts.
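The ratio calculation for an isotope-labeled pair can be sketched in Python. The example assumes a SILAC-style experiment in which the heavy sample carries lysine with six 13C atoms (a mass shift of about 6.0201 Da per lysine), and it summarizes a protein by the median of its peptide ratios. Both the label and the use of the median are illustrative assumptions, and the intensities are made-up values.

```python
from statistics import median

# Mass added by replacing six 12C atoms with 13C (6 x 1.003355 Da).
HEAVY_LYS_SHIFT = 6.020129


def heavy_partner_mz(light_mz, charge, n_labels=1, shift=HEAVY_LYS_SHIFT):
    """Expected m/z of the heavy-labeled partner of a light peptide ion."""
    return light_mz + n_labels * shift / charge


def peptide_ratios(pairs):
    """Heavy/light intensity ratio for each (light, heavy) intensity pair."""
    return [heavy / light for light, heavy in pairs if light > 0]


def protein_ratio(pairs):
    """Median heavy/light ratio over all peptide pairs of one protein."""
    ratios = peptide_ratios(pairs)
    return median(ratios) if ratios else None


if __name__ == "__main__":
    # Hypothetical (light, heavy) peak intensities for three peptides.
    pairs = [(100.0, 200.0), (50.0, 110.0), (80.0, 150.0)]
    print(heavy_partner_mz(500.0, charge=2))
    print(protein_ratio(pairs))
```

The median is used here because it is less sensitive than the mean to a single badly measured peptide pair.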
Characteristics indicative of the three-dimensional structure of proteins can be probed with mass spectrometry in various ways. By using chemical crosslinking to couple parts of the protein that are close in space but far apart in sequence, information about the overall structure can be inferred. By following the exchange of amide protons with deuterium from the solvent, it is possible to probe the solvent accessibility of various parts of the protein. Hydrogen-deuterium exchange mass spectrometry has been used to study proteins and their conformations for over 20 years. This type of protein structural analysis can be suitable for proteins that are challenging for other structural methods. Another avenue in protein structural studies is laser-induced covalent labeling. In this technique, solvent-exposed sites of the protein are modified by hydroxyl radicals. Its combination with rapid mixing has been used in protein folding studies.
In what is now commonly referred to as proteogenomics, peptides identified with mass spectrometry are used for improving gene annotations (for example, gene start sites) and protein annotations. Parallel analysis of the genome and the proteome facilitates discovery of post-translational modifications and proteolytic events, especially when comparing multiple species.
Mass spectrometry (MS) is an analytical technique that is used to measure the mass-to-charge ratio of ions. The results are typically presented as a mass spectrum, a plot of intensity as a function of the mass-to-charge ratio. Mass spectrometry is used in many different fields and is applied to pure samples as well as complex mixtures.
Tandem mass spectrometry, also known as MS/MS or MS2, is a technique in instrumental analysis where two or more mass analyzers are coupled together using an additional reaction step to increase their abilities to analyse chemical samples. A common use of tandem MS is the analysis of biomolecules, such as proteins and peptides.
Protein sequencing is the practical process of determining the amino acid sequence of all or part of a protein or peptide. This may serve to identify the protein or characterize its post-translational modifications. Typically, partial sequencing of a protein provides sufficient information to identify it with reference to databases of protein sequences derived from the conceptual translation of genes.
Peptide mass fingerprinting (PMF) is an analytical technique for protein identification in which the unknown protein of interest is first cleaved into smaller peptides, whose absolute masses can be accurately measured with a mass spectrometer such as MALDI-TOF or ESI-TOF. The method was developed in 1993 by several groups independently. The peptide masses are compared to either a database containing known protein sequences or even the genome. This is achieved by using computer programs that translate the known genome of the organism into proteins, then theoretically cut the proteins into peptides, and calculate the absolute masses of the peptides from each protein. They then compare the masses of the peptides of the unknown protein to the theoretical peptide masses of each protein encoded in the genome. The results are statistically analyzed to find the best match.
In mass spectrometry, matrix-assisted laser desorption/ionization (MALDI) is an ionization technique that uses a laser energy absorbing matrix to create ions from large molecules with minimal fragmentation. It has been applied to the analysis of biomolecules and various organic molecules, which tend to be fragile and fragment when ionized by more conventional ionization methods. It is similar in character to electrospray ionization (ESI) in that both techniques are relatively soft ways of obtaining ions of large molecules in the gas phase, though MALDI typically produces far fewer multi-charged ions.
Liquid chromatography–mass spectrometry (LC–MS) is an analytical chemistry technique that combines the physical separation capabilities of liquid chromatography with the mass analysis capabilities of mass spectrometry (MS). Coupled chromatography - MS systems are popular in chemical analysis because the individual capabilities of each technique are enhanced synergistically. While liquid chromatography separates mixtures with multiple components, mass spectrometry provides structural identity of the individual components with high molecular specificity and detection sensitivity. This tandem technique can be used to analyze biochemical, organic, and inorganic compounds commonly found in complex samples of environmental and biological origin. Therefore, LC-MS may be applied in a wide range of sectors including biotechnology, environment monitoring, food processing, and pharmaceutical, agrochemical, and cosmetic industries.
Surface-enhanced laser desorption/ionization (SELDI) is a soft ionization method in mass spectrometry (MS) used for the analysis of protein mixtures. It is a variation of matrix-assisted laser desorption/ionization (MALDI). In MALDI, the sample is mixed with a matrix material and applied to a metal plate before irradiation by a laser, whereas in SELDI, proteins of interest in a sample become bound to a surface before MS analysis. The sample surface is a key component in the purification, desorption, and ionization of the sample. SELDI is typically used with time-of-flight (TOF) mass spectrometers and is used to detect proteins in tissue samples, blood, urine, or other clinical samples. However, SELDI technology can potentially be used in any application by simply modifying the sample surface.
Soft laser desorption (SLD) is laser desorption of large molecules that results in ionization without fragmentation. "Soft" in the context of ion formation means forming ions without breaking chemical bonds. "Hard" ionization is the formation of ions with the breaking of bonds and the formation of fragment ions.
Electron-transfer dissociation (ETD) is a method of fragmenting multiply-charged gaseous macromolecules in a mass spectrometer between the stages of tandem mass spectrometry (MS/MS). Similar to electron-capture dissociation, ETD induces fragmentation of large, multiply-charged cations by transferring electrons to them. ETD is used extensively with polymers and biological molecules such as proteins and peptides for sequence analysis. Transferring an electron causes peptide backbone cleavage into c- and z-ions while leaving labile post translational modifications (PTM) intact. The technique only works well for higher charge state peptide or polymer ions (z>2). However, relative to collision-induced dissociation (CID), ETD is advantageous for the fragmentation of longer peptides or even entire proteins. This makes the technique important for top-down proteomics. The method was developed by Hunt and coworkers at the University of Virginia.
Shotgun proteomics refers to the use of bottom-up proteomics techniques in identifying proteins in complex mixtures using a combination of high performance liquid chromatography combined with mass spectrometry. The name is derived from shotgun sequencing of DNA which is itself named after the rapidly expanding, quasi-random firing pattern of a shotgun. The most common method of shotgun proteomics starts with the proteins in the mixture being digested and the resulting peptides are separated by liquid chromatography. Tandem mass spectrometry is then used to identify the peptides.
Top-down proteomics is a method of protein identification that either uses an ion trapping mass spectrometer to store an isolated protein ion for mass measurement and tandem mass spectrometry (MS/MS) analysis or other protein purification methods such as two-dimensional gel electrophoresis in conjunction with MS/MS. Top-down proteomics is capable of identifying and quantitating unique proteoforms through the analysis of intact proteins. The name is derived from the similar approach to DNA sequencing. During mass spectrometry intact proteins are typically ionized by electrospray ionization and trapped in a Fourier transform ion cyclotron resonance, quadrupole ion trap or Orbitrap mass spectrometer. Fragmentation for tandem mass spectrometry is accomplished by electron-capture dissociation or electron-transfer dissociation. Effective fractionation is critical for sample handling before mass-spectrometry-based proteomics. Proteome analysis routinely involves digesting intact proteins followed by inferred protein identification using mass spectrometry (MS). Top-down MS (non-gel) proteomics interrogates protein structure through measurement of an intact mass followed by direct ion dissociation in the gas phase.
Bottom-up proteomics is a common method to identify proteins and characterize their amino acid sequences and post-translational modifications by proteolytic digestion of proteins prior to analysis by mass spectrometry. The major alternative workflow used in proteomics is called top-down proteomics where intact proteins are purified prior to digestion and/or fragmentation either within the mass spectrometer or by 2D electrophoresis. Essentially, bottom-up proteomics is a relatively simple and reliable means of determining the protein make-up of a given sample of cells, tissues, etc.
Sample preparation for mass spectrometry is used for the optimization of a sample for analysis in a mass spectrometer (MS). Each ionization method has certain factors that must be considered for that method to be successful, such as volume, concentration, sample phase, and composition of the analyte solution. Quite possibly the most important consideration in sample preparation is knowing what phase the sample must be in for analysis to be successful. In some cases the analyte itself must be purified before entering the ion source. In other situations, the matrix, or everything in the solution surrounding the analyte, is the most important factor to consider and adjust. Often, sample preparation itself for mass spectrometry can be avoided by coupling mass spectrometry to a chromatography method, or some other form of separation before entering the mass spectrometer. In some cases, the analyte itself must be adjusted so that analysis is possible, such as in protein mass spectrometry, where usually the protein of interest is cleaved into peptides before analysis, either by in-gel digestion or by proteolysis in solution.
Laser spray ionization refers to one of several methods for creating ions using a laser interacting with a spray of neutral particles or ablating material to create a plume of charged particles. The ions thus formed can be separated by m/z with mass spectrometry. Laser spray is one of several ion sources that can be coupled with liquid chromatography-mass spectrometry for the detection of larger molecules.
In bio-informatics, a peptide-mass fingerprint or peptide-mass map is a mass spectrum of a mixture of peptides that comes from a digested protein being analyzed. The mass spectrum serves as a fingerprint in the sense that it is a pattern that can serve to identify the protein. The method for forming a peptide-mass fingerprint, developed in 1993, consists of isolating a protein, breaking it down into individual peptides, and determining the masses of the peptides through some form of mass spectrometry. Once formed, a peptide-mass fingerprint can be used to search in databases for related protein or even genomic sequences, making it a powerful tool for annotation of protein-coding genes.
Metal-coded affinity tag is a method used for quantitative proteomics by mass spectrometry that uses a metal chelate complex 1,4,7,10-tetraazacyclododecane-1,4,7,10-tetraacetate (DOTA) coupled to different lanthanide ions. The metal complexes attach to the cysteine residues of proteins in a sample.
Desorption/ionization on silicon (DIOS) is a soft laser desorption method used to generate gas-phase ions for mass spectrometry analysis. DIOS is considered the first surface-based surface-assisted laser desorption/ionization (SALDI-MS) approach. Prior approaches were accomplished using nanoparticles in a matrix of glycerol, while DIOS is a matrix-free technique in which a sample is deposited on a nanostructured surface and the sample desorbed directly from the nanostructured surface through the adsorption of laser light energy. DIOS has been used to analyze organic molecules, metabolites, biomolecules and peptides, and, ultimately, to image tissues and cells.
In mass spectrometry, matrix-assisted ionization is a low fragmentation (soft) ionization technique which involves the transfer of particles of the analyte and matrix sample from atmospheric pressure (AP) to the heated inlet tube connecting the AP region to the vacuum of the mass analyzer.
Paleoproteomics is a relatively young and rapidly growing field of molecular science in which proteomics-based sequencing technology is used to resolve species identification and evolutionary relationships of extinct taxa. While complementary to paleogenomics in application, the study of ancient proteins has the potential to reveal older, more complete phylogenies due to the relative stability of amino acids in proteins as compared to the nucleic acids of DNA. Ancient protein studies can further reveal types and sources of recovered tissues, as well as the developmental stages of fossilized specimens. Paleoproteomics can also be extended to archaeological materials such as textiles, animal skins, food remains, and pottery.