Hydrophilicity plot


A hydrophilicity plot is a quantitative analysis of the degree of hydrophobicity or hydrophilicity of the amino acids of a protein. It is used to characterize or identify possible structures or domains of a protein.

In analytical chemistry, quantitative analysis is the determination of the absolute or relative abundance of one, several or all particular substance(s) present in a sample.

Amino acid: organic compounds containing amine and carboxylic groups

Amino acids are organic compounds containing amine (-NH2) and carboxyl (-COOH) functional groups, along with a side chain (R group) specific to each amino acid. The key elements of an amino acid are carbon (C), hydrogen (H), oxygen (O), and nitrogen (N), although other elements are found in the side chains of certain amino acids. About 500 naturally occurring amino acids are known (though only 20 appear in the genetic code) and can be classified in many ways. They can be classified according to the core structural functional groups' locations as alpha- (α-), beta- (β-), gamma- (γ-) or delta- (δ-) amino acids; other categories relate to polarity, pH level, and side chain group type (aliphatic, acyclic, aromatic, containing hydroxyl or sulfur, etc.). In the form of proteins, amino acid residues form the second-largest component (water is the largest) of human muscles and other tissues. Beyond their role as residues in proteins, amino acids participate in a number of processes such as neurotransmitter transport and biosynthesis.

Protein: biological molecule consisting of chains of amino acid residues

Proteins are large biomolecules, or macromolecules, consisting of one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, responding to stimuli, providing structure to cells and organisms, and transporting molecules from one location to another. Proteins differ from one another primarily in their sequence of amino acids, which is dictated by the nucleotide sequence of their genes, and which usually results in protein folding into a specific three-dimensional structure that determines its activity.

The plot has the amino acid sequence of a protein on its x-axis and the degree of hydrophobicity or hydrophilicity on its y-axis. There are a number of methods to measure the degree of interaction of polar solvents such as water with specific amino acids. For instance, the Kyte-Doolittle scale assigns its highest scores to hydrophobic amino acids, whereas the Hopp-Woods scale measures hydrophilic residues.

Analyzing the shape of the plot gives information about the partial structure of the protein. For instance, if a stretch of about 20 amino acids scores positive for hydrophobicity, these amino acids may be part of an alpha helix spanning a lipid bilayer, whose interior is composed of hydrophobic fatty acid chains. Conversely, amino acids with high hydrophilicity are likely to be in contact with the solvent, or water, and therefore to reside on the outer surface of the protein.
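As a rough sketch of this reading of the plot, the Python snippet below slides a window along a sequence and reports the merged stretches whose average Kyte-Doolittle score exceeds a cutoff. The 19-residue window, the 1.6 cutoff (a value commonly quoted for this scale), the function name and the toy sequence are all assumptions made for illustration. A hit is only a hint that a segment might span a membrane, not evidence that it does.

```python
# Kyte-Doolittle hydropathy scores (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydrophobic_stretches(sequence, window=19, cutoff=1.6):
    """Return 1-based (start, end) spans covered by hydrophobic windows.

    A window counts as hydrophobic when its mean score is above `cutoff`.
    Overlapping or adjacent hydrophobic windows are merged into one span.
    Both defaults are assumptions for this sketch, not fixed rules.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    spans = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean <= cutoff:
            continue
        first, last = start + 1, start + window
        if spans and first <= spans[-1][1] + 1:
            # Extend the previous span instead of opening a new one.
            spans[-1] = (spans[-1][0], last)
        else:
            spans.append((first, last))
    return spans


# Toy sequence (hypothetical): a leucine-rich core flanked by lysines.
toy = "K" * 10 + "L" * 19 + "K" * 10
print(hydrophobic_stretches(toy))
```

Because every window that is mostly hydrophobic passes the cutoff, the reported span is usually a few residues wider than the hydrophobic core itself.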

Lipid bilayer

The lipid bilayer is a thin polar membrane made of two layers of lipid molecules. These membranes are flat sheets that form a continuous barrier around all cells. The cell membranes of almost all organisms and many viruses are made of a lipid bilayer, as are the nuclear membrane surrounding the cell nucleus and other membranes surrounding sub-cellular structures. The lipid bilayer is the barrier that keeps ions, proteins and other molecules where they are needed and prevents them from diffusing into areas where they should not be. Lipid bilayers are ideally suited to this role: even though they are only a few nanometers in width, they are impermeable to most water-soluble (hydrophilic) molecules. Bilayers are particularly impermeable to ions, which allows cells to regulate salt concentrations and pH by transporting ions across their membranes using proteins called ion pumps.

Kyte-Doolittle hydropathy plot for the human RET proto-oncogene. The plot was created using the ExPASy ProtScale tool (http://web.expasy.org/protscale/).
Hopp-Woods hydropathy plot for the human RET proto-oncogene. The plot was created using the ExPASy ProtScale tool (http://web.expasy.org/protscale/).
Amino Acid Hydropathy Scores [1]

Amino acid       One-letter code   Hydropathy score
Isoleucine       I                  4.5
Valine           V                  4.2
Leucine          L                  3.8
Phenylalanine    F                  2.8
Cysteine         C                  2.5
Methionine       M                  1.9
Alanine          A                  1.8
Glycine          G                 -0.4
Threonine        T                 -0.7
Serine           S                 -0.8
Tryptophan       W                 -0.9
Tyrosine         Y                 -1.3
Proline          P                 -1.6
Histidine        H                 -3.2
Glutamic acid    E                 -3.5
Glutamine        Q                 -3.5
Aspartic acid    D                 -3.5
Asparagine       N                 -3.5
Lysine           K                 -3.9
Arginine         R                 -4.5

The scores in the table above form the Kyte-Doolittle hydropathy scale [1], which is based on data collected from the literature. The scale is used by a computer program that evaluates the average hydropathy of successive segments within a protein.
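The segment averaging mentioned above can be sketched in Python as follows. This is a minimal illustration rather than the original program: the function name `hydropathy_profile`, the default window length of 9 and the example peptide are assumptions chosen for the sketch, and only the scores are taken from the table.

```python
# Kyte-Doolittle hydropathy scores, copied from the table above.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_profile(sequence, window=9):
    """Return (center_position, mean_score) pairs, one per full window.

    Positions are 1-based and refer to the residue at the window center.
    The window length is an assumed default; it must be odd so that each
    window has a well-defined central residue.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    half = window // 2
    profile = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        profile.append((start + half + 1, mean))
    return profile


# Hypothetical peptide, used only to show the call.
for position, score in hydropathy_profile("MKLLIVAGSDEKR", window=5):
    print(position, round(score, 2))
```

Plotting the first element of each pair on the x-axis and the second on the y-axis gives the hydropathy plot. A longer window smooths the curve and suits a search for long hydrophobic segments, while a shorter one keeps more local detail.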

Related Research Articles

Alpha helix: type of secondary structure

The alpha helix (α-helix) is a common motif in the secondary structure of proteins. It is a right-handed helix conformation in which every backbone N−H group donates a hydrogen bond to the backbone C=O group of the amino acid located three or four residues earlier along the protein sequence.

Protein folding: the process of assisting in the covalent and noncovalent assembly of single chain polypeptides or multisubunit complexes into the correct tertiary structure

Protein folding is the physical process by which a protein chain acquires its native three-dimensional structure, a conformation that is usually biologically functional, in an expeditious and reproducible manner. Each protein exists as an unfolded polypeptide or random coil when translated from a sequence of mRNA to a linear chain of amino acids. This polypeptide lacks any stable (long-lasting) three-dimensional structure. As the polypeptide chain is being synthesized by a ribosome, the linear chain begins to fold into its three-dimensional structure, so folding begins even during translation. Amino acids interact with each other to produce a well-defined three-dimensional structure, the folded protein, known as the native state. The resulting three-dimensional structure is determined by the amino acid sequence, or primary structure.

Hexokinase: enzyme

A hexokinase is an enzyme that phosphorylates hexoses, forming hexose phosphate. In most organisms, glucose is the most important substrate of hexokinases, and glucose-6-phosphate is the most important product. Hexokinase possesses the ability to transfer an inorganic phosphate group from ATP to a substrate.

Protein structure prediction

Protein structure prediction is the inference of the three-dimensional structure of a protein from its amino acid sequence—that is, the prediction of its folding and its secondary and tertiary structure from its primary structure. Structure prediction is fundamentally different from the inverse problem of protein design. Protein structure prediction is one of the most important goals pursued by bioinformatics and theoretical chemistry; it is highly important in medicine and biotechnology. Every two years, the performance of current methods is assessed in the CASP experiment. A continuous evaluation of protein structure prediction web servers is performed by the community project CAMEO3D.

In bioinformatics and evolutionary biology, a substitution matrix describes the rate at which one character in a sequence changes to other character states over time. Substitution matrices are usually seen in the context of amino acid or DNA sequence alignments, where the similarity between sequences depends on their divergence time and the substitution rates as represented in the matrix.

Proteinogenic amino acid: amino acid that is incorporated biosynthetically into proteins during translation

Proteinogenic amino acids are amino acids that are incorporated biosynthetically into proteins during translation. The word "proteinogenic" means "protein creating". Throughout known life, there are 22 genetically encoded (proteinogenic) amino acids, 20 in the standard genetic code and an additional 2 that can be incorporated by special translation mechanisms.

Salting out is an effect based on the interaction between electrolytes and non-electrolytes, in which the non-electrolyte can become less soluble at high salt concentrations. It is used as a method of purification for proteins, as well as for preventing protein denaturation caused by excessively dilute samples during experiments. The salt concentration needed for the protein to precipitate out of the solution differs from protein to protein. This process is also used to concentrate dilute solutions of proteins. Dialysis can be used to remove the salt if needed.

Lattice proteins are highly simplified computer models of proteins which are used to investigate protein folding.

Leucine zipper

A leucine zipper is a common three-dimensional structural motif in proteins. Leucine zippers were first described by Landschulz and collaborators in 1988, when they found that an enhancer binding protein had a very characteristic 30-amino acid segment and that displaying this segment on an idealized alpha helix revealed a periodic repetition of leucine residues at every seventh position over a distance covering eight helical turns. The polypeptide segments containing these periodic arrays of leucine residues were proposed to exist in an alpha-helical conformation, with the leucine side chains from one alpha helix interdigitating with those from the alpha helix of a second polypeptide, facilitating dimerization.
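The periodic pattern described above can be checked with a short Python sketch that looks for leucines recurring at every seventh position. The function name, the minimum of four repeats and the toy sequence are assumptions made for illustration, and a match alone does not establish that a segment forms a leucine zipper.

```python
def leucine_heptad_runs(sequence, min_repeats=4):
    """Return 1-based positions where a run of leucines spaced 7 apart begins.

    A run is reported once, at its first leucine, if it contains at least
    `min_repeats` leucines. The threshold is an assumed value for this sketch.
    """
    seq = sequence.upper()
    starts = []
    for i, residue in enumerate(seq):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run begun 7 residues earlier.
        if i >= 7 and seq[i - 7] == "L":
            continue
        repeats = 1
        j = i + 7
        while j < len(seq) and seq[j] == "L":
            repeats += 1
            j += 7
        if repeats >= min_repeats:
            starts.append(i + 1)
    return starts


# Toy sequence (hypothetical): leucine at every seventh position, alanine elsewhere.
toy = "LAAAAAA" * 4
print(leucine_heptad_runs(toy))
```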

Amphiphile

An amphiphile is a chemical compound possessing both hydrophilic and lipophilic (fat-loving) properties. Such a compound is called amphiphilic or amphipathic. This forms the basis for a number of areas of research in chemistry and biochemistry, notably that of lipid polymorphism. Organic compounds containing hydrophilic groups at both ends of a prolate molecule are called bolaamphiphilic. Common amphiphilic substances are soaps, detergents and lipoproteins.

Solvent exposure occurs when a chemical, material, or person comes into contact with a solvent. Chemicals can be dissolved in solvents, materials such as polymers can be broken down chemically by solvents, and people can develop certain ailments from exposure to solvents both organic and inorganic.

SOSUI is a free online tool that predicts a part of the secondary structure of proteins from a given amino acid sequence (AAS). The main objective is to determine whether the protein in question is a soluble or a transmembrane protein.

Beta barrel: protein domain

A beta barrel is a beta-sheet composed of tandem repeats that twists and coils to form a closed toroidal structure in which the first strand is bonded to the last strand. Beta-strands in many beta-barrels are arranged in an antiparallel fashion. Beta barrel structures are named for their resemblance to the barrels used to contain liquids. Most of them are water-soluble proteins and frequently bind hydrophobic ligands in the barrel center, as in lipocalins. Others span cell membranes and are commonly found in porins. Porin-like barrel structures are encoded by as many as 2–3% of the genes in Gram-negative bacteria.

Reversed-phase chromatography includes any chromatographic method that uses a hydrophobic stationary phase. RPC refers to liquid chromatography.

Hydrophobic collapse is a proposed process for the production of the 3-D conformation adopted by polypeptides in polar solvents. The theory states that the nascent polypeptide forms initial secondary structure creating localized regions of predominantly hydrophobic residues. The polypeptide interacts with water, thus placing thermodynamic pressures on these regions which then aggregate or "collapse" into a tertiary conformation with a hydrophobic core. Incidentally, polar residues interact favourably with water, thus the solvent-facing surface of the peptide is usually composed of predominantly hydrophilic regions.

Protein precipitation is widely used in downstream processing of biological products in order to concentrate proteins and purify them from various contaminants. For example, in the biotechnology industry protein precipitation is used to eliminate contaminants commonly contained in blood. The underlying mechanism of precipitation is to alter the solvation potential of the solvent, more specifically, by lowering the solubility of the solute by addition of a reagent.

Helical wheel

A helical wheel is a type of plot or visual representation used to illustrate the properties of alpha helices in proteins.

Hydrophobicity scales are values that define the relative hydrophobicity of amino acid residues. The more positive the value, the more hydrophobic are the amino acids located in that region of the protein. These scales are commonly used to predict the transmembrane alpha-helices of membrane proteins. When the amino acids of a protein are measured consecutively, changes in value indicate attraction of specific protein regions towards the hydrophobic region inside the lipid bilayer.

The Hopp–Woods hydrophilicity scale of amino acids is a method of ranking the amino acids in a protein according to their water solubility in order to search for surface locations on proteins, and especially those locations that tend to form strong interactions with other macromolecules such as proteins, DNA, and RNA.

Volume, Area, Dihedral Angle Reporter (VADAR) is a freely available protein structure validation web server that was developed as a collaboration between Dr. Brian Sykes and Dr. David Wishart at the University of Alberta. VADAR consists of more than 15 different algorithms and programs for assessing and validating peptide and protein structures from their PDB coordinate data. VADAR is capable of determining secondary structure, identifying and classifying six different types of beta turns, determining and calculating the strength of C=O to N-H hydrogen bonds, calculating residue-specific accessible surface areas (ASA), calculating residue volumes, determining backbone and side chain torsion angles, assessing local structure quality, evaluating global structure quality and identifying residue “outliers”. The results have been validated through extensive comparison to published data and careful visual inspection. VADAR produces both text and graphical output, with most of the quantitative data presented in easily viewed tables. In particular, VADAR’s output is presented in a vertical, tabular format, with most of the sequence data, residue numbering and any other calculated property or feature presented from top to bottom rather than from left to right.

References

  1. Kyte, J.; Doolittle, R. F. (1982). "A simple method for displaying the hydropathic character of a protein". Journal of Molecular Biology. 157 (1): 105–32. doi:10.1016/0022-2836(82)90515-0. PMID 7108955.