Proteinogenic amino acid

Proteinogenic amino acids are a small fraction of all amino acids

Proteinogenic amino acids are amino acids that are incorporated biosynthetically into proteins during translation. The word "proteinogenic" means "protein creating". Throughout known life, there are 22 genetically encoded (proteinogenic) amino acids, 20 in the standard genetic code and an additional 2 (selenocysteine and pyrrolysine) that can be incorporated by special translation mechanisms. [1]

In contrast, non-proteinogenic amino acids are amino acids that are either not incorporated into proteins (like GABA, L-DOPA, or triiodothyronine), misincorporated in place of a genetically encoded amino acid, or not produced directly and in isolation by standard cellular machinery (like hydroxyproline). The latter often result from post-translational modification of proteins. Some non-proteinogenic amino acids are incorporated into nonribosomal peptides, which are synthesized by nonribosomal peptide synthetases.

Both eukaryotes and prokaryotes can incorporate selenocysteine into their proteins via a nucleotide sequence known as a SECIS element, which directs the cell to translate a nearby UGA codon as selenocysteine (UGA is normally a stop codon). In some methanogenic prokaryotes, the UAG codon (normally a stop codon) can also be translated to pyrrolysine. [2]

In eukaryotes, there are only 21 proteinogenic amino acids, the 20 of the standard genetic code, plus selenocysteine. Humans can synthesize 12 of these from each other or from other molecules of intermediary metabolism. The other nine must be consumed (usually as their protein derivatives), and so they are called essential amino acids. The essential amino acids are histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine (i.e. H, I, L, K, M, F, T, W, V). [3]

The proteinogenic amino acids have been found to be related to the set of amino acids that can be recognized by ribozyme autoaminoacylation systems. [4] Thus, non-proteinogenic amino acids would have been excluded by the contingent evolutionary success of nucleotide-based life forms. Other reasons have been offered to explain why certain specific non-proteinogenic amino acids are not generally incorporated into proteins; for example, ornithine and homoserine cyclize against the peptide backbone and fragment the protein with relatively short half-lives, while others are toxic because they can be mistakenly incorporated into proteins, such as the arginine analog canavanine.

It has been suggested that certain proteinogenic amino acids were selected from the primordial soup because they are incorporated into a polypeptide chain more readily than non-proteinogenic amino acids. [5]

Structures

The following illustrates the structures and abbreviations of the 21 amino acids that are directly encoded for protein synthesis by the genetic code of eukaryotes. The structures given below are standard chemical structures, not the typical zwitterion forms that exist in aqueous solutions.

Structure of the 21 proteinogenic amino acids with three- and one-letter codes, grouped by side chain functionality

IUPAC/IUBMB now also recommends standard abbreviations for the following two amino acids: selenocysteine (Sec, U) and pyrrolysine (Pyl, O).

Chemical properties

The following table lists the one-letter symbols, the three-letter symbols, and the chemical properties of the side chains of the standard amino acids. The masses listed are weighted averages over the elemental isotopes at their natural abundances. Forming a peptide bond eliminates one molecule of water, so a protein's mass equals the sum of the masses of its constituent amino acids minus 18.01524 Da per peptide bond.
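The mass rule above can be sketched in a few lines of Python. This is only an illustration: the function name `peptide_average_mass` and the dictionary `AVG_MASS` are assumptions made for this example, and the values are the average masses of the free amino acids from the table in this section.

```python
# Average masses (Da) of the free amino acids, copied from the table in this section.
AVG_MASS = {
    "A": 89.09404, "C": 121.15404, "D": 133.10384, "E": 147.13074,
    "F": 165.19184, "G": 75.06714, "H": 155.15634, "I": 131.17464,
    "K": 146.18934, "L": 131.17464, "M": 149.20784, "N": 132.11904,
    "O": 255.31, "P": 115.13194, "Q": 146.14594, "R": 174.20274,
    "S": 105.09344, "T": 119.12034, "U": 168.053, "V": 117.14784,
    "W": 204.22844, "Y": 181.19124,
}

# Average mass (Da) of the water molecule lost per peptide bond.
WATER_AVG = 18.01524


def peptide_average_mass(sequence):
    """Return the average mass (Da) of a linear peptide given in one-letter codes.

    A chain of n residues has n - 1 peptide bonds, so n - 1 water
    molecules are subtracted from the summed amino acid masses.
    """
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    total = sum(AVG_MASS[aa] for aa in sequence)
    return total - WATER_AVG * (len(sequence) - 1)


if __name__ == "__main__":
    # Dipeptide Gly-Ala: two amino acids, one peptide bond.
    print(round(peptide_average_mass("GA"), 5))
```

The sketch treats only linear, unmodified chains; disulfide bonds and post-translational modifications would change the mass and are not handled.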

General chemical properties

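| Amino acid | Short | Abbrev. | Avg. mass (Da) | pI | pK1 (α-COO-) | pK2 (α-NH3+) |
|---|---|---|---|---|---|---|
| Alanine | A | Ala | 89.09404 | 6.01 | 2.35 | 9.87 |
| Cysteine | C | Cys | 121.15404 | 5.05 | 1.92 | 10.70 |
| Aspartic acid | D | Asp | 133.10384 | 2.85 | 1.99 | 9.90 |
| Glutamic acid | E | Glu | 147.13074 | 3.15 | 2.10 | 9.47 |
| Phenylalanine | F | Phe | 165.19184 | 5.49 | 2.20 | 9.31 |
| Glycine | G | Gly | 75.06714 | 6.06 | 2.35 | 9.78 |
| Histidine | H | His | 155.15634 | 7.60 | 1.80 | 9.33 |
| Isoleucine | I | Ile | 131.17464 | 6.05 | 2.32 | 9.76 |
| Lysine | K | Lys | 146.18934 | 9.60 | 2.16 | 9.06 |
| Leucine | L | Leu | 131.17464 | 6.01 | 2.33 | 9.74 |
| Methionine | M | Met | 149.20784 | 5.74 | 2.13 | 9.28 |
| Asparagine | N | Asn | 132.11904 | 5.41 | 2.14 | 8.72 |
| Pyrrolysine | O | Pyl | 255.31 | ? | ? | ? |
| Proline | P | Pro | 115.13194 | 6.30 | 1.95 | 10.64 |
| Glutamine | Q | Gln | 146.14594 | 5.65 | 2.17 | 9.13 |
| Arginine | R | Arg | 174.20274 | 10.76 | 1.82 | 8.99 |
| Serine | S | Ser | 105.09344 | 5.68 | 2.19 | 9.21 |
| Threonine | T | Thr | 119.12034 | 5.60 | 2.09 | 9.10 |
| Selenocysteine | U | Sec | 168.053 | 5.47 | 1.91 | 10 |
| Valine | V | Val | 117.14784 | 6.00 | 2.39 | 9.74 |
| Tryptophan | W | Trp | 204.22844 | 5.89 | 2.46 | 9.41 |
| Tyrosine | Y | Tyr | 181.19124 | 5.64 | 2.20 | 9.21 |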

Side-chain properties

| Amino acid | Short | Abbrev. | Side chain | Hydrophobic | pKa § | Polar | pH | Small | Tiny | Aromatic or Aliphatic | van der Waals volume (Å3) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Alanine | A | Ala | -CH3 | Yes | - | No | - | Yes | Yes | Aliphatic | 67 |
| Cysteine | C | Cys | -CH2SH | Yes | 8.55 | Yes | acidic | Yes | Yes | - | 86 |
| Aspartic acid | D | Asp | -CH2COOH | No | 3.67 | Yes | acidic | Yes | No | - | 91 |
| Glutamic acid | E | Glu | -CH2CH2COOH | No | 4.25 | Yes | acidic | No | No | - | 109 |
| Phenylalanine | F | Phe | -CH2C6H5 | Yes | - | No | - | No | No | Aromatic | 135 |
| Glycine | G | Gly | -H | Yes | - | No | - | Yes | Yes | - | 48 |
| Histidine | H | His | -CH2-C3H3N2 | No | 6.54 | Yes | weak basic | No | No | Aromatic | 118 |
| Isoleucine | I | Ile | -CH(CH3)CH2CH3 | Yes | - | No | - | No | No | Aliphatic | 124 |
| Lysine | K | Lys | -(CH2)4NH2 | No | 10.40 | Yes | basic | No | No | - | 135 |
| Leucine | L | Leu | -CH2CH(CH3)2 | Yes | - | No | - | No | No | Aliphatic | 124 |
| Methionine | M | Met | -CH2CH2SCH3 | Yes | - | No | - | No | No | Aliphatic | 124 |
| Asparagine | N | Asn | -CH2CONH2 | No | - | Yes | - | Yes | No | - | 96 |
| Pyrrolysine | O | Pyl | -(CH2)4NHCOC4H5NCH3 | No | N.D. | Yes | weak basic | No | No | - | ? |
| Proline | P | Pro | -CH2CH2CH2- | Yes | - | No | - | Yes | No | - | 90 |
| Glutamine | Q | Gln | -CH2CH2CONH2 | No | - | Yes | - | No | No | - | 114 |
| Arginine | R | Arg | -(CH2)3NH-C(NH)NH2 | No | 12.3 | Yes | strongly basic | No | No | - | 148 |
| Serine | S | Ser | -CH2OH | No | - | Yes | - | Yes | Yes | - | 73 |
| Threonine | T | Thr | -CH(OH)CH3 | No | - | Yes | - | Yes | No | - | 93 |
| Selenocysteine | U | Sec | -CH2SeH | No | 5.43 | No | acidic | Yes | Yes | - | ? |
| Valine | V | Val | -CH(CH3)2 | Yes | - | No | - | Yes | No | Aliphatic | 105 |
| Tryptophan | W | Trp | -CH2C8H6N | Yes | - | No | - | No | No | Aromatic | 163 |
| Tyrosine | Y | Tyr | -CH2-C6H4OH | No | 9.84 | Yes | weak acidic | No | No | Aromatic | 141 |

§: Values for Asp, Cys, Glu, His, Lys & Tyr were determined using the amino acid residue placed centrally in an alanine pentapeptide. [6] The value for Arg is from Pace et al. (2009). [7] The value for Sec is from Byun & Kang (2011). [8]

N.D.: The pKa value of Pyrrolysine has not been reported.

Note: The pKa value of an amino-acid residue in a small peptide is typically slightly different when it is inside a protein. Protein pKa calculations are sometimes used to calculate the change in the pKa value of an amino-acid residue in this situation.
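As a rough illustration of how the tabulated side-chain pKa values can be used, the sketch below applies the Henderson-Hasselbalch relation to estimate the fraction of a group that is deprotonated at a given pH. The names `SIDE_CHAIN_PKA` and `fraction_deprotonated` are assumptions for this example, and, as the note above says, the model-peptide values only approximate the pKa of a residue inside a folded protein.

```python
# Side-chain pKa values from the table above (model pentapeptide values,
# plus the separately cited values for Arg and Sec).
SIDE_CHAIN_PKA = {
    "C": 8.55, "D": 3.67, "E": 4.25, "H": 6.54,
    "K": 10.40, "R": 12.3, "U": 5.43, "Y": 9.84,
}


def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch estimate of the deprotonated fraction of a group.

    From pH = pKa + log10([A-]/[HA]), the deprotonated fraction is
    1 / (1 + 10 ** (pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    ph = 7.4  # an assumed, roughly physiological pH
    for aa, pka in sorted(SIDE_CHAIN_PKA.items()):
        print(aa, round(fraction_deprotonated(pka, ph), 4))
```

At a pH equal to the pKa the function returns exactly one half, which is the defining property of the pKa.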

Gene expression and biochemistry

| Amino acid | Short | Abbrev. | Codon(s) | Occurrence in Archaean proteins (%)& | Occurrence in Bacteria proteins (%)& | Occurrence in Eukaryote proteins (%)& | Occurrence in human proteins (%)& | Essential in humans |
|---|---|---|---|---|---|---|---|---|
| Alanine | A | Ala | GCU, GCC, GCA, GCG | 8.2 | 10.06 | 7.63 | 7.01 | No |
| Cysteine | C | Cys | UGU, UGC | 0.98 | 0.94 | 1.76 | 2.3 | Conditionally |
| Aspartic acid | D | Asp | GAU, GAC | 6.21 | 5.59 | 5.4 | 4.73 | No |
| Glutamic acid | E | Glu | GAA, GAG | 7.69 | 6.15 | 6.42 | 7.09 | Conditionally |
| Phenylalanine | F | Phe | UUU, UUC | 3.86 | 3.89 | 3.87 | 3.65 | Yes |
| Glycine | G | Gly | GGU, GGC, GGA, GGG | 7.58 | 7.76 | 6.33 | 6.58 | Conditionally |
| Histidine | H | His | CAU, CAC | 1.77 | 2.06 | 2.44 | 2.63 | Yes |
| Isoleucine | I | Ile | AUU, AUC, AUA | 7.03 | 5.89 | 5.1 | 4.33 | Yes |
| Lysine | K | Lys | AAA, AAG | 5.27 | 4.68 | 5.64 | 5.72 | Yes |
| Leucine | L | Leu | UUA, UUG, CUU, CUC, CUA, CUG | 9.31 | 10.09 | 9.29 | 9.97 | Yes |
| Methionine | M | Met | AUG | 2.35 | 2.38 | 2.25 | 2.13 | Yes |
| Asparagine | N | Asn | AAU, AAC | 3.68 | 3.58 | 4.28 | 3.58 | No |
| Pyrrolysine | O | Pyl | UAG* | 0 | 0 | 0 | 0 | No |
| Proline | P | Pro | CCU, CCC, CCA, CCG | 4.26 | 4.61 | 5.41 | 6.31 | No |
| Glutamine | Q | Gln | CAA, CAG | 2.38 | 3.58 | 4.21 | 4.77 | No |
| Arginine | R | Arg | CGU, CGC, CGA, CGG, AGA, AGG | 5.51 | 5.88 | 5.71 | 5.64 | Conditionally |
| Serine | S | Ser | UCU, UCC, UCA, UCG, AGU, AGC | 6.17 | 5.85 | 8.34 | 8.33 | No |
| Threonine | T | Thr | ACU, ACC, ACA, ACG | 5.44 | 5.52 | 5.56 | 5.36 | Yes |
| Selenocysteine | U | Sec | UGA** | 0 | 0 | 0 | >0 | No |
| Valine | V | Val | GUU, GUC, GUA, GUG | 7.8 | 7.27 | 6.2 | 5.96 | Yes |
| Tryptophan | W | Trp | UGG | 1.03 | 1.27 | 1.24 | 1.22 | Yes |
| Tyrosine | Y | Tyr | UAU, UAC | 3.35 | 2.94 | 2.87 | 2.66 | Conditionally |
| Stop codon† | - | Term | UAA, UAG, UGA†† | ? | ? | ? | | |

* UAG is normally the amber stop codon, but in organisms containing the biological machinery encoded by the pylTSBCD cluster of genes, the amino acid pyrrolysine is incorporated instead. [9]
** UGA is normally the opal (or umber) stop codon, but encodes selenocysteine if a SECIS element is present.
† The stop codon is not an amino acid, but is included for completeness.
†† UAG and UGA do not always act as stop codons (see above).
An essential amino acid cannot be synthesized in humans and must, therefore, be supplied in the diet. Conditionally essential amino acids are not normally required in the diet, but must be supplied exogenously to specific populations that do not synthesize them in adequate amounts.
& Occurrence of amino acids is based on 135 Archaea, 3775 Bacteria, and 614 Eukaryota proteomes and the human proteome (21 006 proteins), respectively. [10]
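The codon assignments in the table, including the dual role of UGA and UAG, can be illustrated with a small translation routine. This is a simplified sketch: the function `translate` and its flags `uga_is_sec` and `uag_is_pyl` are hypothetical names, and the flags merely stand in for the presence of a SECIS element or the pylTSBCD machinery, which in a real cell act in a context-dependent way rather than recoding every occurrence.

```python
# Codons for the 20 standard amino acids, as listed in the table above.
CODONS = {
    "A": "GCU GCC GCA GCG", "C": "UGU UGC", "D": "GAU GAC", "E": "GAA GAG",
    "F": "UUU UUC", "G": "GGU GGC GGA GGG", "H": "CAU CAC",
    "I": "AUU AUC AUA", "K": "AAA AAG", "L": "UUA UUG CUU CUC CUA CUG",
    "M": "AUG", "N": "AAU AAC", "P": "CCU CCC CCA CCG", "Q": "CAA CAG",
    "R": "CGU CGC CGA CGG AGA AGG", "S": "UCU UCC UCA UCG AGU AGC",
    "T": "ACU ACC ACA ACG", "V": "GUU GUC GUA GUG", "W": "UGG",
    "Y": "UAU UAC",
}
CODON_TO_AA = {codon: aa for aa, group in CODONS.items() for codon in group.split()}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna, uga_is_sec=False, uag_is_pyl=False):
    """Translate an mRNA string (reading frame starting at index 0).

    uga_is_sec: read UGA as selenocysteine (U) instead of stop.
    uag_is_pyl: read UAG as pyrrolysine (O) instead of stop.
    Translation ends at the first codon that still acts as a stop.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and uga_is_sec:
            protein.append("U")
        elif codon == "UAG" and uag_is_pyl:
            protein.append("O")
        elif codon in STOP_CODONS:
            break
        else:
            protein.append(CODON_TO_AA[codon])
    return "".join(protein)


if __name__ == "__main__":
    message = "AUGUGAGCUUAA"
    print(translate(message))                   # UGA read as stop
    print(translate(message, uga_is_sec=True))  # UGA read as selenocysteine
```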

Mass spectrometry

In mass spectrometry of peptides and proteins, knowledge of the masses of the residues is useful. The mass of the peptide or protein is the sum of the residue masses plus the mass of water (monoisotopic mass = 18.01056 Da; average mass = 18.0153 Da). The residue masses are calculated from the tabulated chemical formulas and atomic weights. [11] In mass spectrometry, ions may also include one or more protons (monoisotopic mass = 1.00728 Da; average mass* = 1.0074 Da).

* Strictly speaking, a proton has no average mass; the average value implicitly counts deuterons as an isotope, although they are a distinct species (see Hydron (chemistry)).

| Amino acid | Short | Abbrev. | Formula | Mon. mass§ (Da) | Avg. mass (Da) |
|---|---|---|---|---|---|
| Alanine | A | Ala | C3H5NO | 71.03711 | 71.0779 |
| Cysteine | C | Cys | C3H5NOS | 103.00919 | 103.1429 |
| Aspartic acid | D | Asp | C4H5NO3 | 115.02694 | 115.0874 |
| Glutamic acid | E | Glu | C5H7NO3 | 129.04259 | 129.1140 |
| Phenylalanine | F | Phe | C9H9NO | 147.06841 | 147.1739 |
| Glycine | G | Gly | C2H3NO | 57.02146 | 57.0513 |
| Histidine | H | His | C6H7N3O | 137.05891 | 137.1393 |
| Isoleucine | I | Ile | C6H11NO | 113.08406 | 113.1576 |
| Lysine | K | Lys | C6H12N2O | 128.09496 | 128.1723 |
| Leucine | L | Leu | C6H11NO | 113.08406 | 113.1576 |
| Methionine | M | Met | C5H9NOS | 131.04049 | 131.1961 |
| Asparagine | N | Asn | C4H6N2O2 | 114.04293 | 114.1026 |
| Pyrrolysine | O | Pyl | C12H19N3O2 | 237.14773 | 237.2982 |
| Proline | P | Pro | C5H7NO | 97.05276 | 97.1152 |
| Glutamine | Q | Gln | C5H8N2O2 | 128.05858 | 128.1292 |
| Arginine | R | Arg | C6H12N4O | 156.10111 | 156.1857 |
| Serine | S | Ser | C3H5NO2 | 87.03203 | 87.0773 |
| Threonine | T | Thr | C4H7NO2 | 101.04768 | 101.1039 |
| Selenocysteine | U | Sec | C3H5NOSe | 150.95364 | 150.0489 |
| Valine | V | Val | C5H9NO | 99.06841 | 99.1311 |
| Tryptophan | W | Trp | C11H10N2O | 186.07931 | 186.2099 |
| Tyrosine | Y | Tyr | C9H9NO2 | 163.06333 | 163.1733 |

§ Monoisotopic mass
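The calculation described in this section can be sketched as follows: sum the monoisotopic residue masses, add one water, and for an ion with z protons divide the protonated mass by the charge. The function names `peptide_monoisotopic_mass` and `mz` are assumptions for this example, and only positive-mode ions formed by proton addition are considered.

```python
# Monoisotopic residue masses (Da), copied from the table in this section.
RESIDUE_MONO = {
    "A": 71.03711, "C": 103.00919, "D": 115.02694, "E": 129.04259,
    "F": 147.06841, "G": 57.02146, "H": 137.05891, "I": 113.08406,
    "K": 128.09496, "L": 113.08406, "M": 131.04049, "N": 114.04293,
    "O": 237.14773, "P": 97.05276, "Q": 128.05858, "R": 156.10111,
    "S": 87.03203, "T": 101.04768, "U": 150.95364, "V": 99.06841,
    "W": 186.07931, "Y": 163.06333,
}
WATER_MONO = 18.01056   # monoisotopic mass of water (Da)
PROTON_MONO = 1.00728   # monoisotopic mass of a proton (Da)


def peptide_monoisotopic_mass(sequence):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MONO[aa] for aa in sequence) + WATER_MONO


def mz(sequence, charge):
    """Mass-to-charge ratio of the peptide carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral = peptide_monoisotopic_mass(sequence)
    return (neutral + charge * PROTON_MONO) / charge


if __name__ == "__main__":
    for z in (1, 2):
        print(z, round(mz("GA", z), 5))
```

Because leucine and isoleucine share a residue mass, sequences that differ only in L versus I give identical results here.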

Stoichiometry and metabolic cost in cell

The table below lists the abundance of amino acids in E. coli cells and the metabolic cost (ATP) of synthesizing them. Negative numbers indicate that the metabolic process is energetically favorable and does not cost the cell net ATP. [12] The abundance figures include amino acids in both free and polymerized form (proteins).

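| Amino acid | Short | Abbrev. | Abundance (# of molecules (×10^8) per E. coli cell) | ATP cost in synthesis, aerobic conditions | ATP cost in synthesis, anaerobic conditions |
|---|---|---|---|---|---|
| Alanine | A | Ala | 2.9 | -1 | 1 |
| Cysteine | C | Cys | 0.52 | 11 | 15 |
| Aspartic acid | D | Asp | 1.4 | 0 | 2 |
| Glutamic acid | E | Glu | 1.5 | -7 | -1 |
| Phenylalanine | F | Phe | 1.1 | -6 | 2 |
| Glycine | G | Gly | 3.5 | -2 | 2 |
| Histidine | H | His | 0.54 | 1 | 7 |
| Isoleucine | I | Ile | 1.7 | 7 | 11 |
| Lysine | K | Lys | 2.0 | 5 | 9 |
| Leucine | L | Leu | 2.6 | -9 | 1 |
| Methionine | M | Met | 0.88 | 21 | 23 |
| Asparagine | N | Asn | 1.4 | 3 | 5 |
| Pyrrolysine | O | Pyl | - | - | - |
| Proline | P | Pro | 1.3 | -2 | 4 |
| Glutamine | Q | Gln | 1.5 | -6 | 0 |
| Arginine | R | Arg | 1.7 | 5 | 13 |
| Serine | S | Ser | 1.2 | -2 | 2 |
| Threonine | T | Thr | 1.5 | 6 | 8 |
| Selenocysteine | U | Sec | - | - | - |
| Valine | V | Val | 2.4 | -2 | 2 |
| Tryptophan | W | Trp | 0.33 | -7 | 7 |
| Tyrosine | Y | Tyr | 0.79 | -8 | 2 |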
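One way to read the table is to weight each per-molecule ATP cost by the abundance of that amino acid, giving a rough per-cell tally. The sketch below does only that arithmetic; the names `COSTS` and `total_atp_cost` are assumptions for this example, pyrrolysine and selenocysteine are omitted because the table gives no values for them, and the result should not be taken as a full energy budget of the cell.

```python
# (abundance in 1e8 molecules per E. coli cell, aerobic ATP cost, anaerobic ATP cost),
# transcribed from the table above.
COSTS = {
    "A": (2.9, -1, 1),   "C": (0.52, 11, 15), "D": (1.4, 0, 2),
    "E": (1.5, -7, -1),  "F": (1.1, -6, 2),   "G": (3.5, -2, 2),
    "H": (0.54, 1, 7),   "I": (1.7, 7, 11),   "K": (2.0, 5, 9),
    "L": (2.6, -9, 1),   "M": (0.88, 21, 23), "N": (1.4, 3, 5),
    "P": (1.3, -2, 4),   "Q": (1.5, -6, 0),   "R": (1.7, 5, 13),
    "S": (1.2, -2, 2),   "T": (1.5, 6, 8),    "V": (2.4, -2, 2),
    "W": (0.33, -7, 7),  "Y": (0.79, -8, 2),
}


def total_atp_cost(aerobic=True):
    """Abundance-weighted ATP cost, in units of 1e8 ATP per cell."""
    index = 1 if aerobic else 2
    return sum(entry[0] * entry[index] for entry in COSTS.values())


if __name__ == "__main__":
    print("aerobic:  ", round(total_atp_cost(aerobic=True), 2))
    print("anaerobic:", round(total_atp_cost(aerobic=False), 2))
```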

Remarks

| Amino acid | Short | Abbrev. | Remarks |
|---|---|---|---|
| Alanine | A | Ala | Very abundant and very versatile, it is stiffer than glycine, but small enough to pose only small steric limits on the protein conformation. It behaves fairly neutrally, and can be located both in hydrophilic regions on the outside of the protein and in hydrophobic areas inside. |
| Asparagine or aspartic acid | B | Asx | A placeholder when either amino acid may occupy a position. |
| Cysteine | C | Cys | The sulfur atom bonds readily to heavy metal ions. Under oxidizing conditions, two cysteines can join in a disulfide bond to form the amino acid cystine. When cystines are part of a protein, insulin for example, the tertiary structure is stabilized, which makes the protein more resistant to denaturation; therefore, disulfide bonds are common in proteins that have to function in harsh environments, including digestive enzymes (e.g., pepsin and chymotrypsin) and structural proteins (e.g., keratin). Disulfides are also found in peptides too small to hold a stable shape on their own (e.g., insulin). |
| Aspartic acid | D | Asp | Asp behaves similarly to glutamic acid, and carries a hydrophilic acidic group with a strong negative charge. Usually, it is located on the outer surface of the protein, making it water-soluble. It binds to positively charged molecules and ions, and is often used in enzymes to fix the metal ion. When located inside the protein, aspartate and glutamate are usually paired with arginine and lysine. |
| Glutamic acid | E | Glu | Glu behaves similarly to aspartic acid, and has a longer, slightly more flexible side chain. |
| Phenylalanine | F | Phe | Essential for humans. Phenylalanine, tyrosine, and tryptophan contain a large, rigid aromatic group on the side chain. These are the biggest amino acids. Like isoleucine, leucine, and valine, they are hydrophobic and tend to orient towards the interior of the folded protein molecule. Phenylalanine can be converted into tyrosine. |
| Glycine | G | Gly | Because of the two hydrogen atoms at the α carbon, glycine is not optically active. It is the smallest amino acid, rotates easily, and adds flexibility to the protein chain. It is able to fit into the tightest spaces, e.g., the triple helix of collagen. Because too much flexibility is usually not desired, it is less common than alanine as a structural component. |
| Histidine | H | His | His is essential for humans. In even slightly acidic conditions, protonation of the nitrogen occurs, changing the properties of histidine and the polypeptide as a whole. It is used by many proteins as a regulatory mechanism, changing the conformation and behavior of the polypeptide in acidic regions such as the late endosome or lysosome, enforcing conformation change in enzymes. However, only a few histidines are needed for this, so it is comparatively scarce. |
| Isoleucine | I | Ile | Ile is essential for humans. Isoleucine, leucine, and valine have large aliphatic hydrophobic side chains. Their molecules are rigid, and their mutual hydrophobic interactions are important for the correct folding of proteins, as these chains tend to be located inside the protein molecule. |
| Leucine or isoleucine | J | Xle | A placeholder when either amino acid may occupy a position. |
| Lysine | K | Lys | Lys is essential for humans, and behaves similarly to arginine. It contains a long, flexible side chain with a positively charged end. The flexibility of the chain makes lysine and arginine suitable for binding to molecules with many negative charges on their surfaces. For example, DNA-binding proteins have their active regions rich in arginine and lysine. The strong charge makes these two amino acids prone to be located on the outer hydrophilic surfaces of proteins; when they are found inside, they are usually paired with a corresponding negatively charged amino acid, e.g., aspartate or glutamate. |
| Leucine | L | Leu | Leu is essential for humans, and behaves similarly to isoleucine and valine. |
| Methionine | M | Met | Met is essential for humans. Always the first amino acid to be incorporated into a protein, it is sometimes removed after translation. Like cysteine, it contains sulfur, but with a methyl group instead of hydrogen. This methyl group can be activated, and is used in many reactions where a new carbon atom is being added to another molecule. |
| Asparagine | N | Asn | Similar to aspartic acid, Asn contains an amide group where Asp has a carboxyl. |
| Pyrrolysine | O | Pyl | Similar to lysine, but it has a pyrroline ring attached. |
| Proline | P | Pro | Pro contains an unusual ring to the N-end amine group, which forces the CO-NH amide sequence into a fixed conformation. It can disrupt protein folding structures like the α helix or β sheet, forcing the desired kink in the protein chain. Common in collagen, it often undergoes a post-translational modification to hydroxyproline. |
| Glutamine | Q | Gln | Similar to glutamic acid, Gln contains an amide group where Glu has a carboxyl. Used in proteins and as a storage for ammonia, it is the most abundant amino acid in the body. |
| Arginine | R | Arg | Functionally similar to lysine. |
| Serine | S | Ser | Serine and threonine have a short side chain ending in a hydroxyl group. Its hydrogen is easy to remove, so serine and threonine often act as hydrogen donors in enzymes. Both are very hydrophilic, so the outer regions of soluble proteins tend to be rich in them. |
| Threonine | T | Thr | Essential for humans, Thr behaves similarly to serine. |
| Selenocysteine | U | Sec | The selenium analog of cysteine, in which selenium replaces the sulfur atom. |
| Valine | V | Val | Essential for humans, Val behaves similarly to isoleucine and leucine. |
| Tryptophan | W | Trp | Essential for humans, Trp behaves similarly to phenylalanine and tyrosine. It is a precursor of serotonin and is naturally fluorescent. |
| Unknown | X | Xaa | Placeholder when the amino acid is unknown or unimportant. |
| Tyrosine | Y | Tyr | Tyr behaves similarly to phenylalanine (precursor to tyrosine) and tryptophan, and is a precursor of melanin, epinephrine, and thyroid hormones. Naturally fluorescent, its fluorescence is usually quenched by energy transfer to tryptophans. |
| Glutamic acid or glutamine | Z | Glx | A placeholder when either amino acid may occupy a position. |

Amino acid catabolism

Catabolism

Amino acids can be classified according to the properties of their main products: [13]

References

  1. Ambrogelly A, Palioura S, Söll D (January 2007). "Natural expansion of the genetic code". Nature Chemical Biology. 3 (1): 29–35. doi:10.1038/nchembio847. PMID 17173027.
  2. Lobanov AV, Turanov AA, Hatfield DL, Gladyshev VN (August 2010). "Dual functions of codons in the genetic code". Critical Reviews in Biochemistry and Molecular Biology. 45 (4): 257–65. doi:10.3109/10409231003786094. PMC 3311535. PMID 20446809.
  3. Young VR (August 1994). "Adult amino acid requirements: the case for a major revision in current recommendations" (PDF). The Journal of Nutrition. 124 (8 Suppl): 1517S–1523S. doi:10.1093/jn/124.suppl_8.1517S. PMID 8064412.
  4. Erives A (August 2011). "A model of proto-anti-codon RNA enzymes requiring L-amino acid homochirality". Journal of Molecular Evolution. 73 (1–2): 10–22. Bibcode:2011JMolE..73...10E. doi:10.1007/s00239-011-9453-4. PMC 3223571. PMID 21779963.
  5. Frenkel-Pinter, Moran; Haynes, Jay W.; C, Martin; Petrov, Anton S.; Burcar, Bradley T.; Krishnamurthy, Ramanarayanan; Hud, Nicholas V.; Leman, Luke J.; Williams, Loren Dean (2019-08-13). "Selective incorporation of proteinaceous over nonproteinaceous cationic amino acids in model prebiotic oligomerization reactions". Proceedings of the National Academy of Sciences. 116 (33): 16338–16346. Bibcode:2019PNAS..11616338F. doi:10.1073/pnas.1904849116. ISSN 0027-8424. PMC 6697887. PMID 31358633.
  6. Thurlkill RL, Grimsley GR, Scholtz JM, Pace CN (May 2006). "pK values of the ionizable groups of proteins". Protein Science. 15 (5): 1214–8. doi:10.1110/ps.051840806. PMC 2242523. PMID 16597822.
  7. Pace CN, Grimsley GR, Scholtz JM (May 2009). "Protein ionizable groups: pK values and their contribution to protein stability and solubility". The Journal of Biological Chemistry. 284 (20): 13285–9. doi:10.1074/jbc.R800080200. PMC 2679426. PMID 19164280.
  8. Byun BJ, Kang YK (May 2011). "Conformational preferences and pK(a) value of selenocysteine residue". Biopolymers. 95 (5): 345–53. doi:10.1002/bip.21581. PMID 21213257. S2CID 11002236.
  9. Rother M, Krzycki JA (August 2010). "Selenocysteine, pyrrolysine, and the unique energy metabolism of methanogenic archaea". Archaea. 2010: 1–14. doi:10.1155/2010/453642. PMC 2933860. PMID 20847933.
  10. Kozlowski LP (January 2017). "Proteome-pI: proteome isoelectric point database". Nucleic Acids Research. 45 (D1): D1112–D1116. doi:10.1093/nar/gkw978. PMC 5210655. PMID 27789699.
  11. "Atomic Weights and Isotopic Compositions for All Elements". NIST. Retrieved 2016-12-12.
  12. Phillips R, Kondev J, Theriot J, Garcia HG, Orme N (2013). Physical biology of the cell (Second ed.). Garland Science. p. 178. ISBN 978-0-8153-4450-6.
  13. Ferrier DR (2005). "Chapter 20: Amino Acid Degradation and Synthesis". In Champe PC, Harvey RA, Ferrier DR (eds.). Lippincott's Illustrated Reviews: Biochemistry (Lippincott's Illustrated Reviews). Hagerstown, MD: Lippincott Williams & Wilkins. ISBN 978-0-7817-2265-0.
