GNC hypothesis

The GNC hypothesis, or GNC-SNS primeval genetic code hypothesis, is a hypothesis about the origin of genes. It proposes that the universal genetic code originated not from a three-amino acid system but from a four-amino acid system: the GNC code encoding [GADV]-proteins, which it regards as the most primitive genetic code. The hypothesis was first proposed by Kenji Ikehara at Nara Women's University.

Details

While almost all organisms on Earth share the universal genetic code, the GNC hypothesis argues that two primeval genetic codes preceded the present genetic code: first the GNC code, then the SNS code.
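The codon patterns named by the hypothesis can be expanded mechanically. The sketch below assumes the usual IUPAC reading of the pattern letters (N = any base, S = G or C), which this article does not itself spell out, and translates each codon with the standard genetic code. The names `expand` and `encoded_amino_acids` are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumed IUPAC meanings of the pattern letters (not stated in the article).
IUPAC = {"G": "G", "C": "C", "N": "UCAG", "S": "CG"}


def expand(pattern):
    """List every concrete codon matched by a three-letter pattern."""
    return ["".join(c) for c in product(*(IUPAC[ch] for ch in pattern))]


def encoded_amino_acids(pattern):
    """Set of one-letter amino acid codes the pattern's codons encode."""
    return {STANDARD_CODE[codon] for codon in expand(pattern)}


gnc = encoded_amino_acids("GNC")
sns = encoded_amino_acids("SNS")
print(len(expand("GNC")), sorted(gnc))
print(len(expand("SNS")), sorted(sns))
```

Under these assumptions, the four GNC codons map to G, A, D and V, and the sixteen SNS codons map to ten amino acids that include those four.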

The GNC hypothesis is based on the following facts:

Ikehara, Kenji; Omori, Yoko; Arai, Rieko; Hirose, Akiko (2002). "A Novel Theory on the Origin of the Genetic Code: A GNC-SNS Hypothesis". Journal of Molecular Evolution. 54 (4): 530–538. Bibcode:2002JMolE..54..530I. doi:10.1007/s00239-001-0053-6. PMID 11956691.

Related Research Articles

Amino acid Organic compounds containing amine and carboxylic groups

Amino acids are organic compounds that contain amino and carboxylic acid functional groups, along with a side chain specific to each amino acid. The elements present in every amino acid are carbon (C), hydrogen (H), oxygen (O), and nitrogen (N) (CHON); in addition, sulfur (S) is present in the side chains of cysteine and methionine, and selenium (Se) in the less common amino acid selenocysteine. As of 2020, more than 500 naturally occurring amino acids are known to constitute monomer units of peptides, including proteins, although only 22 appear in the genetic code. Of these, 20 have their own designated codons and 2 have special coding mechanisms: selenocysteine, which is present in all eukaryotes, and pyrrolysine, which is present in some prokaryotes.

Glycine Amino acid

Glycine (symbol Gly or G) is an amino acid that has a single hydrogen atom as its side chain. It is the simplest stable amino acid (carbamic acid is unstable), with the chemical formula NH2‐CH2‐COOH. Glycine is one of the proteinogenic amino acids. It is encoded by all the codons starting with GG (GGU, GGC, GGA, GGG). Glycine is integral to the formation of alpha-helices in secondary protein structure due to its compact form. For the same reason, it is the most abundant amino acid in collagen triple-helices. Glycine is also an inhibitory neurotransmitter; interference with its release within the spinal cord (such as during a Clostridium tetani infection) can cause spastic paralysis due to uninhibited muscle contraction.

Alanine α-amino acid that is used in the biosynthesis of proteins

Alanine (symbol Ala or A), or α-alanine, is an α-amino acid that is used in the biosynthesis of proteins. It contains an amine group and a carboxylic acid group, both attached to the central carbon atom, which also carries a methyl group side chain. Consequently, its IUPAC systematic name is 2-aminopropanoic acid, and it is classified as a nonpolar, aliphatic α-amino acid. Under biological conditions, it exists in its zwitterionic form, with its amine group protonated (as −NH3+) and its carboxyl group deprotonated (as −CO2−). It is non-essential to humans, as it can be synthesised metabolically and does not need to be present in the diet. It is encoded by all codons starting with GC (GCU, GCC, GCA, and GCG).

Proteinogenic amino acid Amino acid that is incorporated biosynthetically into proteins during translation

Proteinogenic amino acids are amino acids that are incorporated biosynthetically into proteins during translation. The word "proteinogenic" means "protein creating". Throughout known life, there are 22 genetically encoded (proteinogenic) amino acids, 20 in the standard genetic code and an additional 2 that can be incorporated by special translation mechanisms.

Non-proteinogenic amino acids Are not naturally encoded in the genome

In biochemistry, non-coded or non-proteinogenic amino acids are distinct from the 22 proteinogenic amino acids which are naturally encoded in the genome of organisms for the assembly of proteins. However, over 140 non-proteinogenic amino acids occur naturally in proteins and thousands more may occur in nature or be synthesized in the laboratory. Many non-proteinogenic amino acids are important:

The yeast mitochondrial code is a genetic code used by the mitochondrial genome of yeasts, notably Saccharomyces cerevisiae, Candida glabrata, Hansenula saturnus, and Kluyveromyces thermotolerans.

The mold, protozoan, and coelenterate mitochondrial code and the mycoplasma/spiroplasma code is the genetic code used by various organisms, in some cases with slight variations, notably the use of UGA as a tryptophan codon rather than a stop codon.
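A codon reassignment of this kind can be modelled as a small override on top of the standard table. The following is a minimal sketch, not a full implementation of this code: it only models the UGA reassignment described above and ignores other details such as alternative start codons. The helper names `variant_code` and `translate` and the example message are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def variant_code(reassignments):
    """Copy the standard table and apply codon -> amino acid overrides."""
    code = dict(STANDARD_CODE)
    code.update(reassignments)
    return code


def translate(rna, code):
    """Translate an RNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = code[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# UGA read as tryptophan (W) instead of stop, as described in the text.
uga_trp = variant_code({"UGA": "W"})

message = "AUGUGAGGCUAA"  # made-up example: AUG UGA GGC UAA
print(translate(message, STANDARD_CODE))  # M   (UGA terminates translation)
print(translate(message, uga_trp))        # MWG (UGA read through as W)
```

Keeping the variant as a dictionary of overrides makes the difference from the standard code explicit, so other reassignments can be expressed the same way.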

The invertebrate mitochondrial code is a genetic code used by the mitochondrial genome of invertebrates.

GADV-protein world is a hypothetical stage of abiogenesis. GADV stands for the one-letter codes of four amino acids, namely glycine (G), alanine (A), aspartic acid (D) and valine (V), the main components of GADV proteins. In the GADV-protein world hypothesis, it is argued that the prebiotic chemistry before the emergence of genes involved a stage where GADV-proteins were able to pseudo-replicate. This hypothesis is contrary to the RNA world hypothesis.

The euplotid nuclear code is the genetic code used by Euplotidae. The euplotid code is a so-called "symmetrical code", which results from the symmetrical distribution of the codons. This symmetry allows for arithmetic exploration of the codon distribution. In 2013, shCherbak and Makukov reported that "the patterns are shown to match the criteria of an intelligent signal."

The candidate division SR1 and gracilibacteria code is used in two groups of uncultivated bacteria found in marine and fresh-water environments and in the intestines and oral cavities of mammals, among others. It differs from the standard and the bacterial code in that UGA represents an additional glycine codon and does not code for termination.

The trematode mitochondrial code is a genetic code found in the mitochondria of Trematoda.

The Pachysolen tannophilus nuclear code is a genetic code found in the ascomycete fungus Pachysolen tannophilus.

The karyorelictid nuclear code is a genetic code used by the nuclear genome of the Karyorelictea ciliate Parduczia sp.

The Condylostoma nuclear code is a genetic code used by the nuclear genome of the heterotrich ciliate Condylostoma magnum.

The Mesodinium nuclear code is a genetic code used by the nuclear genome of the ciliates Mesodinium and Myrionecta.

The Blastocrithidia nuclear code is a genetic code used by the nuclear genome of the trypanosomatid genus Blastocrithidia.

Low complexity regions (LCRs) in protein sequences, also defined in some contexts as compositionally biased regions (CBRs), are regions whose composition and complexity differ from those normally associated with globular structure, as seen in most proteins. LCRs have different properties from normal regions regarding structure, function and evolution.
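One simple way to make "low complexity" concrete is the Shannon entropy of the residue composition in a sliding window. The sketch below is an illustration under that assumption, not the method of any particular LCR-detection tool; the function names, the window size, the threshold and the example sequence are all arbitrary choices made here.

```python
from collections import Counter
from math import log2


def shannon_entropy(segment):
    """Shannon entropy (in bits) of the residue composition of a segment."""
    counts = Counter(segment)
    total = len(segment)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def low_complexity_windows(sequence, window=12, threshold=1.0):
    """Start positions of windows whose entropy is at or below the threshold.

    The window size and threshold are illustrative values, not calibrated ones.
    """
    hits = []
    for start in range(len(sequence) - window + 1):
        if shannon_entropy(sequence[start:start + window]) <= threshold:
            hits.append(start)
    return hits


# Made-up sequence: a run of twelve glutamines between two varied stretches.
sequence = "ACDEFGHIKL" + "Q" * 12 + "MNPRSTVWYA"
print(low_complexity_windows(sequence))
```

A window made of a single repeated residue has entropy 0, while a window of twelve different residues reaches log2(12), about 3.58 bits, so the threshold controls how biased a window must be to be reported.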

The QTY Code is a design method to transform membrane proteins that are intrinsically insoluble in water into variants with water solubility, while retaining their structure and function.

Phoratoxins are a group of peptide toxins that belong to the family of thionins, a subdivision of small plant toxins. Phoratoxins are proteins present in the leaves and branches of the Phoradendron, commonly known as the American variant of the mistletoe, a plant commonly used as decoration during the festive season. The berries of the mistletoe do not contain phoratoxins, making them less toxic compared to other parts of the plant. The toxicity of the mistletoe is dependent on the host tree, since mistletoe is known to be a semi-parasite. The host tree provides fixed inorganic nitrogen compounds necessary for the mistletoe to synthesize phoratoxins.