Microprotein

A microprotein (miP) is a small protein encoded by a small open reading frame (smORF). [1] Microproteins are a class of proteins with a single protein domain that are related to multidomain proteins. [2] They regulate larger multidomain proteins at the post-translational level. [3] Microproteins are analogous to microRNAs (miRNAs) and heterodimerize with their targets, causing dominant-negative effects. [4] In animals and plants, microproteins have been found to greatly influence biological processes. [2] Because of their dominant effects on their targets, microproteins are being studied for potential applications in biotechnology. [2]

History

The first microprotein (miP) was discovered in the early 1990s during research on genes for basic helix–loop–helix (bHLH) transcription factors from a murine erythroleukaemia cell cDNA library. [3] The protein was found to be an inhibitor of DNA binding (ID protein), and it negatively regulated the transcription factor complex. [3] The ID protein was 16 kDa and consisted of a helix–loop–helix (HLH) domain. [2] The microprotein formed bHLH/HLH heterodimers, which disrupted the functional bHLH homodimers. [2]

The first microprotein discovered in plants was the LITTLE ZIPPER (ZPR) protein. [2] The LITTLE ZIPPER protein contains a leucine zipper domain but lacks the domains required for DNA binding and transcription activation. [2] Thus, the LITTLE ZIPPER protein is analogous to the ID protein. [2] In 2011, this class of proteins was named microproteins, even though not all of its members are small, because their negative regulatory actions are similar to those of miRNAs. [3]

Evolutionarily, the ID protein or proteins similar to ID are found in all animals. [3] In plants, microproteins are found only in higher plants. [3] However, the homeodomain transcription factors that belong to the three-amino-acid loop-extension (TALE) family are targets of microproteins, and these homeodomain proteins are conserved in animals, plants, and fungi. [3]

Structure

Microproteins are generally small proteins with a single protein domain. [2] [4] The active form of a microprotein is translated from a smORF. [1] These smORFs can be fewer than 100 codons long. [1] However, not all microproteins are small, and the name was given because their actions are analogous to those of miRNAs. [3]
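
As a rough illustration of the size criterion above, the following Python sketch scans the forward strand of a DNA sequence for open reading frames shorter than 100 codons. The function name `find_smorfs`, the `max_codons` parameter, and the counting convention (start codon included, stop codon excluded) are assumptions made for this example; they do not come from the cited sources, and real smORF annotation uses additional evidence such as conservation and ribosome profiling.

```python
# Minimal sketch, not a published method: names and thresholds are illustrative.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_smorfs(seq, max_codons=100):
    """Return (start, end, codon_count) tuples for forward-strand ORFs
    with fewer than max_codons codons (start included, stop excluded)."""
    seq = seq.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != START:
                i += 3
                continue
            # Walk codon by codon from the start until a stop codon appears.
            j = i + 3
            while j + 3 <= len(seq):
                if seq[j:j + 3] in STOPS:
                    codons = (j - i) // 3
                    if codons < max_codons:
                        hits.append((i, j + 3, codons))
                    break
                j += 3
            # Resume scanning after this ORF (or at the end if no stop was found).
            i = j + 3
    return hits


if __name__ == "__main__":
    print(find_smorfs("CCATGAAATAGGG"))
```

The sketch reports only the first start codon of each reading frame segment and ignores the reverse strand, which keeps it short at the cost of completeness.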

Function

Microproteins function as post-translational regulators. [3] They disrupt the formation of heterodimeric, homodimeric, or multimeric complexes. [4] Furthermore, microproteins can interact with any protein that requires functional dimers to function normally. [3] The primary targets of microproteins are transcription factors that bind to DNA as dimers. [5] [3] Microproteins regulate these complexes by forming homotypic dimers with their targets, which inhibits protein complex function. [3] There are two types of miP inhibition: homotypic and heterotypic. [4] In homotypic miP inhibition, microproteins interact with proteins that have a similar protein–protein interaction (PPI) domain. [4] In heterotypic miP inhibition, microproteins interact with proteins that have a different but compatible PPI domain. [4] In both types of inhibition, microproteins prevent the PPI domains from interacting with their normal partner proteins. [4]
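
The distinction between the two inhibition types can be expressed as a small decision rule. The Python sketch below is only a schematic: the function `classify_inhibition`, the use of exact name equality to stand in for "similar" domains, and the placeholder domain names in the compatibility set are all assumptions for illustration, not data from the cited reviews.

```python
# Schematic sketch; domain labels and the compatibility set are hypothetical.
def classify_inhibition(mip_domain, target_domain, compatible_pairs=frozenset()):
    """Label a miP/target pairing by comparing their PPI domain types.

    Returns "homotypic" when both carry the same domain type, "heterotypic"
    when the domains differ but are listed as compatible, otherwise None.
    """
    if mip_domain == target_domain:
        return "homotypic"
    if frozenset((mip_domain, target_domain)) in compatible_pairs:
        return "heterotypic"
    return None  # no interaction expected under this simplified rule


if __name__ == "__main__":
    # Placeholder pair standing in for "different but compatible" domains.
    pairs = frozenset({frozenset(("domain_A", "domain_B"))})
    print(classify_inhibition("HLH", "HLH"))
    print(classify_inhibition("domain_A", "domain_B", pairs))
```

Storing each compatible pair as an unordered `frozenset` means the lookup gives the same answer regardless of which partner is the microprotein.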

Related Research Articles

Transcription factor: Protein that regulates the rate of DNA transcription

In molecular biology, a transcription factor (TF) is a protein that controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence. The function of TFs is to regulate—turn on and off—genes in order to make sure that they are expressed in the desired cells at the right time and in the right amount throughout the life of the cell and the organism. Groups of TFs function in a coordinated fashion to direct cell division, cell growth, and cell death throughout life; cell migration and organization during embryonic development; and intermittently in response to signals from outside the cell, such as a hormone. There are 1500-1600 TFs in the human genome. Transcription factors are members of the proteome as well as regulome.

Homeobox: DNA pattern affecting anatomy development

A homeobox is a DNA sequence, around 180 base pairs long, that regulates large-scale anatomical features in the early stages of embryonic development. Mutations in a homeobox may change large-scale anatomical features of the full-grown organism.

Basic helix–loop–helix: Protein structural motif

A basic helix–loop–helix (bHLH) is a protein structural motif that characterizes one of the largest families of dimerizing transcription factors. The word "basic" does not refer to complexity but to the chemistry of the motif because transcription factors in general contain basic amino acid residues in order to facilitate DNA binding.

Inhibitor of DNA-binding/differentiation proteins, also known as ID proteins, comprise a family of proteins that heterodimerize with basic helix-loop-helix (bHLH) transcription factors to inhibit DNA binding of bHLH proteins. ID proteins also contain the HLH-dimerization domain but lack the basic DNA-binding domain and thus regulate bHLH transcription factors when they heterodimerize with bHLH proteins. The first helix-loop-helix proteins identified were named E-proteins because they bind to Ephrussi-box (E-box) sequences. In normal development, E proteins form dimers with other bHLH transcription factors, allowing transcription to occur. However, in cancerous phenotypes, ID proteins can regulate transcription by binding E proteins, so no dimers can be formed and transcription is inactive. E proteins are members of the class I bHLH family and form dimers with bHLH proteins from class II to regulate transcription. Four ID proteins exist in humans: ID1, ID2, ID3, and ID4. The ID homologue gene in Drosophila is called extramacrochaetae (EMC) and encodes a transcription factor of the helix-loop-helix family that lacks a DNA binding domain. EMC regulates cell proliferation, formation of organs like the midgut, and wing development. ID proteins could be potential targets for systemic cancer therapies without inhibiting the functioning of most normal cells because they are highly expressed in embryonic stem cells, but not in differentiated adult cells. Evidence suggests that ID proteins are overexpressed in many types of cancer. For example, ID1 is overexpressed in pancreatic, breast, and prostate cancers. ID2 is upregulated in neuroblastoma, Ewing’s sarcoma, and squamous cell carcinoma of the head and neck.

Leucine zipper: DNA-binding structural motif

A leucine zipper is a common three-dimensional structural motif in proteins. They were first described by Landschulz and collaborators in 1988 when they found that an enhancer binding protein had a very characteristic 30-amino acid segment and the display of these amino acid sequences on an idealized alpha helix revealed a periodic repetition of leucine residues at every seventh position over a distance covering eight helical turns. The polypeptide segments containing these periodic arrays of leucine residues were proposed to exist in an alpha-helical conformation and the leucine side chains from one alpha helix interdigitate with those from the alpha helix of a second polypeptide, facilitating dimerization.

A DNA-binding domain (DBD) is an independently folded protein domain that contains at least one structural motif that recognizes double- or single-stranded DNA. A DBD can recognize a specific DNA sequence or have a general affinity to DNA. Some DNA-binding domains may also include nucleic acids in their folded structure.

Myogenin: Mammalian protein found in Homo sapiens

Myogenin is a transcriptional activator encoded by the MYOG gene. Myogenin is a muscle-specific basic-helix-loop-helix (bHLH) transcription factor involved in the coordination of skeletal muscle development or myogenesis and repair. Myogenin is a member of the MyoD family of transcription factors, which also includes MyoD, Myf5, and MRF4.

The scleraxis protein is a member of the basic helix-loop-helix (bHLH) superfamily of transcription factors. Currently two genes have been identified to code for identical scleraxis proteins.

An E-box is a DNA response element found in some eukaryotes that acts as a protein-binding site and has been found to regulate gene expression in neurons, muscles, and other tissues. Its specific DNA sequence, CANNTG, with a palindromic canonical sequence of CACGTG, is recognized and bound by transcription factors to initiate gene transcription. Once the transcription factors bind to the promoters through the E-box, other enzymes can bind to the promoter and facilitate transcription from DNA to mRNA.

Mef2: Protein family

In the field of molecular biology, myocyte enhancer factor-2 (Mef2) proteins are a family of transcription factors which through control of gene expression are important regulators of cellular differentiation and consequently play a critical role in embryonic development. In adult organisms, Mef2 proteins mediate the stress response in some tissues. Mef2 proteins contain both MADS-box and Mef2 DNA-binding domains.

The gene extramacrochaetae (emc) is a Drosophila melanogaster gene that codes for the Emc protein, which has a wide variety of developmental roles. It was named, as is common for Drosophila genes, after the phenotypic change caused by a mutation in the gene (macrochaetae are the longer bristles on Drosophila).

ID2: Protein-coding gene in the species Homo sapiens

DNA-binding protein inhibitor ID-2 is a protein that in humans is encoded by the ID2 gene.

ID1: Protein-coding gene in the species Homo sapiens

DNA-binding protein inhibitor ID-1 is a protein that in humans is encoded by the ID1 gene.

MAX (gene): Protein-coding gene in the species Homo sapiens

MAX is a gene that in humans encodes the MAX transcription factor.

ID3 (gene): Protein-coding gene in the species Homo sapiens

DNA-binding protein inhibitor ID-3 is a protein that in humans is encoded by the ID3 gene.

ID4: Protein-coding gene in humans

ID4 is a protein coding gene. In humans, it encodes for the protein known as DNA-binding protein inhibitor ID-4. This protein is known to be involved in the regulation of many cellular processes during both prenatal development and tumorigenesis. This is inclusive of embryonic cellular growth, senescence, cellular differentiation, apoptosis, and as an oncogene in angiogenesis.

TCF21 (gene): Protein-coding gene in the species Homo sapiens

Transcription factor 21 (TCF21), also known as pod-1, capsulin, or epicardin, is a protein that in humans is encoded by the TCF21 gene on chromosome 6. It is ubiquitously expressed in many tissues and cell types and highly significantly expressed in lung and placenta. TCF21 is crucial for the development of a number of cell types during embryogenesis of the heart, lung, kidney, and spleen. TCF21 is also deregulated in several types of cancers, and thus known to function as a tumor suppressor. The TCF21 gene also contains one of 27 SNPs associated with increased risk of coronary artery disease.

Neurogenins are a family of bHLH transcription factors involved in specifying neuronal differentiation. It is one of many gene families related to the atonal gene in Drosophila. Other positive regulators of neuronal differentiation also expressed during early neural development include NeuroD and ASCL1.

Pho4: Protein-coding gene in the species Saccharomyces cerevisiae S288c

Pho4 is a basic helix-loop-helix (bHLH) transcription factor. It is found in S. cerevisiae and other yeasts. It regulates phosphate-responsive genes in yeast cells. The Pho4 protein homodimer does this by binding to DNA sequences containing the bHLH binding site 5'-CACGTG-3'. This sequence is found in the promoters of genes up-regulated in response to phosphate availability such as the PHO5 gene.

Myogenic determination factor 5

In molecular biology, the myogenic determination factor 5 proteins are a family of proteins found in eukaryotes. This family includes the Myf5 protein, which is responsible for directing cells to the skeletal myocyte lineage during development. Myf5 is likely to act in a similar way to the other MRF4 proteins such as MyoD which perform the same function. These are histone acetyltransferases and histone deacetylases which activate and repress genes involved in the myocyte lineage.

References

  1. "The Dark Matter of the Human Proteome". The Scientist Magazine®. Retrieved 2019-04-25.
  2. Bhati, Kaushal Kumar; Blaakmeer, Anko; Paredes, Esther Botterweg; Dolde, Ulla; Eguen, Tenai; Hong, Shin-Young; Rodrigues, Vandasue; Straub, Daniel; Sun, Bin (2018-04-18). "Approaches to identify and characterize microProteins and their potential uses in biotechnology". Cellular and Molecular Life Sciences. 75 (14): 2529–2536. doi:10.1007/s00018-018-2818-8. ISSN 1420-682X. PMC 6003976. PMID 29670998.
  3. Staudt, Annica-Carolin; Wenkel, Stephan (2010-12-10). "Regulation of protein function by 'microProteins'". EMBO Reports. 12 (1): 35–42. doi:10.1038/embor.2010.196. ISSN 1469-221X. PMC 3024132. PMID 21151039.
  4. Eguen, T; Straub, D; Graeff, M; Wenkel, S (August 2015). "MicroProteins: small size-big impact". Trends in Plant Science. 20 (8): 477–482. doi:10.1016/j.tplants.2015.05.011. PMID 26115780.
  5. de Klein, Niek; Magnani, Enrico; Banf, Michael; Rhee, Seung Yon (2015). "microProtein Prediction Program (miP3): A Software for Predicting microProteins and Their Target Transcription Factors". International Journal of Genomics. 2015: 734147. doi:10.1155/2015/734147. ISSN 2314-436X. PMC 4427850. PMID 26060811.