KNOP1

Lysine-rich nucleolar protein 1 (KNOP1) is a protein that in humans is encoded by the KNOP1 gene. Aliases for KNOP1 include TSG118, C16orf88, and FAM191A. [1]

Gene

KNOP1 is located on the negative (minus) DNA strand of chromosome 16 at 16p12.3. It spans 15.21 kb, from position 19729556 to 19714347, and has 6 exons, which are alternatively spliced to produce three main transcript isoforms. [2] Two of the KNOP1 isoforms, B and C, lack exon 1, which shifts the start codon to the one found in exon 2, so the proteins they encode lack sequence at the N-terminus. Isoform C also lacks exon 4, which encodes the C-terminal DUF5595 domain.
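The quoted span can be rechecked from the two coordinates. The sketch below assumes the coordinates are 1-based and inclusive and refer to the same genome assembly; the function and variable names are illustrative only.

```python
# Coordinates as quoted in the text. KNOP1 lies on the minus strand,
# so the first coordinate is the larger of the two.
GENE_START = 19729556
GENE_END = 19714347


def span_bp(start: int, end: int) -> int:
    """Length in base pairs of an inclusive interval, regardless of strand."""
    return abs(start - end) + 1


knop1_bp = span_bp(GENE_START, GENE_END)
knop1_kb = knop1_bp / 1000
print(f"{knop1_bp} bp = {knop1_kb:.2f} kb")
```

Under these assumptions the result rounds to the 15.21 kb given above.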

Human chromosome 16 ideogram (from GHR)

Gene Neighborhood

KNOP1 is flanked by VPS35L upstream and IQCK downstream. IQCK was identified as a potential candidate gene for obsessive-compulsive disorder in a genome-wide association study. [3] VPS35L encodes the protein VPS35L, which acts as a component of the retriever complex. [4]

Gene Expression

KNOP1 expression in human tissues

KNOP1 is highly expressed in the superior cervical ganglion, [6] testis, and placenta, and in early stages of heart and lung development. [7] Expression is low in the liver and pancreas.

Species distribution

Orthologs of KNOP1 are found in many animal species but not in the other kingdoms. No paralogs of KNOP1 have been found. Table 1 lists selected KNOP1 orthologs.

Table 1: KNOP1 Orthologs

| Genus, species | Divergence from Homo sapiens (MYA) [8] | NCBI accession number | Sequence length (AA) | Sequence similarity to Homo sapiens KNOP1 [9] |
|---|---|---|---|---|
| Homo sapiens | -- | NP_001335456.1 | 518 | 100% |
| Pan paniscus | 6 | XP_034795825.1 | 604 | 85.1% |
| Mus musculus | 89 | NP_075686.2 | 532 | 65.8% |
| Lagenorhynchus obliquidens | 94 | XP_026953897.1 | 454 | 72.6% |
| Galemys pyrenaicus | 94 | KAG8513613.1 | 534 | 67.1% |
| Phyllostomus discolor | 94 | KAF6125017.1 | 434 | 64.3% |
| Monodelphis domestica | 160 | XP_016279153.1 | 601 | 47.1% |
| Ornithorhynchus anatinus | 180 | XP_028910011.1 | 544 | 47.2% |
| Tyto alba | 318 | XP_042654773.1 | 627 | 42.1% |
| Gallus gallus | 318 | XP_004945520.2 | 548 | 32.4% |
| Bufo bufo | 352 | XP_040296565.1 | 513 | 37.7% |
| Danio rerio | 433 | XP_687135.1 | 475 | 40.4% |
| Branchiostoma floridae | 637 | XP_035694713.1 | 666 | 34.2% |
| Owenia fusiformis | 787 | CAC9610945.1 | 553 | 34.7% |
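One way to summarize the trend in Table 1 is to correlate divergence time with sequence similarity. The following is only an illustrative sketch: the values are copied from Table 1 (the Homo sapiens reference row is left out), the Pearson coefficient is written out by hand, and the names `ORTHOLOGS` and `pearson` are introduced here for the example.

```python
from math import sqrt

# (divergence from Homo sapiens in MYA, % similarity), copied from Table 1.
ORTHOLOGS = {
    "Pan paniscus": (6, 85.1),
    "Mus musculus": (89, 65.8),
    "Lagenorhynchus obliquidens": (94, 72.6),
    "Galemys pyrenaicus": (94, 67.1),
    "Phyllostomus discolor": (94, 64.3),
    "Monodelphis domestica": (160, 47.1),
    "Ornithorhynchus anatinus": (180, 47.2),
    "Tyto alba": (318, 42.1),
    "Gallus gallus": (318, 32.4),
    "Bufo bufo": (352, 37.7),
    "Danio rerio": (433, 40.4),
    "Branchiostoma floridae": (637, 34.2),
    "Owenia fusiformis": (787, 34.7),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


divergence = [d for d, _ in ORTHOLOGS.values()]
similarity = [s for _, s in ORTHOLOGS.values()]
r = pearson(divergence, similarity)
print(f"Pearson r = {r:.2f}")
```

A negative coefficient would be consistent with similarity falling as divergence time grows, although with 13 hand-picked orthologs it should not be read as more than a rough summary.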

Protein

The exact function of KNOP1 is not yet understood, but it is hypothesized to mimic nucleostemin, a nucleolar protein linked to the proliferation potential of stem cells. [10] The protein is 518 amino acids long, [11] while isoform B is 458 amino acids [12] and isoform C is 435 amino acids. [13] It has a molecular weight of 58 kDa and an isoelectric point of 9.92. [14] The protein is rich in lysine [15] and has a lysine-rich region spanning amino acids 123–355. [16] One region of the protein interacts with the protein ZNF106. [17] KNOP1 has also been reported to associate with the surface of condensed chromosomes. [18]
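The quoted figures can be sanity-checked with simple arithmetic. The sketch below is not the method behind the cited values, which come from sequence-based tools. It assumes a rule-of-thumb average residue mass of about 110 Da, and it shows how a lysine fraction could be computed on a made-up toy sequence, not the real KNOP1 sequence. All names are illustrative.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass in kDa estimated from its length alone."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def residue_fraction(sequence: str, residue: str = "K") -> float:
    """Fraction of positions in `sequence` occupied by `residue`."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sequence.count(residue.upper()) / len(sequence)


# Lengths quoted in the text; "full-length" is a label used only here.
ISOFORM_LENGTHS = {"full-length": 518, "B": 458, "C": 435}
for name, length in ISOFORM_LENGTHS.items():
    print(f"{name}: {length} aa, roughly {approx_mass_kda(length):.0f} kDa")

# Hypothetical toy peptide, used only to exercise residue_fraction().
TOY_SEQUENCE = "MKKAKEKKLSKK"
print(f"lysine fraction of toy peptide: {residue_fraction(TOY_SEQUENCE):.2f}")
```

For 518 residues the length-based estimate comes out close to the 58 kDa quoted above, which is about as much agreement as such a rough rule can offer.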

Domains

Domains of KNOP1: the green box is the DUF5595 domain, the blue box is the SMAP domain, and the green line is the region of interaction with ZNF106.

KNOP1 has two domains: DUF5595 (not found in isoform C) and SMAP, located at the end of the protein. [20] DUF5595 is found in nuclear division cycle 80 (Ndc80) proteins, which occur in species such as Homo sapiens. Ndc80 protein complexes are a core component of the end-on attachment sites for kinetochore microtubules. [21] SMAP (small acidic protein family) is found in eukaryotes and is approximately 70 amino acids in length. It contains a single completely conserved glycine residue, at position 441 in KNOP1 (G441), that may be functionally important. [22]

KNOP1 conceptual translation

Interacting protein

KNOP1 has been shown to interact with ZNF106, [23] an interaction confirmed by Grasberger and Bell. [24] Their study concluded that the rapid downregulation of KNOP1 expression during in vitro terminal differentiation coincides with a loss of nucleolar ZFP106.

References

  1. GeneCards (https://www.genecards.org/cgi-bin/carddisp.pl?gene=KNOP1)
  2. NCBI Gene (https://www.ncbi.nlm.nih.gov/gene/400506#genomic-context)
  3. U.S. National Library of Medicine. (n.d.). IQCK IQ motif containing K [Homo sapiens (human)] - gene - NCBI. National Center for Biotechnology Information. Retrieved December 18, 2021, from https://www.ncbi.nlm.nih.gov/gene/124152
  4. McNally, K. E., Faulkner, R., Steinberg, F., Gallon, M., Ghai, R., Pim, D., Langton, P., Pearson, N., Danson, C. M., Nägele, H., Morris, L. L., Singla, A., Overlee, B. L., Heesom, K. J., Sessions, R., Banks, L., Collins, B. M., Berger, I., Billadeau, D. D., Burstein, E., … Cullen, P. J. (2017). Retriever is a multiprotein complex for retromer-independent endosomal cargo recycling. Nature Cell Biology, 19(10), 1214–1225. https://doi.org/10.1038/ncb3610
  5. NCBI GEO (https://www.ncbi.nlm.nih.gov/geo/tools/profileGraph.cgi?ID=GDS596:213235_at)
  6. NCBI GEO (https://www.ncbi.nlm.nih.gov/geo/tools/profileGraph.cgi?ID=GDS596:213235_at)
  7. NCBI RNA-seq data (https://www.ncbi.nlm.nih.gov/gene/400506#genomic-context)
  8. TimeTree (http://www.timetree.org/)
  9. NCBI BLAST
  10. Grasberger, H., & Bell, G. I. (2005). Subcellular recruitment by TSG118 and TSPYL implicates a role for zinc finger protein 106 in a novel developmental pathway. The International Journal of Biochemistry & Cell Biology, 37(7), 1421–1437. https://doi.org/10.1016/j.biocel.2005.01.013
  11. NCBI Protein (https://www.ncbi.nlm.nih.gov/protein/1143077058)
  12. NCBI KNOP1 Isoform B (https://www.ncbi.nlm.nih.gov/protein/1142736531)
  13. NCBI KNOP1 Isoform C (https://www.ncbi.nlm.nih.gov/protein/NP_001335461.1)
  14. Expasy-Compute pI/Mw (https://web.expasy.org/compute_pi/)
  15. Statistical Analysis of Protein Sequences (https://www.ebi.ac.uk/Tools/seqstats/saps/)
  16. Motif Scan (https://myhits.sib.swiss/cgi-bin/motif_scan)
  17. UniProt (https://www.uniprot.org/uniprot/Q1ED39)
  18. Larsson, M., Brundell, E., Jörgensen, P. M., Ståhl, S., & Höög, C. (1999). Characterization of a novel nucleolar protein that transiently associates with the condensed chromosomes in mitotic cells. European Journal of Cell Biology, 78(6), 382–390.
  19. ProSite (https://prosite.expasy.org/cgi-bin/prosite/mydomains/)
  20. MOTIF Search (https://www.genome.jp/tools/motif/)
  21. DUF5595 (https://www.ncbi.nlm.nih.gov/Structure/cdd/cddsrv.cgi?uid=pfam18077)
  22. SMAP (https://www.ncbi.nlm.nih.gov/Structure/cdd/cddsrv.cgi?uid=pfam15477)
  23. STRING Protein-Protein Interaction Networks (https://string-db.org/cgi/network?taskId=bKzRFr03O9Lu&sessionId=b92QrR5MM6wa)
  24. Grasberger, H., & Bell, G. I. (2005). Subcellular recruitment by TSG118 and TSPYL implicates a role for zinc finger protein 106 in a novel developmental pathway. The International Journal of Biochemistry & Cell Biology, 37(7), 1421–1437. https://doi.org/10.1016/j.biocel.2005.01.013