KIAA0090


KIAA0090 is a human gene that codes for a protein of unknown function. [1] KIAA0090 has two aliases: OTTHUMP00000002581 and RP1-43E13.1. The gene codes for multiple transcript variants, which can localize to different subcellular compartments. KIAA0090 interacts with multiple effector proteins. It contains a conserved COG1520 WD40-like repeat domain that is thought to mediate these interactions.


Characterization of the KIAA0090 gene and its transcript products

Figure 1: A graphical summary of the KIAA0090 gene neighborhood, regions of promotion, exons/introns, and variant transcript products

KIAA0090 is located on the p arm of chromosome 1 at location 1p36.132. [2] It covers 36.74 kb, from base pair 19451486 to base pair 19414744. The gene is composed of 37 GT-AG introns/alternative introns and 57 exons, expressed as 1 unspliced form of 4253 bp and 20 alternatively spliced forms of varying lengths. [3] The gene has 8 probable promoters. [4] The gene is flanked by UBR4 on its right and MRTO4 on its left. [1] This information is displayed graphically in Figure 1.
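The quoted gene length can be checked directly from the two coordinates. The following is a minimal sketch, assuming the length is the simple difference between the coordinates (not an inclusive base count); the constant and function names are illustrative and do not come from any cited database.

```python
# Coordinates as quoted in the text; the larger coordinate is listed first.
GENE_START = 19_451_486
GENE_END = 19_414_744


def span_bp(start: int, end: int) -> int:
    """Distance in base pairs between two coordinates, in either order."""
    return abs(start - end)


def span_kb(start: int, end: int) -> float:
    """The same distance expressed in kilobases (1 kb = 1000 bp)."""
    return span_bp(start, end) / 1000


print(span_bp(GENE_START, GENE_END))            # 36742
print(round(span_kb(GENE_START, GENE_END), 2))  # 36.74
```

The result agrees with the 36.74 kb stated above.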

Expressed sequence tags and isolated cDNA clones indicate that KIAA0090 is expressed ubiquitously at low to moderate levels throughout the body. [5] This includes but is not limited to testis, tongue, lung, cerebellum, brain, mammary gland, trachea, placenta, esophagus, salivary gland, hippocampus, amygdala, bone marrow, thalamus, spleen, uterus, thymus, kidney, eye, heart, gall bladder, prostate, liver, parathyroid gland, ovary, stomach, skeletal muscle, colon, pancreas, and skin. Expression of KIAA0090 changes throughout development (embryogenesis, fetal, adult, etc.) and during carcinogenesis. Evidence indicates a correlation between these conditions and expression level, but no data exist to suggest that KIAA0090 is responsible for any disease or stage of development.

The mRNA for this gene codes for 18 protein isoforms. [6] There is no evidence that the remaining 3 transcript variants can be translated.

Characterization of the KIAA0090 Protein Product

Analysis indicates that the protein product of the unspliced KIAA0090 transcript is 993 amino acids long, with an isoelectric point of 7.418 and a molecular weight of 111,765.73 daltons. [6] [7] The primary structure of this protein contains a signal peptide from position 1 to 22 and 4 conserved domains: a COG1520 WD40-like domain, a leucine zipper domain, a DUF1620 domain (domain of unknown function), and a transmembrane domain. [8] These can be viewed in Figure 2. Several conserved cysteine residues are present at positions 226, 235, 335, 364, 449, 581, 675, 925, and 985. [9] Several internal localization signals are also present. [10] [11] [12] [13] [14] Depending on splice outcome and post-translational modification, these additional signals indicate that the protein could localize to the peroxisome, the plasma membrane, the extracellular space, the cytosol, the nucleus, or the mitochondria.
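As a rough plausibility check on these figures, the reported mass can be converted to kilodaltons and divided by the chain length. This is only a sketch using the two numbers quoted above; the comparison value of roughly 110 Da per residue is a common rule of thumb and is an assumption here, not something taken from the cited tools.

```python
# Values quoted in the text for the unspliced protein product.
LENGTH_AA = 993
MASS_DA = 111_765.73

# Convert daltons to kilodaltons.
mass_kda = MASS_DA / 1000

# Average mass per residue; proteins typically fall near ~110 Da per residue
# (rule of thumb, assumed here for comparison only).
avg_residue_mass = MASS_DA / LENGTH_AA

print(f"{mass_kda:.1f} kDa")
print(f"{avg_residue_mass:.1f} Da per residue")
```

An average close to the rule-of-thumb value suggests the reported mass and length are mutually consistent.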

Figure 2: An annotated protein map of KIAA0090 with domain residence and function

Post-translational modification of KIAA0090 can occur. There are 54 possible phosphorylation sites: 33 serines, 10 threonines, and 11 tyrosines. [15] Three sites of N-linked glycosylation are present at residues 370, 818, and 913. [16] The signal peptide can be cleaved between residues 21 and 22. [13]
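Candidate N-linked glycosylation sites are conventionally recognized by the sequon N-X-S/T, where X is any residue except proline. The sketch below scans a sequence for that motif and also tallies the phosphorylation counts quoted above. It is not the NetNGlyc method, which scores sequons with a trained predictor; the toy sequence and all names are hypothetical and are not the KIAA0090 sequence.

```python
import re

# N followed by any residue except P, then S or T. The lookahead keeps the
# match to the N itself so overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq: str) -> list[int]:
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Hypothetical toy peptide for illustration only.
TOY_SEQUENCE = "MNGSAANPTLLNKTQ"
print(find_sequons(TOY_SEQUENCE))  # [2, 12]; N7 is skipped because X is proline

# Phosphorylation site counts as quoted in the text.
PHOSPHO_SITES = {"serine": 33, "threonine": 10, "tyrosine": 11}
print(sum(PHOSPHO_SITES.values()))  # 54
```

A motif scan of this kind only lists candidate positions; whether a given sequon is actually glycosylated depends on context that a regular expression cannot capture.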

Figure 3: A conceptual translation of KIAA0090 with domains, sites of phosphorylation, sites of N-linked glycosylation, and signal peptide.

The domains and modification sites are displayed graphically in Figure 3. Structure beyond the primary level has so far only been predicted computationally. Consensus predictions from bioinformatic analysis are also displayed in Figure 3. [17] [18] The protein is highly conserved throughout eukaryotes, in both multicellular and unicellular organisms. This includes but is not limited to animals, plants, fungi, and protists.

Figure 4: KIAA0090/Protein Interactions

The WD40-like domain COG1520 is the only functional effector domain identified in KIAA0090. WD40-containing proteins act as signal transducers, relaying signals to binding factors, centromeres, and other effectors. [19] Coimmunoprecipitation experiments have shown that KIAA0090 interacts with proteins of these types, specifically the centromeric protein CENPH, the BAX inhibitor TMBI4, the ADP-ribosylation factor ARF6, the kinase TNIK, and the transcriptional repressor T22D1. [20] Given the number of splice variants, this list is probably not exhaustive. Additional interactions are expected to emerge as characterization continues.

References

  1. "Entrez Gene, KIAA0090". NCBI. March 2010. Retrieved 2010-03-21.
  2. "Entrez Nucleotide, KIAA0090". NCBI. April 2010. Retrieved 2010-04-23.
  3. "Aceview". NCBI. April 2010. Retrieved 2010-04-23.
  4. Genomatix. "El Dorado, KIAA0090". Genomatix. Retrieved 2010-05-10.
  5. "Unigene, KIAA0090". NCBI. March 2010. Retrieved 2010-04-05.
  6. AASTATS; Jack Kramer, 1990. http://seqtool.sdsc.edu (archived 2003-08-11 at the Wayback Machine). Accessed April 21, 2010.
  7. PI; program by Dr. Luca Toldo, developed at http://www.embl-heidelberg.de. Changed by Bjoern Kindler to also print the lowest found net charge. http://seqtool.sdsc.edu (archived 2003-08-11 at the Wayback Machine). Accessed April 22, 2010.
  8. "Gene cards". Retrieved 2010-02-14.
  9. Brendel V, Bucher P, Nourbakhsh IR, Blaisdell BE, Karlin S (March 1992). "Methods and algorithms for statistical analysis of protein sequences". Proc. Natl. Acad. Sci. U.S.A. 89 (6): 2002–6. Bibcode:1992PNAS...89.2002B. doi:10.1073/pnas.89.6.2002. PMC 48584. PMID 1549558.
  10. Horton P, Park KJ, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, Nakai K (July 2007). "WoLF PSORT: protein localization predictor". Nucleic Acids Res. 35 (Web Server issue): W585–7. doi:10.1093/nar/gkm259. PMC 1933216. PMID 17517783.
  11. la Cour T, Kiemer L, Mølgaard A, Gupta R, Skriver K, Brunak S (June 2004). "Analysis and prediction of leucine-rich nuclear export signals". Protein Eng. Des. Sel. 17 (6): 527–36. doi:10.1093/protein/gzh062. PMID 15314210.
  12. Bendtsen JD, Jensen LJ, Blom N, Von Heijne G, Brunak S (April 2004). "Feature-based prediction of non-classical and leaderless protein secretion". Protein Eng. Des. Sel. 17 (4): 349–56. doi:10.1093/protein/gzh037. PMID 15115854.
  13. Bendtsen JD, Nielsen H, von Heijne G, Brunak S (July 2004). "Improved prediction of signal peptides: SignalP 3.0". J. Mol. Biol. 340 (4): 783–95. CiteSeerX 10.1.1.165.2784. doi:10.1016/j.jmb.2004.05.028. PMID 15223320.
  14. Gupta R, Brunak S (2002). "Prediction of glycosylation across the human proteome and the correlation to protein function". Pac Symp Biocomput: 310–22. doi:10.1142/9789812799623_0029. ISBN 978-981-02-4777-5. PMID 11928486.
  15. Blom N, Gammeltoft S, Brunak S (December 1999). "Sequence and structure-based prediction of eukaryotic protein phosphorylation sites". J. Mol. Biol. 294 (5): 1351–62. doi:10.1006/jmbi.1999.3310. PMID 10600390.
  16. NetNGlyc; Prediction of N-glycosylation sites in human proteins. R. Gupta, E. Jung and S. Brunak. In preparation, 2004. http://www.cbs.dtu.dk/services/NetNGlyc/ Accessed April 20, 2010.
  17. McGuffin LJ, Bryson K, Jones DT (April 2000). "The PSIPRED protein structure prediction server". Bioinformatics. 16 (4): 404–5. doi:10.1093/bioinformatics/16.4.404. PMID 10869041.
  18. PROF; Aberystwyth University Computational Biology Group. Department of Computer Science, Aberystwyth SY23 3DB, Wales, UK. http://www.aber.ac.uk/~phiwww/prof/ Accessed April 23, 2010.
  19. Neer EJ, Schmidt CJ, Nambudripad R, Smith TF (September 1994). "The ancient regulatory-protein family of WD-repeat proteins". Nature. 371 (6495): 297–300. Bibcode:1994Natur.371..297N. doi:10.1038/371297a0. PMID 8090199. S2CID 600856.
  20. Prieto C, De Las Rivas J (July 2006). "APID: Agile Protein Interaction DataAnalyzer". Nucleic Acids Res. 34 (Web Server issue): W298–302. doi:10.1093/nar/gkl128. PMC 1538863. PMID 16845013.