C1orf131

Identifiers
Aliases: C1orf131, chromosome 1 open reading frame 131
External IDs: MGI: 1913773; HomoloGene: 11982; GeneCards: C1orf131
Orthologs
Species: Human, Mouse
Entrez
Ensembl
UniProt
RefSeq (mRNA): NM_152379, NM_001300830 (human); NM_025615 (mouse)
RefSeq (protein): NP_001287759, NP_689592 (human); NP_079891 (mouse)
Location (UCSC): Chr 1: 231.22 – 231.24 Mb (human); Chr 8: 125.56 – 125.59 Mb (mouse)
PubMed search [3] [4]

Uncharacterized protein C1orf131 is a protein that in humans is encoded by the gene C1orf131. The protein was first identified in humans. [5] [6] Subsequently, bioinformatic searches identified homologs of C1orf131 in numerous species; as a result, most proteins in this family are named Uncharacterized protein C1orf131 homolog.

Gene

In humans, C1orf131 is located on the minus strand of chromosome 1 at cytogenetic band 1q42.2, along with 193 other genes. [7] Notably, the gene upstream of C1orf131 is GNPAT, and the gene downstream of C1orf131 is TRIM67. When this gene is transcribed in humans, C1orf131 most often forms an mRNA 1458 base pairs long that is composed of seven exons. There are at least nine other alternative splice forms in humans that produce proteins. The splice forms range in size from 129 base pairs (2 exons) to 1458 base pairs (7 exons). [8]

Protein

In the C1orf131 protein family, the proteins are between 93 and 450 amino acids long; however, the majority are between 160 and 295 amino acids long. They have a molecular weight between 10.6 and 49.0 kDa, with the majority between 18.6 and 32.7 kDa. They have an isoelectric point between 9.6 and 11.2. [9] Over 30 orthologs from mammals, birds and lizards have been identified as having a poly(A) RNA binding site. [10] All orthologs in this protein family have a domain of unknown function, DUF4602. [10] [11] The human protein has been shown to be both phosphorylated and acetylated. [12] [13] [14] [15] [16] [17] These proteins are rich in lysine, in charged amino acids (D, E, H, K, R), and in basic amino acids (H, K, R). [18] The secondary structure of these proteins consists primarily of alpha helices and coils, with a small percentage of beta strands. [19] C1orf131 has been shown to interact with ubiquitin, detected by affinity capture followed by mass spectrometry, [20] and with APP (amyloid beta (A4) precursor protein), detected in a reconstituted complex. [21]

Graphical overview of the human protein C1orf131, with DUF4602 shown in green, phosphorylation sites as red points, and acetylation as a gray point.
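
The composition and size figures quoted for this protein family can be illustrated with a short calculation. The following is a minimal sketch, assuming a rule-of-thumb average mass of about 110 Da per residue (an approximation, not the method used by the cited tools) and a made-up example sequence that is not the real C1orf131 sequence. The function name `composition_summary` is hypothetical.

```python
from collections import Counter

CHARGED = set("DEHKR")  # charged residues, grouped as in the text
BASIC = set("HKR")      # basic residues, grouped as in the text
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rough average mass per residue


def composition_summary(sequence: str) -> dict:
    """Return length, approximate mass, and residue-class fractions."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    n = len(seq)
    return {
        "length": n,
        "approx_mass_kda": n * AVG_RESIDUE_MASS_DA / 1000.0,
        "lysine_fraction": counts["K"] / n,
        "charged_fraction": sum(counts[a] for a in CHARGED) / n,
        "basic_fraction": sum(counts[a] for a in BASIC) / n,
    }


# Hypothetical lysine-rich fragment for illustration only.
example = "MKKRSEKLKAKDHKRKEEAK"
summary = composition_summary(example)
for key, value in summary.items():
    print(key, value)
```

Under the same approximation, a 93-residue protein comes out near 10.2 kDa and a 450-residue protein near 49.5 kDa, which is roughly in line with the 10.6 to 49.0 kDa range given above; exact values depend on the actual residue composition.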

DUF4602

DUF4602 (PF15375) is generally at least 120 amino acids long. [22] Typically only one gene per species contains this domain; however, the domain has been identified in two different proteins in several species. In Trichuris suis, DUF4602 is found in both hypothetical protein M5114_09117 and tRNA pseudouridine synthase D, and in Echinococcus granulosus, DUF4602 has been found in hypothetical protein EGR 05135 and expressed conserved protein. DUF4602 has been found primarily in eukaryotes; however, it has also been identified in the virus DRHN1, Bacillus sp. UNC41MFS5, Enterococcus faecalis, and Enterococcus faecalis 13-SD-W-01. In the larger C1orf131 orthologs (250+ residues), the domain is typically located in the middle of the protein toward the C-terminal side, whereas in smaller orthologs (160–250 residues) it is located near the N-terminus. The larger orthologs also contain regions of low complexity, which could indicate that these proteins are intrinsically disordered.

Evolutionary history

This gene family exists only in eukaryotes. There are no paralogs of this gene; however, there are a few pseudogenes of C1orf131. Thus far they have been found only in orangutans, mouse lemurs, and sloths. [11] When this gene family is compared with cytochrome C, a slowly evolving gene, [23] and fibrinogen gamma chain, a fast-evolving gene, [24] it is shown to evolve at a faster rate than fibrinogen.

Graph of the divergence of this gene compared with fibrinogen and cytochrome C.
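
The article does not state how divergence was measured for this comparison. One common way to compare evolutionary rates between proteins is to take the fraction of differing sites in a pairwise alignment, apply a Poisson correction for multiple substitutions, and divide by the divergence time. The sketch below illustrates that approach under those assumptions; the function names, the two short sequences, and the 90-million-year divergence time are all hypothetical and are not taken from the source.

```python
import math


def poisson_distance(aligned_a: str, aligned_b: str) -> float:
    """Poisson-corrected distance between two aligned protein sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    # Observed fraction of sites that differ.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        raise ValueError("sequences too divergent for the Poisson correction")
    return -math.log(1.0 - p)


def rate_per_site_per_myr(distance: float, divergence_time_myr: float) -> float:
    """Substitutions per site per million years along each lineage."""
    # The factor of 2 accounts for change accumulating on both lineages.
    return distance / (2.0 * divergence_time_myr)


# Hypothetical aligned fragments that differ at 2 of 10 sites.
seq_a = "MKKRSEKLKA"
seq_b = "MKRRSDKLKA"
d = poisson_distance(seq_a, seq_b)
rate = rate_per_site_per_myr(d, divergence_time_myr=90.0)
print(d, rate)
```

Repeating the calculation for the same species pairs with cytochrome C and fibrinogen sequences would give the reference rates against which a gene family can be placed.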

References

  1. GRCh38: Ensembl release 89: ENSG00000143633 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000031984 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. Gerhard DS, Wagner L, et al. (October 2004). "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)". Genome Research. 14 (10b): 2121–2127. doi:10.1101/gr.2596504. PMC 528928. PMID 15489334.
  6. Ota T, Suzuki Y, et al. (January 2004). "Complete sequencing and characterization of 21,243 full-length human cDNAs". Nature Genetics. 36 (1): 40–45. doi:10.1038/ng1285. PMID 14702039.
  7. "Browse Homo sapiens ORF cDNA clones by chromosome 1, map 1q42, page 1".
  8. "AceView: Gene:C1orf131, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". AceView.
  9. Kozlowski LP (2016). "IPC - Isoelectric Point Calculator". Biology Direct. 11 (1): 55. doi:10.1186/s13062-016-0159-9. PMC 5075173. PMID 27769290.
  10. "Uniprot Gene: C1orf131". Retrieved May 7, 2015.
  11. "BLAT". Retrieved May 7, 2015.
  12. Olsen JV, Vermeulen M, Santamaria A, Kumar C, Miller ML, Jensen LJ, Gnad F, Cox J, Jensen TS, Nigg EA, Brunak S, Mann M (January 2010). "Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis". Science Signaling. 3 (104): ra3. doi:10.1126/scisignal.2000475. PMID 20068231. S2CID 24775963.
  13. Wang B, Malik R, Nigg EA, Körner R (December 2008). "Evaluation of the low-specificity protease elastase for large-scale phosphoproteome analysis". Analytical Chemistry. 80 (24): 9526–9533. Retrieved April 26, 2015.
  14. Matsuoka S, Ballif BA, Smogorzewska A, McDonald ER 3rd, Hurov KE, Luo J, Bakalarski CE, Zhao Z, Solimini N, Lerenthal Y, Shiloh Y, Gygi SP, Elledge SJ (May 2007). "ATM and ATR substrate analysis reveals extensive protein networks responsive to DNA damage". Science. 316 (5828): 1160–1166. Bibcode:2007Sci...316.1160M. doi:10.1126/science.1140321. PMID 17525332. S2CID 16648052.
  15. Kim D, Hahn Y (July 9, 2011). "Identification of novel phosphorylation modification sites in human proteins that originated after the human–chimpanzee divergence". Bioinformatics. 27 (18): 2494–2501. doi:10.1093/bioinformatics/btr426. PMID 21775310.
  16. Dephoure N, Zhou C, Villén J, Beausoleil SA, Bakalarski CE, Elledge SJ, Gygi SP (August 2008). "A quantitative atlas of mitotic phosphorylation". Proceedings of the National Academy of Sciences of the United States of America. 105 (31): 10762–10767. Bibcode:2008PNAS..10510762D. doi:10.1073/pnas.0805139105. PMC 2504835. PMID 18669648.
  17. Choudhary C, Kumar C, Gnad F, Nielsen ML, Rehman M, Walther TC, Olsen JV, Mann M (August 2009). "Lysine acetylation targets protein complexes and co-regulates major cellular functions". Science. 325 (5942): 834–840. Bibcode:2009Sci...325..834C. doi:10.1126/science.1175371. PMID 19608861. S2CID 206520776.
  18. Brendel V, Bucher P, Nourbakhsh IR, Blaisdell BE, Karlin S (March 1992). "Methods and algorithms for statistical analysis of protein sequences". Proceedings of the National Academy of Sciences of the United States of America. 89 (6): 2002–2006. Bibcode:1992PNAS...89.2002B. doi:10.1073/pnas.89.6.2002. PMC 48584. PMID 1549558.
  19. Garnier J, Gibrat JF, Robson B (1996). "GOR secondary structure prediction method version IV". Methods in Enzymology. Vol. 266. pp. 540–553. doi:10.1016/S0076-6879(96)66034-0. ISBN 9780121821678. PMID 8743705.
  20. Stes E, Laga M, Walton A, Samyn N, Timmerman E, De Smet I, Goormachtig S, Gevaert K (June 2014). "A COFRADIC Protocol To Study Protein Ubiquitination". Journal of Proteome Research. 13 (6): 3107–3113. doi:10.1021/pr4012443. PMID 24816145.
  21. Olah J, Vincze O, Virok D, Simon D, Bozso Z, Tokesi N, Horvath I, Hlavanda E, Kovacs J, Magyar A, Szucs M, Orosz F, Penke B, Ovadi J (September 2011). "Interactions of pathological hallmark proteins: tubulin polymerization promoting protein/p25, beta-amyloid, and alpha-synuclein". Journal of Biological Chemistry. 286 (39): 34088–34100. doi:10.1074/jbc.m111.243907. PMC 3190826. PMID 21832049.
  22. "Family: DUF4602". Retrieved May 8, 2015.
  23. Dickerson R (1971). "The structure of cytochrome c and the rates of molecular evolution". Journal of Molecular Evolution. 1 (1): 26–45. Bibcode:1971JMolE...1...26D. doi:10.1007/bf01659392. PMID 4377446. S2CID 24992347.
  24. Prychitko TM, Moore WS (2000). "Comparative evolution of the mitochondrial cytochrome b gene and nuclear beta-fibrinogen intron 7 in woodpeckers". Molecular Biology and Evolution. 17 (7): 1101–1111. doi:10.1093/oxfordjournals.molbev.a026391. PMID 10889223.