C5orf49

Identifiers
Aliases: C5orf49, chromosome 5 open reading frame 49
External IDs: MGI: 1916565; HomoloGene: 28246; GeneCards: C5orf49
Orthologs (human; mouse)
RefSeq (mRNA): NM_001089584; NM_027035
RefSeq (protein): NP_001083053; NP_081311
Location (UCSC): Chr 5: 7.83 – 7.85 Mb; Chr 13: 68.75 – 68.76 Mb
PubMed search: [3]; [4]

Chromosome 5 open reading frame 49 (C5orf49) is a protein that in humans is encoded by the C5orf49 gene. Aliases for C5orf49 include Chromosome 5 Open Reading Frame 49, Uncharacterized Protein C5orf49, and LOC134121. [5] C5orf49 is predicted to localize to cilia and to have ciliary functions. [6]

Gene

C5orf49 neighboring genes

C5orf49 is located on chromosome 5 at cytoband p15, between base pairs 7,830,378 and 7,851,151, giving it a length of 20,774 base pairs. [7] The gene has two splice forms, which encode proteins of 147 and 145 amino acids. [8] C5orf49 is oriented on the minus strand. [5] Its neighboring genes include FASTKD3, MTRR, and ADCY2.
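The stated gene length follows from counting both endpoints of the coordinate range (1-based, inclusive positions). A minimal sketch of that arithmetic; the helper name `span_length` is hypothetical and introduced here only for illustration:

```python
def span_length(start: int, end: int) -> int:
    """Length in base pairs of a closed, 1-based interval [start, end]."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Both endpoints are part of the feature, hence the +1.
    return end - start + 1


# Coordinates of C5orf49 on chromosome 5 as quoted in the text.
gene_length = span_length(7_830_378, 7_851_151)
print(gene_length)  # 20774
```

Subtracting the coordinates without the `+ 1` would give 20,773, which undercounts by one base because the first position is itself part of the gene.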

Gene-level regulation

Promoter

Schematic view of C5orf49 with promoter annotation.

C5orf49 has one upstream promoter, GXP_1271072, which regulates both of the primary transcripts. [8] GXP_1271072 is 1,396 base pairs in length, spanning base pairs 7,851,094 to 7,852,489 on chromosome 5. The transcription start region for the longest transcript, which encodes the 147-amino-acid protein, spans base pairs 7,851,148 to 7,851,164 on chromosome 5.
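The promoter coordinates can be checked against the transcription start region and the gene span quoted in the Gene section. The sketch below is illustrative only: the `Interval` class and its methods are hypothetical helpers, and all coordinates are assumed to be 1-based and inclusive.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Closed genomic interval with 1-based, inclusive coordinates."""
    start: int
    end: int

    def length(self) -> int:
        return self.end - self.start + 1

    def contains(self, other: "Interval") -> bool:
        return self.start <= other.start and other.end <= self.end

    def overlap(self, other: "Interval") -> int:
        # Number of shared base pairs; 0 if the intervals are disjoint.
        return max(0, min(self.end, other.end) - max(self.start, other.start) + 1)


# Values quoted in the text (chromosome 5).
promoter = Interval(7_851_094, 7_852_489)    # GXP_1271072
tss_region = Interval(7_851_148, 7_851_164)  # transcription start region
gene = Interval(7_830_378, 7_851_151)        # C5orf49 gene span

print(promoter.length())              # 1396
print(promoter.contains(tss_region))  # True
print(promoter.overlap(gene))         # 58
```

Because the gene lies on the minus strand, its 5' end is the higher coordinate, so a promoter that begins near 7,851,094 and extends to higher coordinates sits upstream of the gene while overlapping its first few dozen bases.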

Protein

Structure

Conceptual translation of C5orf49 with DUF4541 domain

C5orf49 is characterized by the presence of the protein domain DUF4541. [5] Within this domain, there is a conserved KLHRDDR sequence motif and a single completely conserved tyrosine (Y) residue that may be functionally important. [9] The domain is shown on the annotated conceptual translation.

Predicted properties

The following properties of C5orf49 were predicted using bioinformatic analysis:

Tissue distribution

Normal human tissue expression profiling of C5orf49

Expression data indicate that C5orf49 is most highly expressed in lung, brain, and spinal cord tissue. [12]

Binding partners

CDKN2D, HSF2BP, KRT31, and KRT34 were identified as binding partners of C5orf49 using two-hybrid prey pooling and two-hybrid array approaches. [13]

Species distribution

Table of C5orf49 orthologs

C5orf49 is conserved throughout mammals, and orthologs can be found in flatworms and sea anemones. The ortholog table shows a selection of orthologs found using BLAST. [14] C5orf49 is not found in sponges, which diverged at a median date of 777 million years ago (MYA), [15] whereas its most distant ortholog belongs to a lineage that diverged 736 MYA. The gene therefore most likely arose between 777 MYA and 736 MYA.
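The dating argument brackets the gene's origin between two divergence times: the split from a lineage that lacks the gene (the older bound) and the split from the most distant lineage that has it (the younger bound). A minimal sketch of that reasoning, assuming the two median dates quoted in the text; the function name `origin_window` is hypothetical:

```python
def origin_window(absent_split_mya: float, present_split_mya: float) -> tuple[float, float, float]:
    """Return (older bound, younger bound, window width) in million years.

    absent_split_mya: divergence time of the closest lineage lacking the gene.
    present_split_mya: divergence time of the most distant lineage with an ortholog.
    """
    if absent_split_mya < present_split_mya:
        raise ValueError("the lineage lacking the gene must have split earlier")
    return absent_split_mya, present_split_mya, absent_split_mya - present_split_mya


older, younger, width = origin_window(777, 736)
print(f"Origin between {older} and {younger} MYA (window of {width} million years)")
```

The estimate is only as precise as the two divergence dates, and it assumes the gene was never present in sponges rather than lost from them.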

Evolution

C5orf49 protein divergence graph

Compared with cytochrome c and fibrinogen alpha, C5orf49 has evolved at neither a notably fast nor a notably slow rate, as shown in the protein divergence graph.

References

  1. GRCh38: Ensembl release 89: ENSG00000215217 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000021534 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "C5orf49". GeneCards: Human Gene Database. Archived from the original on 2011-09-01.
  6. Sigg, Monika Abedin; Menchen, Tabea; Lee, Chanjae; Johnson, Jeffery; Jungnickel, Melissa K.; Choksi, Semil P.; Garcia, Galo; Busengdal, Henriette; Dougherty, Gerard; Pennekamp, Petra; Werner, Claudius (2017-12-18). "Evolutionary proteomics uncovers ancient associations of cilia with signaling pathways". Developmental Cell. 43 (6): 744–762.e11. doi:10.1016/j.devcel.2017.11.014. ISSN 1534-5807. PMC 5752135. PMID 29257953.
  7. "C5orf49 chromosome 5 open reading frame 49 [Homo sapiens (human)] – Gene – NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  8. "Genomatix: ElDorado entry on C5orf49". Genomatix Software Suite.
  9. "InterPro". www.ebi.ac.uk. Retrieved 2021-12-18.
  10. "C5orf49 (human)". www.phosphosite.org. Retrieved 2021-12-18.
  11. Wang, D (2020). "MusiteDeep: a deep-learning based webserver for protein post-translational modification site prediction and visualization". Nucleic Acids Research. 48 (W1): W140–W146. doi:10.1093/nar/gkaa275. PMC 7319475. PMID 32324217.
  12. "Home – GEO – NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  13. "IntAct Portal". www.ebi.ac.uk. Retrieved 2021-12-18.
  14. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  15. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2021-12-18.