C3orf62

C3orf62
Identifiers
Aliases: C3orf62, chromosome 3 open reading frame 62, MAPS
External IDs: MGI: 2148248; HomoloGene: 14230; GeneCards: C3orf62
Orthologs

| Species | Human | Mouse |
| --- | --- | --- |
| RefSeq (mRNA) | NM_198562 | NM_053216 |
| RefSeq (protein) | NP_940964 | NP_444446 |
| Location (UCSC) | Chr 3: 49.27 – 49.28 Mb | Chr 9: 108.27 – 108.29 Mb |
| PubMed search | [3] | [4] |
C3orf62
Identifiers
Symbol: C3orf62
Alt. names: CC062, FLJ43654
NCBI gene: 375341
HGNC: 24771
RefSeq: NM_198562.21
UniProt: Q6ZUJ4
Other data
Locus: Chr. 3 p21.31

Chromosome 3 open reading frame 62 (C3orf62) is a protein that in humans is encoded by the C3orf62 gene. C3orf62 is depleted in glycine relative to the glycine content of proteins in the rest of the genome. [5] C3orf62 has a KKXX-like motif and is predicted to be localized in the nucleus. [6] Expression of C3orf62 is highest in whole blood. [7]

Gene

Locus

C3orf62 is mapped to the reverse strand of chromosome 3 at 3p21.31 and spans 9,313 bases. [8] It starts 49,268,597 base pairs from the terminus of the short arm (pter) and ends 49,277,909 base pairs from pter. The gene is known to have 3 exons, 4 transcripts, and 37 orthologues. [9] [7] [10] [11] [12]
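
The 9,313-base span is consistent with the two coordinates when both endpoints are counted. A minimal Python sketch of that arithmetic follows; the function name `span_length` is illustrative and does not come from any of the cited databases.

```python
def span_length(start: int, end: int) -> int:
    """Length of a closed interval of base positions (both ends included)."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text, in base pairs from pter.
C3ORF62_START = 49_268_597
C3ORF62_END = 49_277_909

print(span_length(C3ORF62_START, C3ORF62_END))  # 9313
```

Counting only the difference between the coordinates would give 9,312, so the quoted figure appears to treat the interval as inclusive.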

Gene neighborhood

C3orf62 is flanked by Ubiquitin Specific Protease 4 (USP4) and Coiled-Coil Domain Containing 36 (CCDC36).

C3orf62 and its gene neighbors on chromosome 3 from NCBI

Aliases

C3orf62 possesses the following alternate names and synonyms: CC062; FLJ43654. [10] [13]

Protein

Primary sequence

The human C3orf62 protein (Q6ZUJ4) is 267 amino acids long and has a molecular mass of 30,194 daltons. [9] Its isoelectric point is roughly 5.2, so before post-translational modification C3orf62 is an acidic protein. [14] The unmodified protein is “glycine depleted” relative to the glycine content of proteins in the rest of the genome. [5] Glycine appears to be evenly distributed throughout the C3orf62 sequence, with no preferred regions of clustering. No charge clusters are present in C3orf62, and no specific spacing of cysteine residues is found.
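
Statements about glycine content and its distribution rest on simple residue counts. The sketch below shows one way such counts could be computed. It is not the SAPS method, which additionally compares composition against a reference set, and the sequence used is a made-up toy string, not the C3orf62 sequence.

```python
from collections import Counter


def residue_fraction(sequence: str, residue: str = "G") -> float:
    """Fraction of positions in `sequence` occupied by `residue`."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return Counter(seq)[residue.upper()] / len(seq)


def windowed_counts(sequence: str, residue: str = "G", window: int = 10) -> list[int]:
    """Count `residue` in consecutive non-overlapping windows.

    Roughly equal counts across windows suggest an even distribution;
    a few windows holding most of the residues suggest clustering.
    """
    seq = sequence.upper()
    return [seq[i:i + window].count(residue.upper())
            for i in range(0, len(seq), window)]


# Toy sequence for illustration only (hypothetical, NOT C3orf62).
toy = "MGKLSEEDAGLLKTPEQWGA"

print(residue_fraction(toy))   # 0.15
print(windowed_counts(toy))    # [2, 1]
```

To apply this to the real protein, the 267-residue sequence would need to be retrieved from UniProt entry Q6ZUJ4 and passed in place of `toy`.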

| Name | Ensembl Transcript ID [11] [7] | Base Pairs | Protein | Biotype | CCDS | Uniprot | Refseq |
| --- | --- | --- | --- | --- | --- | --- | --- |
| C3orf62-001 | ENST00000343010.7 | 4235 | 267aa | Protein encoding | CCDS2792 | Q6ZUJ4 | NM_198562, NP_940964 |
| C3orf62-004 | ENST00000436325.1 | 581 | 190aa | Protein encoding | - | C9JW57 | - |
| C3orf62-003 | ENST00000424960.1 | 602 | 98aa | Nonsense mediated decay | - | H7BZX3 | - |
| C3orf62-002 | ENST00000479673.1 | 3330 | No protein | Retained intron | - | - | - |

Domains and motifs

There are no known transmembrane domains for C3orf62. [13] C3orf62 has a KKXX-like motif in the C-terminus, a type of signal associated with the retrieval of endoplasmic reticulum (ER) membrane proteins from the Golgi apparatus. [15]
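
A C-terminal dilysine signal can be checked with a short pattern match. The sketch below tests for the classic forms, with lysines at positions -3 and -4 (KKXX) or -3 and -5 (KXKXX) from the C-terminus. The exact rule PSORT applies for a "KKXX-like" motif may be looser and is not reproduced here; the function name and the test strings are hypothetical.

```python
import re

# Classic dilysine patterns, anchored at the C-terminus:
#   KKXX  -> lysines at positions -4 and -3
#   KXKXX -> lysines at positions -5 and -3
DILYSINE_C_TERMINUS = re.compile(r"(KK..|K.K..)$")


def has_dilysine_signal(sequence: str) -> bool:
    """Return True if the sequence ends in a KKXX or KXKXX pattern."""
    return DILYSINE_C_TERMINUS.search(sequence.upper()) is not None


# Hypothetical peptide tails, for illustration only.
print(has_dilysine_signal("AAAKKLV"))   # True  (KKXX)
print(has_dilysine_signal("AAAKAKLV"))  # True  (KXKXX)
print(has_dilysine_signal("AAAKLVKK"))  # False (lysines at the very end)
```

A match on sequence alone is only suggestive: the motif is usually functional on membrane proteins, and no transmembrane domain is known for C3orf62.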

Secondary structure

Roughly seven alpha helices are predicted for C3orf62 by PELE protein structure prediction, and this is supported by secondary structure predictions for orthologous sequences from Ali2D. [13] [16]

Subcellular localization

C3orf62 is predicted to be localized in the nucleus. [6] The k-nearest neighbors algorithm predicts C3orf62 to be classified as follows: k=9/23; 69.6% nuclear, 13.0% mitochondrial, 13.0% cytoskeletal, 4.3% cytoplasmic. [6]
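
The reported percentages are what a vote among 23 nearest neighbors would produce if 16 were nuclear, 3 mitochondrial, 3 cytoskeletal, and 1 cytoplasmic. Those counts are an inference from the percentages, not figures stated by the source. A minimal Python sketch of the vote tally under that assumption:

```python
from collections import Counter


def knn_vote_percentages(neighbor_labels: list[str]) -> dict[str, float]:
    """Share of neighbors per class label, in percent, rounded to one decimal."""
    total = len(neighbor_labels)
    if total == 0:
        raise ValueError("at least one neighbor is required")
    counts = Counter(neighbor_labels)
    return {label: round(100 * n / total, 1) for label, n in counts.most_common()}


# Assumed neighbor counts (inferred from the quoted percentages, k = 23).
neighbors = (
    ["nuclear"] * 16
    + ["mitochondrial"] * 3
    + ["cytoskeletal"] * 3
    + ["cytoplasmic"] * 1
)

print(knn_vote_percentages(neighbors))
# {'nuclear': 69.6, 'mitochondrial': 13.0, 'cytoskeletal': 13.0, 'cytoplasmic': 4.3}
```

The class with the largest share, here nuclear, is taken as the predicted localization.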

Expression

C3orf62 is expressed in more than 30 different tissues, with the highest expression in whole blood. [10] [7] [9] Other tissues with high expression include lung, tonsil, trachea, small intestine, mammary gland, and salivary gland. Across various microarray studies, C3orf62 shows consistently high expression compared with the other genes tested in the datasets. [17] C3orf62 has low expression in brain tissues.

Diagram depicting the expression of C3orf62 in tissues throughout the body

Post-translational modifications

C3orf62 has two known post-translational modifications, both phosphorylation sites, at amino acids 210 and 224. [9] A natural variant is found at amino acid 110 (glutamic acid (E) → lysine (K)). [12] [11]

C3orf62 may have a YinOYang site at residue 115, meaning that this threonine residue is predicted to be O-GlcNAcylated as well as phosphorylated. This site may be reversibly and dynamically modified by O-GlcNAc or phosphate groups at different times in the cell. [18]

Regulation of expression

Thirteen promoters have been predicted for C3orf62. [19]

Transcript variants

Transcription of C3orf62 produces 5 alternatively spliced variants and 1 unspliced form. Of the four annotated transcripts, two are protein coding, one is subject to nonsense-mediated decay, and one is a retained intron. [10] QIAGEN lists the following transcription factor binding sites in the C3orf62 promoter: TFCP2, Pax-6, p53, MyoD, YY1, Ik-2, AREB6, IRF-7A3. [7]

Function

The function of C3orf62 is not currently understood.

Interactions

Upwards of 12 interacting proteins have been predicted for C3orf62. [20] [21] [22] The predicted interactors with the highest confidence include HAUS augmin-like complex subunit 1 (HAUS1), Inhibitor of growth protein 5 (ING5), Thioredoxin domain-containing protein 9 (TXNDC9), and MORF4-family associated proteins (MORF4L1, MRFAP1).

Chemicals known to interact with C3orf62 include the following: Aflatoxin B1, Hydralazine, Valproic acid, and Decitabine. [10]

Clinical significance

Interstitial deletions of chromosome 3 are rare, and only a few patients with a microdeletion of 3p21.31 have been reported to date. Characteristic clinical features found in patients with a microdeletion of 3p21.31 include developmental delay and distinctive facial features (including arched eyebrows, hypertelorism, epicanthus, and micrognathia). [23] [24] [25]

In the gene region, NCBI SNP identified 1,326 SNPs on the reverse (minus) strand of C3orf62. [26] In the coding region, NCBI SNP identified 147 common SNPs.
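
For a rough sense of scale, the gene-region SNP count can be converted to a density. The sketch below assumes that the 1,326 SNPs fall within the same 9,313-base span quoted in the Locus section; the source does not state this explicitly, so the result is illustrative only.

```python
def snps_per_kilobase(snp_count: int, region_length_bp: int) -> float:
    """Number of SNPs per 1,000 bases of the region."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return 1000 * snp_count / region_length_bp


# Figures quoted in the article; pairing them is an assumption.
density = snps_per_kilobase(snp_count=1_326, region_length_bp=9_313)
print(f"{density:.1f} SNPs per kb")  # 142.4 SNPs per kb
```

This corresponds to about one catalogued SNP every seven bases, counting every variant recorded in the database regardless of frequency.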

Homology

Paralogs

There are no known paralogs of C3orf62. [27]

Orthologs

The ortholog space of C3orf62 is fairly narrow, with the majority of orthologs found in mammals. [27] A small fraction of orthologs have also been found in the following classes: Reptilia, Sarcopterygii, and Actinopterygii.

Nearly all Mammalia ortholog sequences of C3orf62 fall within the following ranges: E-value 2e-94 to 1e-169; similarity 56-84%. Mammals in this group consist largely of primates but also include the following orders: Perissodactyla, Rodentia, Carnivora, Proboscidea, Cetartiodactyla, Cingulata, Artiodactyla, Eulipotyphla, Didelphimorphia, and Afrosoricida. [27]

More distantly related ortholog sequences of C3orf62 come from organisms in the classes Reptilia, Sarcopterygii, and Actinopterygii, with E-values ranging from 8e-10 to 3e-59 and similarity of 24-39%. [27] Organisms in this grouping belong to the orders Testudines, Coelacanthiformes, Squamata, and Osteoglossiformes. No ortholog sequences of C3orf62 were found for the following life forms: bacteria, archaea, protists, plants, fungi, Trichoplax, invertebrates, amphibians, or birds.

| Genus and Species | Common Name | Class | Accession | Percent Identity |
| --- | --- | --- | --- | --- |
| Homo sapiens | Human | Mammalia | NP_940964 | 100 |
| Microcebus murinus | Grey mouse lemur | Mammalia | XP_012626718 | 88 |
| Propithecus coquereli | Coquerel's sifaka (lemur) | Mammalia | XP_012510880 | 86.9 |
| Equus caballus | Horse | Mammalia | NP_001295877 | 84.3 |
| Loxodonta africana | African elephant | Mammalia | XP_003409711 | 83.2 |
| Castor canadensis | North American beaver | Mammalia | XP_020037316 | 81.6 |
| Otolemur garnettii | Garnett's greater galago | Mammalia | XP_003800633 | 81.6 |
| Camelus bactrianus | Bactrian camel | Mammalia | XP_010967491.1 | 78.3 |
| Ailuropoda melanoleuca | Giant panda | Mammalia | XP_019656626 | 77.7 |
| Canis lupus familiaris | Dog | Mammalia | XP_003432924 | 77.2 |
| Vicugna pacos | Alpaca | Mammalia | XP_006196356 | 77.2 |
| Condylura cristata | Star-nosed mole | Mammalia | XP_012575760 | 76.8 |
| Felis catus | Cat | Mammalia | XP_003982269 | 75.1 |
| Pteropus vampyrus | Large flying fox | Mammalia | XP_011373720 | 73.3 |
| Pantholops hodgsonii | Tibetan antelope | Mammalia | XP_005969318 | 72.6 |
| Ictidomys tridecemlineatus | Thirteen-lined ground squirrel | Mammalia | XP_005326967 | 71 |
| Sorex araneus | Common shrew | Mammalia | XP_012789682 | 69.5 |
| Monodelphis domestica | Gray short-tailed opossum | Mammalia | XP_001367907 | 65.4 |
| Echinops telfairi | Lesser hedgehog tenrec | Mammalia | XP_004715283 | 63.7 |
| Orcinus orca | Killer whale | Mammalia | XP_004283985 | 61.2 |
| Dasypus novemcinctus | Nine-banded armadillo | Mammalia | XP_004451950 | 58.2 |
| Dipodomys ordii | Ord's kangaroo rat | Mammalia | XP_012883511 | 56.3 |
| Myotis lucifugus | Little brown myotis | Mammalia | XP_006107033 | 39.3 |
| Pelodiscus sinensis | Chinese softshell turtle | Reptilia | XP_014426235 | 38.5 |
| Chelonia mydas | Green sea turtle | Reptilia | XP_007061837 | 37.1 |
| Latimeria chalumnae | West Indian Ocean coelacanth (fish) | Sarcopterygii | XP_005992740 | 35.3 |
| Anolis carolinensis | Green anole (lizard) | Reptilia | XP_008103227 | 33.1 |
| Gekko japonicus | Japanese gecko | Reptilia | XP_015262861 | 30.1 |
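
The split between closely and distantly related orthologs can be summarized by taking the range of percent identity within each class. The Python sketch below does this for the values in the table above; the variable and function names are illustrative, and the numbers are copied from the table rather than recomputed from alignments.

```python
from collections import defaultdict

# (species, class, percent identity to human C3orf62), copied from the table.
ORTHOLOGS = [
    ("Homo sapiens", "Mammalia", 100.0),
    ("Microcebus murinus", "Mammalia", 88.0),
    ("Propithecus coquereli", "Mammalia", 86.9),
    ("Equus caballus", "Mammalia", 84.3),
    ("Loxodonta africana", "Mammalia", 83.2),
    ("Castor canadensis", "Mammalia", 81.6),
    ("Otolemur garnettii", "Mammalia", 81.6),
    ("Camelus bactrianus", "Mammalia", 78.3),
    ("Ailuropoda melanoleuca", "Mammalia", 77.7),
    ("Canis lupus familiaris", "Mammalia", 77.2),
    ("Vicugna pacos", "Mammalia", 77.2),
    ("Condylura cristata", "Mammalia", 76.8),
    ("Felis catus", "Mammalia", 75.1),
    ("Pteropus vampyrus", "Mammalia", 73.3),
    ("Pantholops hodgsonii", "Mammalia", 72.6),
    ("Ictidomys tridecemlineatus", "Mammalia", 71.0),
    ("Sorex araneus", "Mammalia", 69.5),
    ("Monodelphis domestica", "Mammalia", 65.4),
    ("Echinops telfairi", "Mammalia", 63.7),
    ("Orcinus orca", "Mammalia", 61.2),
    ("Dasypus novemcinctus", "Mammalia", 58.2),
    ("Dipodomys ordii", "Mammalia", 56.3),
    ("Myotis lucifugus", "Mammalia", 39.3),
    ("Pelodiscus sinensis", "Reptilia", 38.5),
    ("Chelonia mydas", "Reptilia", 37.1),
    ("Latimeria chalumnae", "Sarcopterygii", 35.3),
    ("Anolis carolinensis", "Reptilia", 33.1),
    ("Gekko japonicus", "Reptilia", 30.1),
]


def identity_range_by_class(rows):
    """Map each class to its (lowest, highest) percent identity."""
    grouped = defaultdict(list)
    for _species, tax_class, identity in rows:
        grouped[tax_class].append(identity)
    return {cls: (min(vals), max(vals)) for cls, vals in grouped.items()}


for cls, (low, high) in identity_range_by_class(ORTHOLOGS).items():
    print(f"{cls}: {low}-{high}% identity")
```

In this table the only mammal below 56% identity is Myotis lucifugus (39.3%), which sits close to the reptile and coelacanth values.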

Phylogeny

The most distant orthologs of C3orf62 are found in species of fish and reptiles. Orthologs of C3orf62 are not seen in birds, invertebrates, or bacteria. [27]

References

  1. GRCh38: Ensembl release 89: ENSG00000188315 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000032611 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "SAPS". SDSC Biology Workbench. Retrieved 23 April 2017.
  6. "C3orf62 Homo sapiens". PSORT WWW Server. [permanent dead link]
  7. "Homo sapiens C3orf62". GeneCards. Retrieved 5 February 2017.
  8. "Homo sapiens C3orf62". NCBI Nucleotide. Retrieved 5 February 2017.
  9. "Homo sapiens C3orf62". NCBI Gene. Retrieved 5 February 2017.
  10. "Humans 2010-C3orf62". Aceview. Retrieved 5 February 2017.
  11. "C3orf62". UniProtKB.
  12. "C3orf62". Ensembl. Retrieved 5 February 2017.
  13. "Human Gene C3orf62". UCSC. Retrieved 5 February 2017.
  14. "PI". SDSC Biology Workbench.
  15. "C3orf62". PSORT WWW Server. Retrieved 7 May 2017. [permanent dead link]
  16. "C3orf62". Ali2D. Archived from the original on 22 December 2016. Retrieved 7 May 2017.
  17. "C3orf62 GEO Profiles". NCBI GEO. Retrieved 24 April 2017.
  18. "C3orf62". YinOYang. Retrieved 7 May 2017.
  19. "C3orf62". Genomatix. Archived from the original on 2 December 2021. Retrieved 7 May 2017.
  20. "C3orf62". STRING Interaction Network. Retrieved 7 May 2017.
  21. "C3orf62". BioGRID. Retrieved 7 May 2017.
  22. "C3orf62". IntAct. Retrieved 7 May 2017.
  23. Haldeman-Englert CR, Gai X, Perin JC, Ciano M, Halbach SS, Geiger EA, McDonald-McGinn DM, Hakonarson H, Zackai EH, Shaikh TH (13 Dec 2008). "A 3.1-Mb microdeletion of 3p21.31 associated with cortical blindness, cleft lip, CNS abnormalities, and developmental delay". European Journal of Medical Genetics. 52 (4): 265–8. doi:10.1016/j.ejmg.2008.11.005. PMC 4391973. PMID 19100872.
  24. Eto K, Sakai N, Shimada S, Shioda M, Ishigaki K, Hamada Y, Shinpo M, Azuma J, Tominaga K, Shimojima K, Ozono K, Osawa M, Yamamoto T (December 2013). "Microdeletions of 3p21.31 characterized by developmental delay, distinctive features, elevated serum creatine kinase levels, and white matter involvement". American Journal of Medical Genetics. Part A. 161A (12): 3049–56. doi:10.1002/ajmg.a.36156. PMID 24039031. S2CID 272908.
  25. Lovrecic L, Bertok S, Žerjav Tanšek M (May 2016). "A New Case of an Extremely Rare 3p21.31 Interstitial Deletion". Molecular Syndromology. 7 (2): 93–8. doi:10.1159/000445227. PMC 4906427. PMID 27385966.
  26. "C3orf62". NCBI SNP.
  27. "C3orf62". NCBI BLAST. Retrieved 7 May 2017.