C2orf73

Identifiers
Aliases: C2orf73, chromosome 2 open reading frame 73
External IDs: MGI: 1922337; HomoloGene: 18988; GeneCards: C2orf73

Orthologs (human and mouse)
RefSeq (mRNA), human: NM_001100396, NM_173486, NM_001369401, NM_001369403
RefSeq (mRNA), mouse: NM_001100394
RefSeq (protein), human: NP_001093866, NP_001356330, NP_001356332
RefSeq (protein), mouse: NP_001093864
Location (UCSC), human: Chr 2: 54.33 – 54.38 Mb
Location (UCSC), mouse: Chr 11: 30.38 – 30.42 Mb
PubMed search: human [3], mouse [4]

Uncharacterized protein C2orf73 is a protein that in humans is encoded by the C2orf73 gene. The protein is predicted to be localized to the nucleus.

Gene

The gene spans 53,712 base pairs and contains nine exons. It is located on chromosome 2 at position 2p16.2 in the human genome and is flanked by the genes ACYP2 and SPTBN1. [5] There are no aliases for this gene.

mRNA

The primary mRNA produced by the C2orf73 gene is 1921 nucleotides long. Six other mRNA isoforms are produced by alternative splicing and variation in exon length. [6]

Isoform | Exons | mRNA Length (bases)
Primary | 2, 3, 5, 6, 7 | 1921
X1 | 2, 3, 5, 6, 7 (truncated), 8, 9 | 1726
X2 | 2, 3 (truncated), 5, 6, 7 (truncated) | 971
X3 | 2 (truncated), 5, 6, 7 (truncated) | 868
X4 | 4, 5, 6, 7 (truncated) | 951
X5 | 1, 5, 6, 7 (truncated) | 1049
X6 | 2, 3, 5, 7 (truncated) | 1034
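
The exon usage listed for the seven mRNA isoforms can be compared programmatically. The following is a minimal sketch, assuming each isoform is represented simply as a set of exon numbers (truncation is ignored); the names `ISOFORM_EXONS`, `constitutive_exons` and `all_exons` are illustrative and do not come from any annotation tool.

```python
# Exon usage per mRNA isoform, transcribed from the isoform table.
# Assumption: a truncated exon still counts as "present" in the isoform.
ISOFORM_EXONS = {
    "Primary": {2, 3, 5, 6, 7},
    "X1": {2, 3, 5, 6, 7, 8, 9},
    "X2": {2, 3, 5, 6, 7},
    "X3": {2, 5, 6, 7},
    "X4": {4, 5, 6, 7},
    "X5": {1, 5, 6, 7},
    "X6": {2, 3, 5, 7},
}


def constitutive_exons(isoforms):
    """Return the exons that appear in every isoform."""
    return set.intersection(*isoforms.values())


def all_exons(isoforms):
    """Return every exon that appears in at least one isoform."""
    return set.union(*isoforms.values())


print("Shared by all isoforms:", sorted(constitutive_exons(ISOFORM_EXONS)))
print("Used by any isoform:", sorted(all_exons(ISOFORM_EXONS)))
```

Under these assumptions, exons 5 and 7 are the only ones present in every listed isoform, while all nine exons are used by at least one isoform.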

Protein

The protein has a molecular mass of 32,142 daltons. [7] There are four protein isoforms. The primary isoform (X1) is 287 amino acids long. [8]

C2orf73 contains a short sequence motif, GDWWSH, which does not yet have any known function. The protein is lysine-rich and leucine-poor compared with the average human protein and has a predicted isoelectric point of 9.305. [9]
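
A motif search and a residue-composition check of this kind can be sketched in a few lines. The example below is only an illustration: `EXAMPLE_SEQ` is a made-up placeholder sequence, not the real C2orf73 sequence, and the helper names `find_motif` and `residue_fraction` are hypothetical.

```python
import re
from collections import Counter

MOTIF = "GDWWSH"  # motif named in the text

# HYPOTHETICAL placeholder peptide, NOT the actual C2orf73 sequence.
EXAMPLE_SEQ = "MKKSLGDWWSHKKAKEL"


def find_motif(seq, motif=MOTIF):
    """Return the 1-based start position of each non-overlapping motif match."""
    return [m.start() + 1 for m in re.finditer(re.escape(motif), seq)]


def residue_fraction(seq, residue):
    """Return the fraction of positions in seq occupied by the given residue."""
    return Counter(seq)[residue] / len(seq)


print("Motif positions:", find_motif(EXAMPLE_SEQ))
print("Lysine fraction:", round(residue_fraction(EXAMPLE_SEQ, "K"), 3))
print("Leucine fraction:", round(residue_fraction(EXAMPLE_SEQ, "L"), 3))
```

To apply the sketch to the real protein, the placeholder would be replaced with the sequence retrieved for accession NP_001093866.1; judging whether the protein is lysine-rich or leucine-poor additionally requires reference composition values, which are not given here.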

Isoform | From mRNA Isoform | Length (Amino Acids) | Molecular Weight (kDa) | Isoelectric Point
X1 | Primary, X1 | 287 | 32.1 | 9.305
X2 | X2 | 229 | 25.4 | 9.120
X3 | X3, X4, X5 | 166 | 18.1 | 9.703
X4 | X6 | 143 | 16.7 | 8.790
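
As a rough plausibility check on the isoform table, a protein's mass can be approximated from its length. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value taken from the cited sources; exact masses depend on the actual amino acid composition. The names `ISOFORMS` and `estimate_mass_kda` are illustrative.

```python
# Rule-of-thumb assumption: ~110 Da per amino acid residue on average.
AVG_RESIDUE_MASS_DA = 110.0

# (length in amino acids, reported molecular weight in kDa) from the table.
ISOFORMS = {
    "X1": (287, 32.1),
    "X2": (229, 25.4),
    "X3": (166, 18.1),
    "X4": (143, 16.7),
}


def estimate_mass_kda(length_aa, avg_residue_mass_da=AVG_RESIDUE_MASS_DA):
    """Approximate a protein's mass in kDa from its length in residues."""
    return length_aa * avg_residue_mass_da / 1000.0


for name, (length_aa, reported_kda) in ISOFORMS.items():
    estimate = estimate_mass_kda(length_aa)
    print(f"{name}: estimated {estimate:.1f} kDa, reported {reported_kda} kDa")
```

With this approximation each estimate falls within about 1 kDa of the tabulated value, which is the level of agreement the rule of thumb is expected to give.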

Structure

A 3D structure for C2orf73 has not yet been determined experimentally. A computational prediction has been generated with I-TASSER. [10]

Predicted 3D structure of human C2orf73 protein generated by I-TASSER.

The PELE tool on Biology Workbench predicts three likely α-helices and one β-strand in the protein. [14]

Post-translational modifications

The GPS, NetPhos, MyHits and SUMOsp tools on ExPASy [15] predict potential post-translational modifications for the protein. Six potential phosphorylation sites and one sumoylation site are predicted.

Subcellular localization

PSORT II predicts C2orf73 to be localized to the nucleus. [16] This is consistent with the predicted sumoylation site, since sumoylation is involved in nucleocytoplasmic transport. [17]

Expression

GEO profiles from NCBI show that C2orf73 is weakly expressed in the following human tissues: bone marrow, liver, heart, lung, brain, spinal cord, skeletal muscle, thymus, and epithelium. [18]

Regulation of expression

The Genomatix El Dorado tool predicts that many transcription factors bind with high affinity within the 1100 base pairs upstream of C2orf73. Many of these transcription factors normally regulate processes such as cell development and differentiation, cell death, and the cell cycle. [19]

Interacting proteins

Three proteins have been shown experimentally to interact with C2orf73 in yeast two-hybrid experiments. [20]

Function

The function of C2orf73 is not yet well understood.

Homology

There are no paralogs of C2orf73 in the human genome. Orthologs are found throughout the phylum Chordata and are largely limited to it, with a few exceptions in other animal phyla, such as Octopus bimaculoides. [21]

Species | Common Name | NCBI Accession Number | Sequence Length (AA) | Millions of Years Since LCA [22] | % Identity | % Similarity
Homo sapiens | Human | NP_001093866.1 | 287 | - | - | -
Heterocephalus glaber | Naked mole rat | XP_004867342.1 | 235 | 90 | 62.2 | 68.2
Mus musculus | Mouse | NP_001093864.1 | 233 | 90 | 54.9 | 62.8
Fukomys damarensis | Damaraland mole-rat | XP_010614136.2 | 288 | 90 | 75.0 | 83.7
Pteropus vampyrus | Large Flying Fox | XP_011362281.1 | 291 | 96 | 77.7 | 82.8
Eptesicus fuscus | Big Brown Bat | XP_008160678.1 | 321 | 96 | 61.8 | 67.6
Rhinolophus sinicus | Chinese Rufous Horseshoe Bat | XP_019575083.1 | 301 | 96 | 71.4 | 79.4
Erinaceus europaeus | European Hedgehog | XP_007528011.1 | 284 | 96 | 63.8 | 71.4
Condylura cristata | Star nosed mole | XP_012586937.1 | 291 | 96 | 69.8 | 79.0
Camelus ferus | Wild Bactrian camel | XP_006174505.1 | 291 | 96 | 75.6 | 83.2
Capra hircus | Goat | XP_013823176.1 | 285 | 96 | 73.1 | 77.9
Bos taurus | Cattle | NP_001094753.1 | 290 | 96 | 75.5 | 81.0
Panthera pardus | Leopard | XP_019277335.1 | 292 | 96 | 75.0 | 82.2
Ursus maritimus | Polar Bear | XP_008698084.1 | 290 | 96 | 77.3 | 84.5
Falco peregrinus | Peregrine Falcon | XP_013152712.1 | 231 | 312 | 36.2 | 44.6
Apteryx mantelli | North Island Brown Kiwi | XP_013805202.1 | 197 | 312 | 36.9 | 41.9
Python bivittatus | Burmese Python | XP_007425859.1 | 314 | 312 | 30.8 | 45.0
Anolis carolinensis | Carolina anole | XP_003216202.2 | 320 | 312 | 35.3 | 42.7
Xenopus laevis | African Clawed Frog | XP_018118010.1 | 307 | 352 | 36.9 | 52.2
Nanorana parkeri | Frog | XP_018419829.1 | 307 | 352 | 36.4 | 45.8
Callorhinchus milii | Australian Ghostshark | XP_007890694.1 | 293 | 473 | 28.0 | 34.9
Ciona intestinalis | Sea squirt | XP_002125895.1 | 235 | 676 | 22.4 | 34.5
Octopus bimaculoides | California two-spot octopus | XP_014784430.1 | 242 | 797 | 22.4 | 30.0
Saccoglossus kowalevskii | Acorn Worm | XP_002735239.2 | 232 | 684 | 17.9 | 27.9
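
The relationship between divergence time and sequence identity in the ortholog table can be summarized by averaging identity within each divergence-time group. The sketch below is one simple way to do so, using only the divergence times and percent identities from the table; the names `ORTHOLOGS` and `mean_identity_by_divergence` are illustrative.

```python
from collections import defaultdict

# (species, millions of years since last common ancestor, % identity to human),
# transcribed from the ortholog table.
ORTHOLOGS = [
    ("Heterocephalus glaber", 90, 62.2),
    ("Mus musculus", 90, 54.9),
    ("Fukomys damarensis", 90, 75.0),
    ("Pteropus vampyrus", 96, 77.7),
    ("Eptesicus fuscus", 96, 61.8),
    ("Rhinolophus sinicus", 96, 71.4),
    ("Erinaceus europaeus", 96, 63.8),
    ("Condylura cristata", 96, 69.8),
    ("Camelus ferus", 96, 75.6),
    ("Capra hircus", 96, 73.1),
    ("Bos taurus", 96, 75.5),
    ("Panthera pardus", 96, 75.0),
    ("Ursus maritimus", 96, 77.3),
    ("Falco peregrinus", 312, 36.2),
    ("Apteryx mantelli", 312, 36.9),
    ("Python bivittatus", 312, 30.8),
    ("Anolis carolinensis", 312, 35.3),
    ("Xenopus laevis", 352, 36.9),
    ("Nanorana parkeri", 352, 36.4),
    ("Callorhinchus milii", 473, 28.0),
    ("Ciona intestinalis", 676, 22.4),
    ("Saccoglossus kowalevskii", 684, 17.9),
    ("Octopus bimaculoides", 797, 22.4),
]


def mean_identity_by_divergence(rows):
    """Map each divergence time (MYA) to the mean % identity of its species."""
    groups = defaultdict(list)
    for _species, mya, identity in rows:
        groups[mya].append(identity)
    return {mya: sum(vals) / len(vals) for mya, vals in sorted(groups.items())}


for mya, mean_identity in mean_identity_by_divergence(ORTHOLOGS).items():
    print(f"{mya:>4} MYA: mean identity {mean_identity:.1f}%")
```

On these figures the mammalian orthologs (90 to 96 million years) average well above 60% identity, while the more distant groups fall below 40%, although the trend is not strictly monotonic across individual species.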

References

  1. GRCh38: Ensembl release 89: ENSG00000177994 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000040919 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Homo sapiens chromosome 2 open reading frame 73 (C2orf73), mRNA - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 28 April 2017.
  6. "C2orf73 chromosome 2 open reading frame 73 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 28 April 2017.
  7. "C2orf73 Gene - GeneCards | CB073 Protein | CB073 Antibody". www.genecards.org. Retrieved 28 April 2017.
  8. "C2orf73 chromosome 2 open reading frame 73 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 28 April 2017.
  9. "SDSC Biology Workbench". workbench.sdsc.edu. Retrieved 28 April 2017.
  10. "I-TASSER server for protein structure and function prediction". zhanglab.ccmb.med.umich.edu. Retrieved 28 April 2017.
  11. Zhang Y (January 2008). "I-TASSER server for protein 3D structure prediction". BMC Bioinformatics. 9 (1): 40. doi:10.1186/1471-2105-9-40. PMC 2245901. PMID 18215316.
  12. Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y (January 2015). "The I-TASSER Suite: protein structure and function prediction". Nature Methods. 12 (1): 7–8. doi:10.1038/nmeth.3213. PMC 4428668. PMID 25549265.
  13. Roy A, Kucukural A, Zhang Y (April 2010). "I-TASSER: a unified platform for automated protein structure and function prediction". Nature Protocols. 5 (4): 725–38. doi:10.1038/nprot.2010.5. PMC 2849174. PMID 20360767.
  14. "SDSC Biology Workbench". workbench.sdsc.edu. Retrieved 28 April 2017.
  15. "ExPASy: SIB Bioinformatics Resource Portal - Categories". www.expasy.org. Retrieved 28 April 2017.
  16. "PSORT II Prediction". psort.hgc.jp. Retrieved 28 April 2017.
  17. Hay RT (April 2005). "SUMO: a history of modification". Molecular Cell. 18 (1): 1–12. doi:10.1016/j.molcel.2005.03.012. PMID 15808504.
  18. "Home - GEO - NCBI". www.ncbi.nlm.nih.gov. Retrieved 28 April 2017.
  19. "Genomatix - NGS Data Analysis & Personalized Medicine". www.genomatix.de. Retrieved 28 April 2017.
  20. "PSICQUIC View". www.ebi.ac.uk. Retrieved 28 April 2017.
  21. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (September 1997). "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs". Nucleic Acids Research. 25 (17): 3389–402. doi:10.1093/nar/25.17.3389. PMC 146917. PMID 9254694.
  22. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 28 April 2017.