C2orf72

C2orf72
Identifiers
Aliases: C2orf72, chromosome 2 open reading frame 72
External IDs: MGI: 1920042; HomoloGene: 54780; GeneCards: C2orf72

Orthologs

| | Human | Mouse |
|---|---|---|
| Ensembl | ENSG00000204128 [1] | ENSMUSG00000026227 [2] |
| RefSeq (mRNA) | NM_001144994 | NM_001144992, NM_001144993 |
| RefSeq (protein) | NP_001138466 | NP_001138464, NP_001138465 |
| Location (UCSC) | Chr 2: 231.04 – 231.05 Mb | Chr 1: 85.97 – 85.98 Mb |
| PubMed search | [3] | [4] |

C2orf72 (Chromosome 2, Open Reading Frame 72) is a gene in humans (Homo sapiens) that encodes a protein currently named after the gene, C2orf72. [5] The gene is also designated LOC257407 [5] and can be found under GenBank accession code NM_001144994.2. [6] The protein can be found under UniProt accession code A6NCS6. [7]

This gene is primarily expressed in liver, brain, placenta, and small intestine. [8] C2orf72 is an intracellular protein predicted to reside in the nucleus, cytosol, and plasma membrane. [5] Its function is unknown, but it is predicted to be involved in very-low-density lipoprotein particle assembly and in the regulation of cholesterol esterification. [9] This prediction is consistent with reports that both estradiol [10] and testosterone [11] upregulate expression of C2orf72. [12]

Gene

Figure: Ideogram of human chromosome 2.
Figure: C2orf72 locus at 2q37.1 (marked by the red band).

Locus

C2orf72 is a protein-coding gene found on the forward (+) strand of chromosome 2 at the locus 2q37.1, on the long arm of the chromosome. [5]

mRNA

The C2orf72 mRNA transcript is reported to be about 3,629 bases long. [6] It appears to have two polyadenylation sites near the 3′ end of the transcript, each preceded by a polyadenylation signal sequence such as ATTAAA or AATAAA. [6]
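As a rough illustration of how such signal sequences can be located, the sketch below scans a sequence for the two hexamers named above. The function name `find_polya_signals` and the toy sequence are hypothetical and chosen only for illustration; the real transcript would have to be taken from the GenBank record.

```python
import re

# Polyadenylation signal hexamers named in the text (DNA alphabet).
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(seq):
    """Return sorted (0-based position, hexamer) pairs for each signal found."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    for signal in POLYA_SIGNALS:
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?={signal})", seq):
            hits.append((match.start(), signal))
    return sorted(hits)


# Hypothetical toy sequence, NOT the real C2orf72 transcript.
toy_transcript = "GGCATTAAACCGTTGCAAGGAATAAAGTCTTCA"
polya_hits = find_polya_signals(toy_transcript)
print(polya_hits)
```

A hexamer match alone does not establish a functional polyadenylation site; it only flags candidate positions for comparison with the annotated record.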

There are three predicted exons reported for human C2orf72. [6]

Expression pattern

C2orf72 is preferentially expressed in brain, liver, placenta, colon, small intestine, gallbladder, stomach, and prostate, and to a lesser extent in adrenal gland, appendix, pancreas, lung, kidney, testis, and urinary bladder. [8]

Predicted biological functions

Figure: Top ten predicted biological processes (GO) for C2orf72 via ARCHS4 (https://maayanlab.cloud/archs4/gene/C2ORF72), as of July 16, 2022.

ARCHS4 [13] predicts (as of July 16, 2022) that this gene may be involved in very-low-density lipoprotein particle assembly [14] and in the regulation of cholesterol esterification. [9]

Regulation

Gene-level regulation

Gene perturbation data

In a study of embryonic liver samples lacking hepatocyte nuclear factor 4 alpha (HNF4α), the expression of C2orf72 was downregulated. [15]

Both estradiol [10] and testosterone [11] upregulate expression of C2orf72. [12]

Expression pattern

C2orf72 mRNA and protein products are found preferentially in the liver, kidney, and placenta. [16] The protein is localized to the cell membrane and cytoplasm in liver, brain, and placental tissues. [16]

Transcript-level regulation

miR-1271-5p is a microRNA that could bind to the 3′ untranslated region of the C2orf72 mRNA transcript at 5′-...GUGCCAA...-3′. [6] [17] [18]
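The quoted site can be related to a microRNA seed by base pairing: the bases that would pair with a site on the mRNA are its reverse complement. The sketch below derives that complementary sequence from the site given above and searches a 3′ UTR for the site. The helper names and the toy UTR are hypothetical; target-prediction tools such as those cited apply further criteria beyond simple string matching.

```python
# Watson-Crick complements for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence, written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def find_sites(utr, site):
    """Return 0-based start positions of every occurrence of site in utr."""
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)
    return positions


target_site = "GUGCCAA"  # site on the mRNA, as quoted in the text
pairing_seed = reverse_complement_rna(target_site)  # bases that would pair with it

# Hypothetical toy 3' UTR fragment, NOT the real C2orf72 sequence.
toy_utr = "AAUCGUGCCAAUUGCAGGUGCCAAGC"
site_positions = find_sites(toy_utr, target_site)
print(pairing_seed, site_positions)
```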

Protein-level regulation

Predicted phosphorylation sites

There are at least two predicted phosphorylation sites for the human C2orf72 protein, one at threonine-286 and the other at serine-294. [7]

Figure: Schematic of C2orf72 showing three possible phosphorylation sites: serine-294, threonine-286, and serine-277.

Protein

Figure: AlphaFold structure prediction for the C2orf72 protein (https://alphafold.ebi.ac.uk/entry/A6NCS6), as of July 16, 2022.
Figure: Conceptual translation of the C2orf72 mRNA (three pages), July 16, 2022.

Human protein

The predicted molecular weight of C2orf72 is 30.5 kDa, [19] and its predicted isoelectric point (pI) is 8.7. [20]
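For orientation, a molecular weight of this kind can be estimated by summing average residue masses and adding one water for the chain termini. The following is a minimal sketch under that assumption; it is not the method used by the cited tools, the function name `molecular_weight_kda` is hypothetical, and the toy peptide is not the C2orf72 sequence.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight_kda(protein):
    """Estimate the average molecular weight of an unmodified chain in kDa."""
    total = sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER_MASS
    return total / 1000.0


# Hypothetical toy peptide, NOT the real C2orf72 sequence.
toy_mw = molecular_weight_kda("MACE")
print(f"{toy_mw:.4f} kDa")
```

The estimate ignores post-translational modifications such as phosphorylation, so it is a lower-level approximation of what dedicated tools report.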

There are eight cysteine residues, for a potential of four disulfide bonds. [21] Most of the cysteine residues are positioned next to a polar amino acid (uncharged or positively or negatively charged). [21]

At physiological pH, there are 33 positively charged amino acid residues, including histidine, most of which are arginines. [21] Likewise, there are 33 negatively charged amino acid residues, most of which are glutamates. [21]

There are 14 hydroxyl-containing residues (tyrosine, threonine or serine) that could serve as typical phosphorylation sites; most of these are serines. [21]
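Counts like the ones in the three paragraphs above can be reproduced from a one-letter protein sequence with a simple tally. The sketch below is illustrative only: `summarize_composition` is a hypothetical helper, histidine is counted as positively charged to follow the text, and the toy peptide is not the C2orf72 sequence.

```python
from collections import Counter

POSITIVE = set("KRH")  # histidine counted as positive, following the text
NEGATIVE = set("DE")
HYDROXYL = set("STY")  # residues that could serve as phosphorylation sites


def summarize_composition(protein):
    """Tally the residue classes discussed above for a one-letter sequence."""
    counts = Counter(protein.upper())
    cysteines = counts["C"]
    return {
        "cysteines": cysteines,
        "max_disulfide_bonds": cysteines // 2,  # each bond uses two cysteines
        "positive": sum(counts[aa] for aa in POSITIVE),
        "negative": sum(counts[aa] for aa in NEGATIVE),
        "hydroxyl": sum(counts[aa] for aa in HYDROXYL),
    }


# Hypothetical toy peptide, NOT the real C2orf72 sequence.
toy_peptide = "MCRKDESTYCHCCE"
composition = summarize_composition(toy_peptide)
print(composition)
```

Applied to a sequence with eight cysteines, the same integer division gives the upper bound of four disulfide bonds stated above; whether any bond actually forms depends on structure and cellular compartment.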

Figure: I-TASSER protein structure prediction for the predicted C2orf72 amino acid sequence (raw form), calculated in approximately July 2021. [22]

Interacting proteins

These proteins have been reported to interact with human C2orf72: RASN (GTPase NRas), [23] RASK (GTPase KRas), [23] and CD81. [24] [25]

Homology

There are at least 203 organisms with an ortholog of C2orf72. [26] The most evolutionarily distant reported ortholog of C2orf72 is in the Australian ghost shark (Callorhinchus milii), [27] [28] [29] and the gene is broadly conserved from Actinopterygii (bony fish) to Mammalia.

Table 1. Examples of species reported to have C2orf72 orthologs

| Genus and species | Common name | Order | Date of divergence from human (million years ago) | GenBank accession code | Sequence length | Sequence identity (%) | Sequence similarity (%) |
|---|---|---|---|---|---|---|---|
| Pan troglodytes | Chimpanzee | Primates | 6.7 | XP_516141.5 | 295 | 98.6 | 98.6 |
| Pongo abelii | Sumatran orangutan | Primates | 15.76 | XP_024099683.1 | 295 | 95.3 | 96.9 |
| Castor canadensis | American beaver | Rodentia | 90 | XP_020011841.1 | 282 | 77.6 | 82.4 |
| Oryx dammah | Scimitar-horned oryx | Artiodactyla | 96 | XP_040084064.1 | 285 | 74.6 | 79.3 |
| Sus scrofa | Wild boar | Artiodactyla | 96 | XP_005657646.1 | 282 | 75.3 | 80.7 |
| Tursiops truncatus | Common bottlenose dolphin | Cetacea | 96 | XP_033715450.1 | 285 | 76.9 | 80.7 |
| Felis catus | Domestic cat | Carnivora | 96 | XP_023115562.1 | 286 | 80.1 | 83.1 |
| Eptesicus fuscus | Big brown bat | Chiroptera | 96 | XP_027993078.1 | 151 | 36.1 | 38.9 |
| Corapipo altera | White-ruffed manakin | Passeriformes | 312 | XP_027503457.1 | 181 | 26.7 | 34.0 |
| Pipra filicauda | Wire-tailed manakin | Passeriformes | 312 | XP_027606890.1 | 243 | 34.7 | 45.2 |
| Taeniopygia guttata | Zebra finch | Passeriformes | 312 | XP_030136117.3 | 255 | 35.1 | 45.4 |
| Corvus cornix cornix | Hooded crow | Passeriformes | 312 | XP_039412719.1 | 245 | 36.0 | 45.3 |
| Hirundo rustica | Barn swallow | Passeriformes | 312 | XP_039930397.1 | 243 | 37.0 | 46.7 |
| Aythya fuligula | Tufted duck | Anseriformes | 312 | XP_032049188 | 251 | 36.3 | 46.7 |
| Anas platyrhynchos | Mallard | Anseriformes | 312 | XP_038039556.1 | 251 | 36.3 | 46.7 |
| Protobothrops mucrosquamatus | Brown-spotted pit viper | Squamata | 312 | XP_029139335.1 | 278 | 22.9 | 34.5 |
| Python bivittatus | Burmese python | Squamata | 312 | XP_025023716.1 | 279 | 23.3 | 35.9 |
| Pseudonaja textilis | Eastern brown snake | Squamata | 312 | XP_026577460.1 | 272 | 31.6 | 41.0 |
| Pantherophis guttatus | Corn snake | Squamata | 312 | XP_034263860.1 | 252 | 33.0 | 42.5 |
| Pogona vitticeps | Central bearded dragon | Squamata | 312 | XP_020657305.1 | 295 | 24.1 | 34.0 |
| Zootoca vivipara | Common lizard | Squamata | 312 | XP_034989711.1 | 285 | 37.9 | 48.6 |
| Lacerta agilis | Sand lizard | Squamata | 312 | XP_033004091.1 | 289 | 38.0 | 49.5 |
| Podarcis muralis | Common wall lizard | Squamata | 312 | XP_028587763.1 | 272 | 38.7 | 50.8 |
| Gopherus evgoodei | Goode's thornscrub tortoise | Testudines | 312 | XP_030431493.1 | 481 | 24.2 | 31.1 |
| Terrapene carolina triunguis | Three-toed box turtle | Testudines | 312 | XP_029766982.1 | 262 | 35.1 | 43.2 |
| Chrysemys picta bellii | Painted turtle | Testudines | 312 | XP_023966073.1 | 306 | 36.6 | 47.4 |
| Dermochelys coriacea | Leatherback sea turtle | Testudines | 312 | XP_038272534.1 | 271 | 38.1 | 48.1 |
| Mauremys reevesii | Reeves' turtle | Testudines | 312 | XP_039344659.1 | 277 | 39.5 | 51.4 |
| Nanorana parkeri | High Himalaya frog | Anura | 351.8 | XP_018432004.1 | 304 | 27.3 | 40.1 |
| Xenopus tropicalis | Tropical clawed frog | Anura | 351.8 | XP_002937397.3 | 289 | 30.7 | 42.4 |
| Rhinatrema bivittatum | Two-lined caecilian | Gymnophiona | 351.8 | XP_029473197.1 | 358 | 30.3 | 36.1 |
| Geotrypetes seraphini | Gaboon caecilian | Gymnophiona | 351.8 | XP_033814148.1 | 233 | 33.9 | 44.2 |
| Parambassis ranga | Indian glass fish | Perciformes | 435 | XP_028260036.1 | 334 | 19.7 | 34.5 |
| Acanthochromis polyacanthus | Spiny chromis | Perciformes | 435 | XP_022050415.1 | 317 | 21.8 | 35.6 |
| Acanthopagrus latus | Yellowfin seabream | Perciformes | 435 | XP_036971960.1 | 309 | 22.0 | 35.5 |
| Cyprinodon tularosa | White Sands pupfish | Cyprinodontiformes | 435 | XP_038147473.1 | 296 | 20.1 | 33.1 |
| Esox lucius | Northern pike | Esociformes | 435 | XP_012990404.1 | 332 | 20.6 | 33.1 |
| Thunnus maccoyii | Southern bluefin tuna | Scombriformes | 435 | XP_042273029.1 | 329 | 20.2 | 34.0 |
| Syngnathus acus | Greater pipefish | Syngnathiformes | 435 | XP_037106050.1 | 274 | 19.5 | 34.9 |
| Callorhinchus milii | Australian ghost shark | Chimaeriformes | 473 | XP_007887618.1 | 413 | 17.6 | 26.5 |
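
Identity percentages such as those in Table 1 are computed from pairwise alignments of each ortholog against the human protein. The sketch below shows one common convention, counting identical columns over the full alignment length including gaps. The source does not state which convention or alignment tool was used, so this is an assumption; `percent_identity` is a hypothetical helper and the toy alignment is not real C2orf72 data.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment ('-' marks a gap), NOT real C2orf72 orthologs.
human_like = "MKT-LLVA"
other_like = "MRTALL-A"
identity = percent_identity(human_like, other_like)
print(f"{identity:.1f}%")
```

Similarity percentages additionally count columns with chemically similar residues, which requires a substitution matrix and is left out of this sketch.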

References

  1. GRCh38: Ensembl release 89: ENSG00000204128. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000026227. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "C2orf72 GeneCards". www.genecards.org. Retrieved 2021-08-02.
  6. "Homo sapiens chromosome 2 open reading frame 72 (C2orf72), mRNA". 2020-12-12 via NCBI Nucleotide.
  7. "iPTMnet Report A6NCS6 C2orf72". research.bioinformatics.udel.edu. Retrieved 2021-08-02.
  8. "C2orf72 chromosome 2 open reading frame 72 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-08-02.
  9. "ARCHS4". maayanlab.cloud. Retrieved 2021-08-02.
  10. "Gene Set - estradiol_homo sapiens_gpl570_gds3283". maayanlab.cloud. Retrieved 2021-08-02.
  11. "Gene Set - testosterone_mus musculus_gpl1261_gse17553". maayanlab.cloud. Retrieved 2021-08-02.
  12. "Gene - C2ORF72". maayanlab.cloud. Retrieved 2021-08-02.
  13. "ARCHS4". maayanlab.cloud. Retrieved 2022-07-16.
  14. "QuickGO". www.ebi.ac.uk. Retrieved 2022-07-16.
  15. "Gene Set - hnf4a_16714383_e18dot5_liver_lof_mouse_gpl1261_gds1916". maayanlab.cloud. Retrieved 2021-08-02.
  16. "Tissue expression of C2orf72 - Summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2021-08-02.
  17. "miRDB - MicroRNA Target Prediction Database". mirdb.org. Retrieved 2021-08-02.
  18. "TargetScanHuman 7.2". www.targetscan.org. Retrieved 2021-08-02.
  19. "C2orf72 protein expression summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2021-08-02.
  20. "Compute pI/MW - SIB Swiss Institute of Bioinformatics | Expasy". www.expasy.org. Retrieved 2021-08-02.
  21. "uncharacterized protein C2orf72 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-08-02.
  22. "I-TASSER server for protein structure and function prediction". zhanggroup.org. Retrieved 2022-07-16.
  23. Kovalski JR, Bhaduri A, Zehnder AM, Neela PH, Che Y, Wozniak GG, Khavari PA (February 2019). "The Functional Proximal Proteome of Oncogenic Ras Includes mTORC2". Molecular Cell. 73 (4): 830–844.e12. doi:10.1016/j.molcel.2018.12.001. PMC 6386588. PMID 30639242.
  24. Bruening J, Lasswitz L, Banse P, Kahl S, Marinach C, Vondran FW, et al. (July 2018). "Hepatitis C virus enters liver cells using the CD81 receptor complex proteins calpain-5 and CBLB". PLOS Pathogens. 14 (7): e1007111. doi:10.1371/journal.ppat.1007111. PMC 6053247. PMID 30024968.
  25. "HitPredict - High confidence protein-protein interactions". www.hitpredict.org. Retrieved 2021-08-02.
  26. "C2orf72 orthologs". NCBI. Retrieved 2021-08-04.
  27. "LOC103176070 uncharacterized protein C2orf72 homolog [Callorhinchus milii (elephant shark)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-08-04.
  28. "uncharacterized protein C2orf72 homolog [Callorhinchus milii] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-08-08.
  29. "PREDICTED: uncharacterized protein C2orf72 homolog Callorhinchus mili - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-08-08.