C12orf29

Identifiers
Aliases: C12orf29, chromosome 12 open reading frame 29, LOC91298, FLJ38158, MGC102978, DKFZp313K0436, DKFZp434N2030, DKFZp686L04169
External IDs: MGI: 1921197; HomoloGene: 18409; GeneCards: C12orf29; OMA: C12orf29 - orthologs
Orthologs

| Species | Human | Mouse |
| --- | --- | --- |
| RefSeq (mRNA) | NM_001009894 | NM_175128 |
| RefSeq (protein) | NP_001009894 | NP_780337 |
| Location (UCSC) | Chr 12: 88.03 – 88.05 Mb | Chr 10: 100.41 – 100.43 Mb |
| PubMed search | [3] | [4] |

C12orf29 is a protein that in humans is encoded by the C12orf29 (chromosome 12 open reading frame 29) gene. The gene is ubiquitously expressed across tissues. [5] The protein is 325 amino acids long. Its biological process has been annotated as hematopoietic progenitor cell differentiation. [6] The molecular and cellular functions of C12orf29 are not yet well understood.

Research suggests that C12orf29 is a potential structural protein in skeletal tissue with a role in the extracellular matrix of articular and growth cartilage. Its expression is increased in osteosarcoma (OS) tumor cells and other tumor cells, and it could serve as a biomarker for detecting osteosarcoma.

Gene

The human C12orf29 gene is located on the positive strand at 12q21.32 (q denotes the long arm of the chromosome). It has 7 exons and 6 introns. The gene spans positions 88,035,536 to 88,050,160, a length of 14,625 base pairs. The gene has a single transcript, with no alternative isoforms. [5]
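The length of the gene follows directly from its start and end coordinates. The following is a minimal sketch, assuming the two coordinates quoted above are 1-based and inclusive; the function and variable names are illustrative and do not come from NCBI.

```python
def span_length(start: int, end: int) -> int:
    """Length in base pairs of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Both endpoints belong to the interval, hence the +1.
    return end - start + 1


# Coordinates of human C12orf29 as quoted in this section (assumed inclusive).
GENE_START = 88_035_536
GENE_END = 88_050_160

gene_length = span_length(GENE_START, GENE_END)
print(f"{gene_length:,} bp")
```

With these coordinates the sketch reports 14,625 bp.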

An overview of the human C12orf29 gene transcript from NCBI Gene.

The genes neighboring human C12orf29, with their strand in parentheses, are C12orf50 (-), RNA5SP364 (+), LOC107984542 (-), LOC100420011 (+), and CEP290 (-).

Neighbor genes of human C12orf29 from NCBI Gene.

Expression

The C12orf29 gene is ubiquitously expressed in 27 tissues. [5] Expression is slightly higher in the esophagus, skin, brain and bone marrow. [5] Increased expression has been reported in several tumor cells and tissues, including colorectal tumor tissue, [7] ovarian cancer epithelial cells, [8] and epithelial cells of hyperplastic enlarged lobular units. [9]

Protein

The C12orf29 protein has 325 amino acids. [10] Its molecular weight is 37.5 kDa, and its isoelectric point is 6.6. [11] It is not a membrane protein and is predicted to remain in the cytosol. [12] Compared with other human proteins, C12orf29 is rich in asparagine and histidine and poor in alanine. [11] There are no validated domains or motifs. [10] The protein contains a short repeated sequence, "INGNP", but it is not conserved in orthologs. [11] Two predicted motifs are highly conserved: the caspase 3 and 7 protease cleavage motif and the mitogen-activated protein kinase (MAPK) docking motif. [13] [14] Several protein kinase C and casein kinase II phosphorylation sites are also predicted, but they are not conserved in orthologs. [13] [14]

  1  MKRLGSVQRK MPCVFVTEVK EEPSSKREHQ PFKVLATETV SHKALDADIY SAIPTEKVDG
 61  TCCYVTTYKD QPYLWARLDR KPNKQAEKRF KNFLHSKENP KEFFWNVEED FKPAPECWIP
121  AKETEQINGN PVPDENGHIP GWVPVEKNNK QYCWHSSVVN YEFEIALVLK HHPDDSGLLE
181  ISAVPLSDLL EQTLELIGTN INGNPYGLGS KKHPLHLLIP HGAFQIRNLP SLKHNDLVSW
241  FEDCKEGKIE GIVWHCSDGC LIKVHRHHLG LCWPIPDTYM NSRPVIINMN LNKCDSAFDI
301  KCLFNHFLKI DNQKFVRLKD IIFDV
Some SNPs and predicted post-translational modifications of the C12orf29 protein. (Green double underline represents beta-sheet, pink squiggly underline represents alpha-helix, and bolded letters represent the amino acids conserved in distant orthologs.)
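The sequence length, the positions of the "INGNP" repeat, and the raw amino acid composition can be checked directly from the sequence listed above. The following is a minimal sketch using only the Python standard library; the helper names are illustrative, and it does not reproduce the SAPS comparison against other human proteins, which needs a reference composition that is not given here.

```python
from collections import Counter

# Human C12orf29 protein sequence as listed in this section (60 residues per line).
SEQUENCE = "".join([
    "MKRLGSVQRKMPCVFVTEVKEEPSSKREHQPFKVLATETVSHKALDADIYSAIPTEKVDG",
    "TCCYVTTYKDQPYLWARLDRKPNKQAEKRFKNFLHSKENPKEFFWNVEEDFKPAPECWIP",
    "AKETEQINGNPVPDENGHIPGWVPVEKNNKQYCWHSSVVNYEFEIALVLKHHPDDSGLLE",
    "ISAVPLSDLLEQTLELIGTNINGNPYGLGSKKHPLHLLIPHGAFQIRNLPSLKHNDLVSW",
    "FEDCKEGKIEGIVWHCSDGCLIKVHRHHLGLCWPIPDTYMNSRPVIINMNLNKCDSAFDI",
    "KCLFNHFLKIDNQKFVRLKDIIFDV",
])


def find_all(sequence, motif):
    """Return the 1-based start position of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = sequence.find(motif, start + 1)
    return positions


def composition(sequence):
    """Return each residue's share of the sequence as a percentage."""
    counts = Counter(sequence)
    return {residue: 100.0 * n / len(sequence) for residue, n in counts.items()}


repeat_positions = find_all(SEQUENCE, "INGNP")
percent = composition(SEQUENCE)

print("length:", len(SEQUENCE))
print("INGNP starts at:", repeat_positions)
for residue in "NHA":  # asparagine, histidine, alanine
    print(residue, f"{percent[residue]:.1f}%")
```

The sketch finds two copies of the repeat, starting at residues 127 and 201.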

Protein interaction

NCL and LRRK2 were experimentally determined to interact with C12orf29: NCL physically interacts with C12orf29, and LRRK2 is associated with it. [19] Several further interactions are predicted. C12orf50, C1orf186/RHEX, TOR4A, and FAM171A1 are associated with C12orf29 by text-mining, but the specific type of interaction has not yet been studied. [20] Interactions with PNN, OTUD6B, and ACTR6 are predicted from co-expression. [20]

Predicted protein interactions of C12orf29 from STRING. (Green lines represent relations predicted by text-mining; black lines represent relations predicted by co-expression.)

Homology and evolution

Orthologs of the C12orf29 protein are found in mammals, birds, reptiles, amphibians, fishes, and invertebrates. [21] The length and content of the protein sequence are highly conserved in the selected orthologs (table below). [18] Among invertebrates, C12orf29 was found only in mollusks, tunicates, and cephalochordates; it was not found in insects, arachnids, crustaceans, corals, worms, jellyfishes, or sponges. [17] [22] It was also not found in bacteria, fungi, or viruses. [17] C12orf29 has no paralogs. [17] The mutation rate of C12orf29 is close to that of the fibrinogen alpha chain. Human C12orf29 is more closely related to the sheep ortholog than to the mouse ortholog, and the invertebrate orthologs are the most distantly related to the human protein.

Ortholog table

Selected orthologs of the C12orf29 protein [18] [21] [17] [22] [23]

| Sequence Number | Ortholog Group | Genus and Species | Taxonomic Group | Common Name | Time Since Divergence (Estimated MYA) | Accession Number | Sequence Length (aa) | Sequence Identity | Sequence Similarity |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Mammals | Homo sapiens | Primates | Human | 0 | NP_001009894.2 | 325 | 100% | 100% |
| 2 | Mammals | Mus musculus | Rodentia | House Mouse | 90 | NP_780337.2 | 327 | 84% | 90% |
| 3 | Mammals | Ovis aries | Artiodactyla | Sheep | 96 | NP_001186723.1 | 325 | 91% | 96% |
| 4 | Mammals | Phascolarctos cinereus | Diprotodontia | Koala | 159 | XP_020849155.1 | 325 | 86% | 92% |
| 5 | Mammals | Ornithorhynchus anatinus | Monotremata | Platypus | 177 | XP_028934667.1 | 326 | 78% | 89% |
| 6 | Aves | Gallus gallus | Galliformes | Chicken | 312 | XP_040518235.1 | 324 | 75% | 86% |
| 7 | Aves | Cygnus atratus | Anseriformes | Black Swan | 312 | XP_035407728.1 | 324 | 75% | 86% |
| 8 | Aves | Calypte anna | Apodiformes | Anna's Hummingbird | 312 | XP_030312194.1 | 324 | 75% | 87% |
| 9 | Reptiles | Crocodylus porosus | Crocodylia | Australian Saltwater Crocodile | 312 | XP_019400824.1 | 323 | 76% | 88% |
| 10 | Reptiles | Zootoca vivipara | Squamata | Common Lizard | 312 | XP_034983157.1 | 325 | 71% | 85% |
| 11 | Reptiles | Python bivittatus | Squamata | Burmese Python | 312 | XP_007429130.1 | 323 | 70% | 84% |
| 12 | Reptiles | Dermochelys coriacea | Testudines | Leatherback Sea Turtle | 312 | XP_038257651.1 | 325 | 78% | 90% |
| 13 | Amphibians | Bufo bufo | Anura | Common Toad | 351.8 | XP_040264560.1 | 324 | 63% | 77% |
| 14 | Amphibians | Rhinatrema bivittatum | Gymnophiona | Two-lined Caecilian | 351.8 | XP_029457527.1 | 335 | 66% | 82% |
| 15 | Fishes | Danio rerio | Cypriniformes | Zebrafish | 435 | NP_001008606.1 | 325 | 60% | 79% |
| 16 | Fishes | Rhincodon typus | Orectolobiformes | Whale Shark | 473 | XP_020369226.1 | 330 | 56% | 73% |
| 17 | Invertebrates | Styela clava | Stolidobranchia | Asian Tunicate | 676 | XP_039256956.1 | 351 | 44% | 59% |
| 18 | Invertebrates | Branchiostoma floridae | Amphioxiformes | Florida Lancelet | 684 | XP_035678444.1 | 341 | 43% | 62% |
| 19 | Invertebrates | Pecten maximus | Pectinida | King Scallop | 797 | XP_033752428.1 | 347 | 44% | 62% |
| 20 | Invertebrates | Crassostrea gigas | Ostreida | Pacific Oyster | 797 | XP_034327602.1 | 357 | 41% | 59% |

Estimated date of divergence and the corrected divergence of orthologous protein sequences. The average mutation rates of the C12orf29 protein, the cytochrome c protein (slow-evolving), and the fibrinogen alpha chain protein (fast-evolving) are estimated by linear regression.
Unrooted phylogenetic tree of C12orf29 orthologs in 20 vertebrates and invertebrates.
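The regression of corrected divergence on divergence time can be sketched from the ortholog table. The article does not state which correction was applied, so the example below assumes the common Poisson correction, d = -ln(identity), and fits an ordinary least-squares line by hand; the function names and the choice of correction are assumptions for illustration, not the method used to produce the figure.

```python
import math

# (divergence time in MYA, sequence identity in %) copied from the ortholog table.
ORTHOLOGS = [
    (0, 100), (90, 84), (96, 91), (159, 86), (177, 78),
    (312, 75), (312, 75), (312, 75), (312, 76), (312, 71),
    (312, 70), (312, 78), (351.8, 63), (351.8, 66), (435, 60),
    (473, 56), (676, 44), (684, 43), (797, 44), (797, 41),
]


def corrected_divergence(identity_percent):
    """Poisson-corrected distance (assumed correction): substitutions per site."""
    return -math.log(identity_percent / 100.0)


def least_squares(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


times = [t for t, _ in ORTHOLOGS]
distances = [corrected_divergence(identity) for _, identity in ORTHOLOGS]
slope, intercept = least_squares(times, distances)

print(f"slope: {slope:.5f} substitutions per site per million years")
print(f"sheep: {corrected_divergence(91):.3f}, mouse: {corrected_divergence(84):.3f}")
```

The slope is the average rate at which corrected divergence accumulates per million years. Under this correction the sheep ortholog also has a smaller distance to the human protein than the mouse ortholog, in line with the identities in the table.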

In research

In bone

In sheep (Ovis aries) bone, the C12orf29 protein is highly expressed in mandible osteoblast cells (mOB cells) and periodontal ligament cells (PDLCs). Expression is lower but still noticeable in the prostate cancer cell line PC3. The protein is observed in the extracellular matrix (ECM) around mOB cells, which suggests that it is transferred from mOB cells into the ECM and embedded there. C12orf29 has also been found around the mineralization zone of the growth plate and in the calcified cartilage of trabecular bone in rats. C12orf29 is a potential structural protein in skeletal tissue with a role in the extracellular matrix of articular and growth cartilage. It might be decorated with glycosaminoglycans and thus be a proteoglycan. [24]

Cancer

C12orf29 is expressed in the most common subtypes of osteosarcoma (OS), namely the osteoblastic, mixed osteoblastic/chondroblastic, and chondroblastic types, both in patients and in a humanized OS model. It has a role in promoting the development of the musculoskeletal system. C12orf29 is highly expressed in the tumor cells, but its expression is not associated with the proliferation of the tumors. C12orf29 expression in OS located at the jaw or temporal regions differs significantly from that in OS of the extremities or trunk. C12orf29 expression has a strong positive correlation with the expression of the Ki67 gene (a biomarker for tumor proliferation). [25]

KLF9

C12orf29 mRNA expression increases when Krüppel-like factor 9 (KLF9) is suppressed. KLF9 is a transcriptional regulator of uterine endometrial cell proliferation, adhesion, and differentiation, which are essential processes for pregnancy success. KLF9 expression is suppressed during tumorigenesis. [26]


References

  1. GRCh38: Ensembl release 89: ENSG00000133641. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000046567. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "C12orf29 chromosome 12 open reading frame 29 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-09-26.
  6. Gene Ontology Consortium. "AmiGO 2: Gene Product Details for UniProtKB:Q8N999". amigo.geneontology.org. Retrieved 2021-09-26.
  7. "88351102 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-16.
  8. "61688502 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-16.
  9. "39401542 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-16.
  10. "uncharacterized protein C12orf29 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-09-26.
  11. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2021-12-16.
  12. "PSORT: Protein Subcellular Localization Prediction Tool". www.genscript.com. Retrieved 2021-12-16.
  13. "ELM - Search the ELM resource". elm.eu.org. Retrieved 2021-12-16.
  14. "Motif Scan". myhits.sib.swiss. Retrieved 2021-12-16.
  15. "iCn3D: Web-based 3D Structure Viewer". structure.ncbi.nlm.nih.gov. Retrieved 2021-12-16.
  16. "AlphaFold Protein Structure Database". alphafold.ebi.ac.uk. Retrieved 2021-12-16.
  17. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2021-12-16.
  18. "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2021-12-16.
  19. "IntAct Portal". www.ebi.ac.uk. Retrieved 2021-12-16.
  20. "C12orf29 protein (human) - STRING interaction network". string-db.org. Retrieved 2021-12-16.
  21. "C12orf29 orthologs". NCBI. Retrieved 2021-12-16.
  22. "Human BLAT Search". genome.ucsc.edu. Retrieved 2021-12-16.
  23. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2021-12-16.
  24. Friis TE, Stephenson S, Xiao Y, Whitehead J, Hutmacher DW (October 2014). "A polymerase chain reaction-based method for isolating clones from a complimentary [sic] DNA library in sheep". Tissue Engineering. Part C, Methods. 20 (10): 780–789. doi:10.1089/ten.tec.2013.0099. PMC 4186646. PMID 24447069.
  25. Wagner F, Holzapfel BM, McGovern JA, Shafiee A, Baldwin JG, Martine LC, et al. (July 2018). "Humanization of bone and bone marrow in an orthotopic site reveals new potential therapeutic targets in osteosarcoma" (PDF). Biomaterials. 171: 230–246. doi:10.1016/j.biomaterials.2018.04.030. hdl:10072/385762. PMID 29705656. S2CID 19093873.
  26. Simmen FA, Su Y, Xiao R, Zeng Z, Simmen RC (September 2008). "The Krüppel-like factor 9 (KLF9) network in HEC-1-A endometrial carcinoma cells suggests the carcinogenic potential of dys-regulated KLF9 expression". Reproductive Biology and Endocrinology. 6 (1): 41. doi:10.1186/1477-7827-6-41. PMC 2542371. PMID 18783612.