C12orf29 | |
---|---|
Identifiers | |
Aliases | C12orf29, chromosome 12 open reading frame 29, LOC91298, FLJ38158, MGC102978, DKFZp313K0436, DKFZp434N2030, DKFZp686L04169 |
External IDs | MGI: 1921197; HomoloGene: 18409; GeneCards: C12orf29; OMA: C12orf29 - orthologs |
C12orf29 is a protein that in humans is encoded by the chromosome 12 open reading frame 29 (C12orf29) gene. The gene is ubiquitously expressed across tissues. [5] The protein has 325 amino acids. The biological process of C12orf29 has been annotated as hematopoietic progenitor cell differentiation. [6] The molecular and cellular functions of C12orf29 are not yet well understood.
Research suggests that C12orf29 is a potential structural protein in skeletal tissue with a role in the extracellular matrix of articular and growth cartilage. Its expression is increased in osteosarcoma (OS) cells and other tumor cells, and it could serve as a biomarker for detecting osteosarcoma.
The human C12orf29 gene is located on the positive strand at 12q21.32 (p = short arm, q = long arm). It has 7 exons and 6 introns. The gene spans from 88,035,536 to 88,050,160 with 14,645 base pairs. The gene has only one known transcript isoform. [5]
The genes neighboring human C12orf29 are C12orf50 (-), RNA5SP364 (+), LOC107984542 (-), LOC100420011 (+), and CEP290 (-), where the sign in parentheses gives the strand.
The C12orf29 gene is ubiquitously expressed in 27 tissues. [5] Expression is slightly higher in the esophagus, skin, brain, and bone marrow. [5] Expression is increased in several tumor cells and tissues, including colorectal tumor tissue, [7] ovarian cancer epithelial cells, [8] and epithelial cells of hyperplastic enlarged lobular units. [9]
The C12orf29 protein has 325 amino acids. [10] Its molecular weight is 37.5 kDa, and its isoelectric point is 6.6. [11] It is not a membrane protein, and it is located in the cytosol. [12] Compared with other human proteins, C12orf29 is high in asparagine and histidine and low in alanine. [11] There are no validated domains or motifs. [10] The protein contains a short repeated sequence, "INGNP", but this repeat is not conserved in orthologs. [11] There are, however, two highly conserved predicted motifs: a caspase-3 and caspase-7 cleavage motif and a mitogen-activated protein kinase (MAPK) docking motif. [13] [14] Several protein kinase C and casein kinase II phosphorylation sites are also predicted, but they are not conserved in orthologs. [13] [14]
1 MKRLGSVQRK MPCVFVTEVK EEPSSKREHQ PFKVLATETV SHKALDADIY SAIPTEKVDG
61 TCCYVTTYKD QPYLWARLDR KPNKQAEKRF KNFLHSKENP KEFFWNVEED FKPAPECWIP
121 AKETEQINGN PVPDENGHIP GWVPVEKNNK QYCWHSSVVN YEFEIALVLK HHPDDSGLLE
181 ISAVPLSDLL EQTLELIGTN INGNPYGLGS KKHPLHLLIP HGAFQIRNLP SLKHNDLVSW
241 FEDCKEGKIE GIVWHCSDGC LIKVHRHHLG LCWPIPDTYM NSRPVIINMN LNKCDSAFDI
301 KCLFNHFLKI DNQKFVRLKD IIFDV
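The sequence listing can be checked against the properties stated in the text. The following is a minimal Python sketch, not the method used by the cited sources: it strips the position markers from the listing, reports the sequence length, locates each occurrence of the "INGNP" repeat, and reports the percentage of asparagine, histidine, and alanine. The function names (`clean_sequence`, `motif_positions`, `composition_percent`) are illustrative choices and do not come from any referenced tool.

```python
import re
from collections import Counter

# Sequence copied from the listing above, position markers and spacing included.
RAW = """
1 MKRLGSVQRK MPCVFVTEVK EEPSSKREHQ PFKVLATETV SHKALDADIY SAIPTEKVDG
61 TCCYVTTYKD QPYLWARLDR KPNKQAEKRF KNFLHSKENP KEFFWNVEED FKPAPECWIP
121 AKETEQINGN PVPDENGHIP GWVPVEKNNK QYCWHSSVVN YEFEIALVLK HHPDDSGLLE
181 ISAVPLSDLL EQTLELIGTN INGNPYGLGS KKHPLHLLIP HGAFQIRNLP SLKHNDLVSW
241 FEDCKEGKIE GIVWHCSDGC LIKVHRHHLG LCWPIPDTYM NSRPVIINMN LNKCDSAFDI
301 KCLFNHFLKI DNQKFVRLKD IIFDV
"""


def clean_sequence(raw: str) -> str:
    """Drop position numbers and whitespace, keeping only residue letters."""
    return re.sub(r"[^A-Za-z]", "", raw).upper()


def motif_positions(seq: str, motif: str) -> list[int]:
    """Return the 1-based start position of every match, overlaps included."""
    pattern = f"(?={re.escape(motif)})"  # lookahead so overlapping hits are kept
    return [m.start() + 1 for m in re.finditer(pattern, seq)]


def composition_percent(seq: str, residues: str) -> dict[str, float]:
    """Percentage of the sequence made up by each requested residue."""
    counts = Counter(seq)
    return {r: 100 * counts[r] / len(seq) for r in residues}


seq = clean_sequence(RAW)
print("length:", len(seq))
print("INGNP starts at:", motif_positions(seq, "INGNP"))
print("N/H/A percent:", composition_percent(seq, "NHA"))
```

The length should come out at 325 residues, matching the figure given above. The composition percentages describe this sequence only; judging whether they are high or low relative to other human proteins, as the text states, would require a reference proteome that this sketch does not include.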
NCL and LRRK2 have been experimentally determined to interact with C12orf29: NCL physically interacts with C12orf29, and LRRK2 is associated with it. [19] Several further protein interactions are predicted. C12orf50, C1orf186/RHEX, TOR4A, and FAM171A1 are associated with C12orf29 by text mining, but the specific type of interaction has not yet been studied. [20] Interactions with PNN, OTUD6B, and ACTR6 are predicted by co-expression. [20]
Orthologs of the C12orf29 protein are found in mammals, birds, reptiles, amphibians, fishes, and invertebrates. [21] The length and content of the protein sequence are highly conserved in the selected orthologs (table below). [18] Among invertebrates, the C12orf29 protein has been found only in mollusks, tunicates, and cephalochordates; it has not been found in insects, arachnids, crustaceans, corals, worms, jellyfishes, or sponges. [17] [22] The protein has also not been found in bacteria, fungi, or viruses. [17] C12orf29 has no paralogs. [17] The mutation rate of C12orf29 is close to that of the fibrinogen alpha chain. The human C12orf29 protein is more closely related to the sheep ortholog than to the mouse ortholog. The invertebrate orthologs are the most distantly related to the human protein.
Sequence Number | Ortholog Group | Genus and Species | Taxonomic Group | Common Name | Time Since Divergence (Estimated MYA) | Accession Number | Sequence Length (aa) | Sequence Identity | Sequence Similarity |
---|---|---|---|---|---|---|---|---|---|
1 | Mammals | Homo sapiens | Primates | Human | 0 | NP_001009894.2 | 325 | 100% | 100% |
2 | Mammals | Mus musculus | Rodentia | House Mouse | 90 | NP_780337.2 | 327 | 84% | 90% |
3 | Mammals | Ovis aries | Artiodactyla | Sheep | 96 | NP_001186723.1 | 325 | 91% | 96% |
4 | Mammals | Phascolarctos cinereus | Diprotodontia | Koala | 159 | XP_020849155.1 | 325 | 86% | 92% |
5 | Mammals | Ornithorhynchus anatinus | Monotremata | Platypus | 177 | XP_028934667.1 | 326 | 78% | 89% |
6 | Aves | Gallus gallus | Galliformes | Chicken | 312 | XP_040518235.1 | 324 | 75% | 86% |
7 | Aves | Cygnus atratus | Anseriformes | Black Swan | 312 | XP_035407728.1 | 324 | 75% | 86% |
8 | Aves | Calypte anna | Apodiformes | Anna's Hummingbird | 312 | XP_030312194.1 | 324 | 75% | 87% |
9 | Reptiles | Crocodylus porosus | Crocodylia | Australian Saltwater Crocodile | 312 | XP_019400824.1 | 323 | 76% | 88% |
10 | Reptiles | Zootoca vivipara | Squamata | Common Lizard | 312 | XP_034983157.1 | 325 | 71% | 85% |
11 | Reptiles | Python bivittatus | Squamata | Burmese Python | 312 | XP_007429130.1 | 323 | 70% | 84% |
12 | Reptiles | Dermochelys coriacea | Testudines | Leatherback Sea Turtle | 312 | XP_038257651.1 | 325 | 78% | 90% |
13 | Amphibians | Bufo bufo | Anura | Common Toad | 351.8 | XP_040264560.1 | 324 | 63% | 77% |
14 | Amphibians | Rhinatrema bivittatum | Gymnophiona | Two-lined Caecilian | 351.8 | XP_029457527.1 | 335 | 66% | 82% |
15 | Fishes | Danio rerio | Cypriniformes | Zebrafish | 435 | NP_001008606.1 | 325 | 60% | 79% |
16 | Fishes | Rhincodon typus | Orectolobiformes | Whale Shark | 473 | XP_020369226.1 | 330 | 56% | 73% |
17 | Invertebrates | Styela clava | Stolidobranchia | Asian Tunicate | 676 | XP_039256956.1 | 351 | 44% | 59% |
18 | Invertebrates | Branchiostoma floridae | Amphioxiformes | Florida Lancelet | 684 | XP_035678444.1 | 341 | 43% | 62% |
19 | Invertebrates | Pecten maximus | Pectinida | King Scallop | 797 | XP_033752428.1 | 347 | 44% | 62% |
20 | Invertebrates | Crassostrea gigas | Ostreida | Pacific Oyster | 797 | XP_034327602.1 | 357 | 41% | 59% |
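The relationships described for the ortholog table can be explored with a short script. The following Python sketch is only an illustration built on the table's values, not the analysis behind the cited statements: it compares the sheep and mouse identities, averages identity within each ortholog group, and computes a Pearson correlation between estimated divergence time and sequence identity. The helper names (`pearson`, `mean_identity_by_group`) are hypothetical.

```python
from math import sqrt

# (common name, ortholog group, divergence time in MYA, % identity to human),
# transcribed from the table above.
ORTHOLOGS = [
    ("Human", "Mammals", 0, 100),
    ("House Mouse", "Mammals", 90, 84),
    ("Sheep", "Mammals", 96, 91),
    ("Koala", "Mammals", 159, 86),
    ("Platypus", "Mammals", 177, 78),
    ("Chicken", "Aves", 312, 75),
    ("Black Swan", "Aves", 312, 75),
    ("Anna's Hummingbird", "Aves", 312, 75),
    ("Australian Saltwater Crocodile", "Reptiles", 312, 76),
    ("Common Lizard", "Reptiles", 312, 71),
    ("Burmese Python", "Reptiles", 312, 70),
    ("Leatherback Sea Turtle", "Reptiles", 312, 78),
    ("Common Toad", "Amphibians", 351.8, 63),
    ("Two-lined Caecilian", "Amphibians", 351.8, 66),
    ("Zebrafish", "Fishes", 435, 60),
    ("Whale Shark", "Fishes", 473, 56),
    ("Asian Tunicate", "Invertebrates", 676, 44),
    ("Florida Lancelet", "Invertebrates", 684, 43),
    ("King Scallop", "Invertebrates", 797, 44),
    ("Pacific Oyster", "Invertebrates", 797, 41),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_identity_by_group(rows):
    """Average % identity to the human protein within each ortholog group."""
    grouped = {}
    for _name, group, _mya, identity in rows:
        grouped.setdefault(group, []).append(identity)
    return {group: sum(vals) / len(vals) for group, vals in grouped.items()}


identity = {name: ident for name, _group, _mya, ident in ORTHOLOGS}
print("sheep closer than mouse:", identity["Sheep"] > identity["House Mouse"])
print("mean identity by group:", mean_identity_by_group(ORTHOLOGS))

r = pearson([row[2] for row in ORTHOLOGS], [row[3] for row in ORTHOLOGS])
print("correlation of divergence time with identity:", round(r, 3))
```

A strongly negative correlation here only indicates that identity to the human protein falls as divergence time grows in this sample of 20 species. It is not a substitute for a phylogenetic analysis.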
In sheep (Ovis aries) bone, the C12orf29 protein is highly expressed in mandibular osteoblasts (mOB cells) and periodontal ligament cells (PDLCs). It has relatively low but still noticeable expression in the prostate cancer cell line PC3. The protein is observed in the extracellular matrix (ECM) around mOB cells, which suggests that it is transported from mOB cells into the ECM and embedded there. C12orf29 has also been detected around the mineralization zone of the growth plate and in the calcified cartilage of trabecular bone in rats. C12orf29 is a potential structural protein in skeletal tissue with a role in the extracellular matrix of articular and growth cartilage. It may be decorated with glycosaminoglycans, which would make it a potential proteoglycan. [24]
C12orf29 is expressed in the most common subtypes of osteosarcoma (OS), namely the osteoblastic, mixed osteoblastic/chondroblastic, and chondroblastic types, both in patients and in a humanized OS model. It has a role in promoting the development of the musculoskeletal system. C12orf29 is highly expressed in the tumor cells, but its expression is reported not to be associated with tumor proliferation. C12orf29 expression in OS located in the jaw or temporal region differs statistically significantly from that in OS of the extremities or trunk. C12orf29 expression has a strong positive correlation with expression of the Ki67 gene (a biomarker for tumor proliferation). [25]
C12orf29 mRNA expression increases when Krüppel-like factor 9 (KLF9) is suppressed. KLF9 is a transcriptional regulator of uterine endometrial cell proliferation, adhesion, and differentiation, which are essential processes for pregnancy success. KLF9 expression is suppressed during tumorigenesis. [26]