C10orf67

Identifiers
Aliases: C10orf67, C10orf115, LINC01552, bA215C7.4, chromosome 10 open reading frame 67
External IDs: HomoloGene: 82326; GeneCards: C10orf67
Orthologs (Human | Mouse)
RefSeq (mRNA): NM_153714, NM_001351306, NM_001365862, NM_001371909 | n/a
RefSeq (protein): NP_714925, NP_001338235, NP_001352791, NP_001358838 | n/a
Location (UCSC): Chr 10: 23.2 – 23.34 Mb | n/a
PubMed search: [2] | n/a

Chromosome 10 open reading frame 67 (C10orf67), also known as C10orf115, LINC01552, and bA215C7.4, is an uncharacterized human protein-coding gene. Several studies indicate a possible link between genetic polymorphisms of this gene, along with several others, and chronic inflammatory barrier diseases such as Crohn's disease and sarcoidosis. [3] [4] [5]

Gene

A map of Chromosome 10 with the location of C10orf67 marked in red

The gene spans 142,366 base pairs and is located at the 10p12.2 locus on the minus (-) or sense strand of chromosome 10. It is flanked upstream by the gene ARMC3 [6] and downstream by the gene KIAA1217. [7] [8] These genes are approximately 150,000 bp and 350,000 bp from C10orf67, respectively.

This segment depicts approximately 1,700,000 base pairs of chromosome 10. The green lines indicate the start of transcription while the red diamonds indicate the termination of transcription. C10orf67 is transcribed in the opposite direction of its flanking genes, which are located on the anti-sense strand.

Transcript

There are 23 alternatively spliced exons, which encode 13 transcript variants. The primary transcript, only 2943 bp long, is not well conserved among orthologs; rather, the X2 variant, at 3417 bp, encodes a protein with far greater identity to orthologous proteins. This X2 transcript variant contains 15 exons, which yield a polypeptide of 551 amino acids. [9] [10]
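
As a rough illustration of how the two figures for the X2 variant relate, the sketch below works out how many nucleotides are needed to encode 551 amino acids and how much of the 3417 bp transcript that leaves over. It assumes the standard three nucleotides per codon plus a single stop codon, and it assumes that the remainder is untranslated sequence; the source does not give the actual UTR lengths.

```python
# Figures quoted in the text for the X2 transcript variant.
TRANSCRIPT_LENGTH_BP = 3417
PROTEIN_LENGTH_AA = 551


def coding_length_bp(protein_length_aa: int) -> int:
    """Nucleotides needed to encode a protein: 3 per codon, plus one stop codon."""
    return 3 * (protein_length_aa + 1)


cds_bp = coding_length_bp(PROTEIN_LENGTH_AA)
# Assumption: everything outside the coding sequence is untranslated (5' and 3' UTR).
untranslated_bp = TRANSCRIPT_LENGTH_BP - cds_bp

print(cds_bp, untranslated_bp)
```

Under these assumptions, a little under half of the X2 transcript is coding sequence.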

Protein

General properties

Property | Preprotein | Cleaved protein | Mature protein
Amino acid length | 551 | 515 | 515
Isoelectric point | 9.3 | 8.6 | 8.3-8.9*
Molecular weight | 63 kDa | 59 kDa | ~59-61 kDa**

*Depending on post-translational modifications (PTMs)

**Range from no PTMs to all possible PTMs

The isoelectric point is significantly greater than average for human proteins (6.81). [11]
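
The molecular weights in the table can be sanity-checked against chain length. The sketch below uses the common rule of thumb of roughly 110 Da per amino acid residue; that average is an assumption introduced here, not the method used to obtain the tabulated values, which depend on the actual sequence composition.

```python
# Rule-of-thumb average mass per residue (an assumption, not taken from the source).
AVERAGE_RESIDUE_MASS_DA = 110

# Lengths and reported weights as given in the general properties table.
LENGTHS_AA = {"preprotein": 551, "cleaved protein": 515}
REPORTED_KDA = {"preprotein": 63, "cleaved protein": 59}


def estimate_mass_kda(length_aa: int) -> float:
    """Approximate molecular weight in kDa from chain length alone."""
    return length_aa * AVERAGE_RESIDUE_MASS_DA / 1000


estimates = {name: estimate_mass_kda(n) for name, n in LENGTHS_AA.items()}
for name, estimate in estimates.items():
    print(f"{name}: ~{estimate:.1f} kDa estimated, {REPORTED_KDA[name]} kDa reported")
```

The length-only estimates fall a few kilodaltons below the reported values, which is within what a composition-blind approximation can be expected to give.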

Predicted tertiary structure of C10orf67 generated by software. Based on a protein template covering 74% of the protein sequence with 96% identity.

Structure

The predicted tertiary structure of the protein is marked by long alpha-helices, with several coil regions and beta strands localized to the end of the folded protein opposite the N- and C-termini.

Expression

Expression of C10orf67 in various tissues.

C10orf67 is moderately expressed (50-75%) in most tissues in the body. [13] However, an NCBI GEO dataset examining the influence of interleukin-13 (IL-13) on gene expression [14] showed that expression of C10orf67 in airway epithelia dropped to zero in the presence of IL-13.

Subcellular localization

The protein contains a mitochondrial signal peptide localizing it to the mitochondrial matrix. [15] Analysis with subcellular localization software [16] [17] supported this prediction, although some orthologs were also predicted to localize to the nucleus. The high isoelectric point of the human protein provides further evidence for mitochondrial localization, given the high pH of the mitochondrial matrix.

Post-translational modifications

Cleavage sites

The protein is initially cleaved to remove the 36 amino acid N-terminal signal peptide after it is localized to the mitochondrion. [18]
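
The cleavage step can be sketched as a simple split of the preprotein at residue 36. The function name and the placeholder sequence below are hypothetical; the real amino acid sequence is not reproduced in this article, so a dummy string of the stated length (551 residues) stands in for it.

```python
SIGNAL_PEPTIDE_LENGTH = 36   # N-terminal signal peptide length stated in the text
PREPROTEIN_LENGTH = 551      # preprotein length stated in the text


def cleave_signal_peptide(sequence: str, signal_length: int = SIGNAL_PEPTIDE_LENGTH):
    """Split a preprotein into (signal peptide, cleaved protein)."""
    return sequence[:signal_length], sequence[signal_length:]


# Hypothetical stand-in for the real sequence: only its length matters here.
placeholder_preprotein = "X" * PREPROTEIN_LENGTH

signal_peptide, cleaved_protein = cleave_signal_peptide(placeholder_preprotein)
print(len(signal_peptide), len(cleaved_protein))
```

Removing 36 residues from the 551-residue preprotein leaves the 515-residue cleaved protein listed under general properties.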

Phosphorylation

The possible phosphorylation sites of C10orf67. The concentration of possible phosphorylation sites is far greater near the C-terminus of the protein and far lower near the N-terminus, which contains the signal peptide.

Although a number of phosphorylation sites are predicted, one site, at threonine 69, has been experimentally confirmed. [19] The other phosphorylation sites are summarized in the protein diagram below.

Sumoylation

There are five predicted sumoylation sites within C10orf67. These are summarized in the following table:

No. | Pos. | Group | Score
1 | K461 | NSFHV LKNE MFTRH | 0.91
2 | K401 | MPKKA LKED QAVVE | 0.91
3 | K224 | EVIKE LKEE LDQYK | 0.91
4 | K136 | KFEDR LKEE SLS L | 0.91
5 | K130 | KQLLQ LKFE DRLKE | 0.91

Post-translational modifications of C10orf67. The N-terminus is on the left with the 36 amino acid signal peptide and the C-terminus is on the right.
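
The middle four residues of each group in the sumoylation table fit the widely cited consensus motif ψ-K-x-(D/E), where ψ is a large hydrophobic residue and K is the modified lysine. The sketch below checks this with a regular expression. The choice of V, I, L, M and F for ψ is an assumption made here, the peptide windows are copied from the table with spaces removed (including the gap in the K136 row, which is kept as extracted), and this is not a reconstruction of the prediction software that produced the scores.

```python
import re

# Peptide windows from the sumoylation table, spaces removed.
# The K136 window is shorter because the extracted table row has a gap in it.
WINDOWS = {
    "K461": "NSFHVLKNEMFTRH",
    "K401": "MPKKALKEDQAVVE",
    "K224": "EVIKELKEELDQYK",
    "K136": "KFEDRLKEESLSL",
    "K130": "KQLLQLKFEDRLKE",
}

# Consensus psi-K-x-(D/E); the hydrophobic set [VILMF] is an assumption.
SUMO_MOTIF = re.compile(r"[VILMF]K.[DE]")


def first_motif(window: str):
    """Return (offset, matched residues) of the first consensus hit, or None."""
    match = SUMO_MOTIF.search(window)
    return (match.start(), match.group()) if match else None


hits = {site: first_motif(window) for site, window in WINDOWS.items()}
for site, hit in hits.items():
    print(site, hit)
```

Each window yields its first hit at offset 5, so the lysine named in the Pos. column is the seventh residue of its window.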

Homology and evolution

Evolution

C10orf67 has no known paralogs but has many orthologs within eukaryotes and retains significant identity with species as distantly related as invertebrates. Several select orthologs are listed below with some identifying information.

Genus and Species | Common Name | Organism Type | Time Since Last Common Ancestor (million years ago) | Accession # (NCBI) | Sequence Length | % Identity | Isoelectric Point (pre-protein)
Homo sapiens | Humans | Primate | 0 | XP_016871518 | 551 | 100 | 9.3
Pan troglodytes | Chimpanzee | Primate | 6.65 | XP_009456334 | 573 | 95 | 9.27
Macaca nemestrina | Southern pig-tailed macaque | Primate | 29.44 | XP_011736768 | 572 | 88.1 | 9.17
Bubalus bubalis | Water Buffalo | Mammal | 96 | XP_006080042 | 565 | 56.6 | 6.24
Felis catus | Cat | Mammal | 96 | XP_019689630 | 560 | 55.1 | 7.68
Sus scrofa | Wild Boar | Mammal | 96 | XP_013835714 | 515 | 55 | 6.53
Panthera pardus | Leopard | Mammal | 96 | XP_019316071 | 504 | 53.9 | 6.24
Ovis aries | Sheep | Mammal | 96 | XP_012043724 | 516 | 53.6 | 6.61
Mustela putorius furo | Ferret | Mammal | 96 | XP_012914379 | 566 | 50.8 | 9.34
Castor canadensis | Beaver | Mammal | 90 | XP_020038711 | 617 | 44 | 8.92
Mus musculus | Mouse | Mammal | 90 | NP_081876 | 560 | 43.6 | 5.89
Myotis lucifugus | Little Brown Bat | Mammal | 96 | XP_014316001 | 598 | 38.9 | 6.22
Myotis brandtii | Brandt's bat | Mammal | 96 | XP_014394869 | 639 | 38.3 | 6.7
Elephantulus edwardii | Cape elephant shrew | Mammal | 105 | XP_006887164 | 493 | 37.9 | 5.62
Gallus gallus | Chicken | Bird | 312 | XP_003640687 | 430 | 26.3 | 5.44
Astyanax mexicanus | Mexican Tetra | Fish | 435 | XP_007253068 | 475 | 26.1 | 4.76
Lepisosteus oculatus | Spotted Gar | Fish | 435 | XP_015208957 | 479 | 25.2 | 6.73
Danio rerio | Zebrafish | Fish | 435 | XP_698346 | 461 | 24.5 | 5.93
Salmo salar | Atlantic Salmon | Fish | 435 | XP_013995887 | 455 | 21.6 | 6.18
Amphimedon queenslandica | Reniera | Invertebrate | 951.8 | XP_011402872 | 513 | 24.1 | 7.05
Branchiostoma belcheri | Branchiostoma | Invertebrate | 684 | XP_019645941 | 563 | 23.5 | 6.24

Evolution

The rate of evolution of C10orf67 relative to Fibrinogen and Cytochrome c.

The rate of evolution of C10orf67 was compared to that of fibrinogen and cytochrome c, which represent fast and slow rates of evolution, respectively. The bolded species in the table were selected to represent the fibrinogen and cytochrome c orthologs to determine the rate of evolution of the respective proteins.

The rate of evolution of C10orf67 is unusual in that it follows a logarithmic trend rather than the linear trend followed by most proteins.
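
One way to examine the logarithmic-versus-linear claim is to fit both models to the values in the ortholog table. The sketch below is an illustration only: it uses 100 minus percent identity as a simple, uncorrected stand-in for sequence divergence (an assumption, since the source does not state how divergence was calculated for its graph), and it compares the coefficient of determination of a straight-line fit against divergence time with that of a fit against the natural logarithm of divergence time.

```python
import math

# (million years since last common ancestor, % identity to the human protein),
# copied from the ortholog table. Human (0 MYA) is omitted because ln(0) is undefined.
ORTHOLOGS = [
    (6.65, 95.0), (29.44, 88.1), (96, 56.6), (96, 55.1), (96, 55.0),
    (96, 53.9), (96, 53.6), (96, 50.8), (90, 44.0), (90, 43.6),
    (96, 38.9), (96, 38.3), (105, 37.9), (312, 26.3), (435, 26.1),
    (435, 25.2), (435, 24.5), (435, 21.6), (951.8, 24.1), (684, 23.5),
]


def r_squared(xs, ys):
    """Coefficient of determination of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


times = [t for t, _ in ORTHOLOGS]
# Assumption: uncorrected divergence, taken as 100 minus percent identity.
divergence = [100.0 - identity for _, identity in ORTHOLOGS]

r2_linear = r_squared(times, divergence)
r2_log = r_squared([math.log(t) for t in times], divergence)

print(f"divergence vs time:     R^2 = {r2_linear:.2f}")
print(f"divergence vs ln(time): R^2 = {r2_log:.2f}")
```

A higher R² for the second fit would be consistent with the logarithmic trend described above, although a comparison on uncorrected identities from a single table is only indicative.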

Clinical significance

Sarcoidosis

While the function of C10orf67 is unknown, the effect of IL-13 on its expression further suggests a role for C10orf67 in sarcoidosis, as the disease is known to involve various interleukins.

Cancer

While several NCBI GEO profiles examining various factors on gene expression show that C10orf67 is expressed in varying levels in different cancer tissues, [20] [21] the mitochondrial localization may yield some insight as to a clinical function. Mitochondria have been shown to have some influence in cell proliferation. Given the high energy demand from cell proliferation, there have been several hypotheses that the mitochondria may play a role in the cell cycle and that C10orf67, being localized to the mitochondria, may have a hand in this as well.

References

  1. GRCh38: Ensembl release 89: ENSG00000179133 - Ensembl, May 2017
  2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  3. Thiébaut R, Esmiol S, Lecine P, Mahfouz B, Hermant A, Nicoletti C, Parnis S, Perroy J, Borg JP, Pascoe L, Hugot JP, Ollendorff V (2016-01-01). "Characterization and Genetic Analyses of New Genes Coding for NOD2 Interacting Proteins". PLOS ONE. 11 (11): e0165420. Bibcode:2016PLoSO..1165420T. doi:10.1371/journal.pone.0165420. PMC 5094585. PMID 27812135.
  4. Cozier YC, Ruiz-Narvaez EA, McKinnon CJ, Berman JS, Rosenberg L, Palmer JR (October 2012). "Fine-mapping in African-American women confirms the importance of the 10p12 locus to sarcoidosis". Genes and Immunity. 13 (7): 573–8. doi:10.1038/gene.2012.42. PMC 3475762. PMID 22972473.
  5. Franke A, Fischer A, Nothnagel M, Becker C, Grabe N, Till A, Lu T, Müller-Quernheim J, Wittig M, Hermann A, Balschun T, Hofmann S, Niemiec R, Schulz S, Hampe J, Nikolaus S, Nürnberg P, Krawczak M, Schürmann M, Rosenstiel P, Nebel A, Schreiber S (October 2008). "Genome-wide association analysis in sarcoidosis and Crohn's disease unravels a common susceptibility locus on 10p12.2". Gastroenterology. 135 (4): 1207–15. doi:10.1053/j.gastro.2008.07.017. PMID 18723019.
  6. "ARMC3 armadillo repeat containing 3 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  7. "KIAA1217 KIAA1217 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  8. "C10orf67 chromosome 10 open reading frame 67 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-04-30.
  9. "Homo sapiens chromosome 10 open reading frame 67 (C10orf67), mRNA". www.ncbi.nlm.nih.gov. Retrieved 2017-02-05.
  10. GeneCards Human Gene Database. "C10orf67 Gene - GeneCards | CJ067 Protein | CJ067 Antibody". www.genecards.org. Retrieved 2017-02-06.
  11. Kozlowski, Lukasz P. "Proteome-pI - Proteome Isoelectric Point Database statistics". isoelectricpointdb.org. Retrieved 2017-04-30.
  12. Kelley, Lawrence. "PHYRE2 Protein Fold Recognition Server". www.sbg.bio.ic.ac.uk. Retrieved 2017-05-05.
  13. "GDS4794 / 1553845_x_at". www.ncbi.nlm.nih.gov. Retrieved 2017-04-30.
  14. "GDS4981 / ILMN_1719577". www.ncbi.nlm.nih.gov. Retrieved 2017-04-30.
  15. "uncharacterized protein C10orf67, mitochondrial [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-05-05.
  16. "PSORT II Prediction". psort.hgc.jp. Retrieved 2017-05-05.
  17. "MitoFates". mitf.cbrc.jp. Retrieved 2017-05-05.
  18. "WoLF PSORT: Protein Subcellular Localization Prediction". wolfpsort.hgc.jp. Retrieved 2017-04-30.
  19. "Thr69". www.phosphosite.org. Retrieved 2017-04-30.
  20. "GDS4080 / 1553844_a_at". www.ncbi.nlm.nih.gov. Retrieved 2017-05-06.
  21. "GDS1807 / 1553843_at". www.ncbi.nlm.nih.gov. Retrieved 2017-05-06.