CCDC190

Identifiers
Aliases: CCDC190, C1orf110, coiled-coil domain containing 190
External IDs: MGI: 1925715; HomoloGene: 52274; GeneCards: CCDC190

Orthologs (human / mouse)
RefSeq (mRNA): human NM_178550, NM_001394065; mouse NM_001033185, NM_001347138
RefSeq (protein): human NP_848645; mouse NP_001028357, NP_001334067
Location (UCSC): human Chr 1: 162.82 – 162.87 Mb; mouse Chr 1: 169.76 – 169.76 Mb
PubMed search: human [3]; mouse [4]

Coiled-coil domain containing 190 (CCDC190), also known as C1orf110 (chromosome 1 open reading frame 110) and MGC48998, is a protein-coding gene found widely in vertebrates. [5] [6] RNA-seq expression profiles show that the gene is selectively expressed in human organs such as the lung, brain and heart. [7] Its product, coiled-coil domain-containing protein 190, is 302 amino acids long. The name probably reflects the coiled-coil domain found at positions 14 to 72. [5] In humans, alternative splicing produces at least 6 mRNA splice variants and 3 protein isoforms. [6] [7]


Gene

The mRNA variants of the human CCDC190 gene. Numbers on the exons and introns show the splicing frequency for each mRNA variant.

In the human genome, CCDC190 is located at 1q23.3 with a size of 44,461 bp, from 162838810 to 162792037 on the reverse strand. [7] Its closest neighbors on the long arm of chromosome 1 are the HSD17B7 and DDR2 genes. The gene has 3 to 5 exons, depending on the splicing pattern of the mRNA. [6] Some of its mRNA variants are shown in the accompanying figure, which aligns the alternative mRNAs from 5' to 3'. The colored regions are introns, and exons are located at both ends of each intron. Their lengths can be estimated from the scale bar at the bottom left.

Homology

Although several isoforms of the CCDC190 protein are found in each species, no paralog has been identified for this gene. Orthologs of CCDC190 were identified using BLAST. [8]

Orthologs

The E values for the orthologs range from 0 to 8e-126, and the similarity ranges from 63.1% to 98.7% (excluding the human sequence). [8] The least related sequence in this set is from the red fox ( Vulpes vulpes ). Homologs of CCDC190 are also listed in the NCBI HomoloGene database, where the least related sequence is from the coelacanth ( Latimeria chalumnae ). [9] The gene is therefore present in vertebrates such as fish, amphibians, reptiles and mammals; no CCDC190 ortholog has been identified in invertebrates.
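As a rough illustration of how a percentage such as those above can be derived from a pairwise alignment, the sketch below computes percent identity over ungapped columns. It is a minimal sketch under stated assumptions: the two sequences are hypothetical toy strings, not real CCDC190 sequences, and BLAST's own "identities" and "positives" figures are computed differently (over the full alignment length, and with a substitution matrix for similarity).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (illustrative only, not CCDC190 sequence data).
toy_human = "MKRLE-QLKAEL"
toy_other = "MKKLEAQLRAEL"
identity = percent_identity(toy_human, toy_other)
print(f"{identity:.1f}% identity over ungapped columns")
```

Skipping gap columns is one of several conventions; counting gaps in the denominator would give a lower figure for the same alignment.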

Evolution

Approximate date of divergence from human for a given species (million years ago, MYA) versus the corrected % divergence (m) of that species' orthologous protein. The data and trendlines for cytochrome c and fibrinogen alpha chain are included as references for proteins with low and high divergence, respectively.

The gene first appears in the elephant shark ( Callorhinchus milii ), a lineage that diverged from humans about 473 million years ago. The gene family cannot be determined because no paralogs are known. The most distantly related ortholog is from the coelacanth ( Latimeria chalumnae ), with a sequence identity of 24.8%. [8] The number of alternative splicing variants in this organism is hard to determine because data are lacking. The divergence rate of CCDC190 was estimated by comparison with two reference proteins, cytochrome c and fibrinogen alpha, as shown in the accompanying figure. The CCDC190 trendline lies closer to that of fibrinogen alpha, indicating a relatively high rate of divergence and evolution.
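The article does not state how the corrected % divergence (m) was calculated. One commonly used correction for multiple substitutions at the same site is m = -ln(1 - n) × 100, where n is the observed fraction of differing residues. The sketch below applies that formula to the 24.8% identity reported above; the choice of formula is an assumption, not something confirmed by the cited sources.

```python
import math


def corrected_divergence(percent_identity: float) -> float:
    """Corrected % divergence m = -ln(1 - n) * 100.

    n is the observed fraction of differing residues. This correction
    is an assumption; the article does not say which formula was used.
    """
    if not 0.0 < percent_identity <= 100.0:
        raise ValueError("percent identity must be in (0, 100]")
    observed_difference = 1.0 - percent_identity / 100.0
    return -math.log(1.0 - observed_difference) * 100.0


# Coelacanth ortholog: 24.8% identity to the human protein (from the text).
m = corrected_divergence(24.8)
print(f"corrected divergence m = {m:.1f}")
```

Under this correction, m can exceed 100 for very divergent sequences, because it estimates substitutions per 100 sites rather than the raw percentage of differing positions.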

Proteins

A BLAST search with the human CCDC190 protein sequence finds two isoforms with high similarity, near 99.8%. In addition, the NCBI Protein database lists four isoforms of the CCDC190 protein (isoform 1, isoform X1, isoform 2 and isoform X3), which are derived from transcript variants. [8]

The theoretical isoelectric point and molecular weight of human CCDC190 isoform 1 are 9.62 and 34 kDa, respectively. [10] SAPS analysis indicates that the protein contains more lysine and glutamine than typical human proteins. The combined arginine and lysine content is also higher than normal, indicating a high proportion of basic amino acids; this is supported by the larger excess of basic over acidic residues in CCDC190. Repeats are identified at the start and in the middle of the protein sequence. The protein has no significant charge segments and no hydrophobic or transmembrane domains. A similar pattern is seen in its close orthologs from gorilla and horse. In distant orthologs, significant charge segments and transmembrane segments are likewise absent, but the amino acid composition and the repeat pattern can differ considerably from those of the human protein. [11] Prediction tools indicate that most of the secondary structure is helical; this pattern is conserved in close orthologs but differs in distant ones. The predicted tertiary structure, shown in the accompanying figure, suggests that the protein may have a helix-loop-helix motif capable of interacting with double-stranded DNA, pointing to a potential DNA-binding function for CCDC190. [12]
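The excess of basic over acidic residues described above can be illustrated with a simple residue count. The following is a minimal sketch, not a reimplementation of SAPS: it treats lysine (K) and arginine (R) as basic and aspartate (D) and glutamate (E) as acidic, and the input is a hypothetical toy peptide rather than the real CCDC190 sequence.

```python
from collections import Counter

BASIC = "KR"   # lysine, arginine
ACIDIC = "DE"  # aspartate, glutamate


def charge_composition(sequence: str) -> dict:
    """Count basic and acidic residues and report their difference."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    basic = sum(counts[aa] for aa in BASIC)
    acidic = sum(counts[aa] for aa in ACIDIC)
    return {
        "length": len(seq),
        "basic": basic,
        "acidic": acidic,
        "net_basic": basic - acidic,
        "basic_fraction": basic / len(seq),
    }


# Hypothetical toy peptide (illustrative only, not CCDC190 sequence data).
toy_peptide = "MKRQKLEQKRAQDK"
print(charge_composition(toy_peptide))
```

A positive net_basic value is consistent with a high isoelectric point, such as the 9.62 reported for isoform 1, although a real pI calculation also depends on the pKa values of the termini and side chains.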

Predicted tertiary structure of human CCDC190 with the highest confidence, based on Phyre2 results. Two coiled-coil domains can be identified in the figure. Red regions have the highest confidence and blue regions the lowest.

Regulation

Gene level

Five promoters have been identified for the human CCDC190 gene. They contain binding sites for many different transcription factors, such as nuclear factor kappa B/c-Rel, human and murine ETS1 factors and Krueppel-like transcription factors. [13] In the human body, the gene is selectively expressed in tissues such as the lung, brain and heart. The esophagus, trachea and stomach also show high expression. Overall, CCDC190 is highly expressed in tissues of the human respiratory, digestive and reproductive systems, and is also expressed in the heart and brain. [6]

Transcription level

RNA folding result for the 5' UTR of human CCDC190, based on the RNAfold web server. Highly conserved regions are outlined in the figure; these regions are likely to represent conserved transcription factor binding sites. RNA-binding protein (RBP) sites were identified using the RBPDB database, and miRNA binding sites were found using miRDB.

Multiple sequence alignment reveals conserved regions in the 5' UTR and 3' UTR of human CCDC190 mRNA. These may be binding regions for RNA-binding proteins and miRNAs. [14]

Protein level

Protein-level regulation of human CCDC190 includes protein trafficking directed by signal peptides and post-translational modifications such as sumoylation and acetylation. The potential modification sites and signal peptide regions are labeled in the conceptual translation below, and the localization of CCDC190 is discussed.

Protein localization

PSORT II assigns CCDC190 a 52.2% likelihood of nuclear localization, 26.6% mitochondrial, 17.4% cytoplasmic and 4.3% endoplasmic reticulum (ER). Consistent with the roughly one-quarter mitochondrial share, a cleavage site for a mitochondrial presequence is also predicted for this protein. Nuclear localization signals were identified as well, with an NLS score of 0.13. ER membrane retention signals were identified at the N-terminus and C-terminus. [15]

Conceptual translation

The conceptual translation of the human CCDC190 gene has been annotated. In the figure below, the amino acid positions of exon boundaries, predicted domains, post-translational modification sites and signal peptides are labeled. Conserved regions found by multiple sequence alignment are shown in bold. The legend for each label appears either on the right or in the margin of the translation.

Figure: Conceptual translation of the human CCDC190 gene.

Function

The human CCDC190 protein is predicted to have a coiled-coil domain, which may serve several functions. First, physical properties of the domain such as its length and flexibility make it a good molecular spacer. [16] Length variation in coiled-coil domains is reported to be 3.6 times lower than in other regions, and the domain length in amino acids is relatively conserved across different proteins, indicating that the physical size of the domain may be important for its function. [17] Second, coiled-coil domains are essential for accurate chromosome segregation. During cell division, coiled-coil-containing proteins such as Ska1, Ska2 and Ska3 form a dimer, and mutation of the dimer leads to chromosome congression failure, which can result in cell death. [18] Finally, there is evidence that the domain can play a role in DNA recognition and binding. In a restriction enzyme from Methanococcus jannaschii , subunits containing coiled-coil domains serve as a molecular ruler for recognizing direct repeats in the DNA sequence; a similar pattern is found in the MerR family of bacterial transcriptional activators. [19]

Interacting proteins

The potential interactors of CCDC190 are summarized in the following table:

| Abbreviated name | Full name | Tools that reported the interaction | Basis of identification | Scores or statistical measures | Potential location | Brief function |
|---|---|---|---|---|---|---|
| APP | amyloid beta (A4) precursor protein | ELISA | BioGRID | | | Proteolysis of APP generates neurotoxic Aβ peptide, which is crucial for development of Alzheimer's disease |
| NPM1 | nucleophosmin | crosslinking mass spectrometry | BioGRID | | Nucleus | Involved in diverse cellular processes such as ribosome biogenesis, centrosome duplication, protein chaperoning, histone assembly, cell proliferation, and regulation of tumor suppressors p53/TP53 and ARF. |
| PPIB | peptidylprolyl isomerase B | crosslinking mass spectrometry | BioGRID | | | PPIase that catalyzes the cis-trans isomerization of proline imidic peptide bonds in oligopeptides and may therefore assist protein folding. |
| PPP1R14C | protein phosphatase 1, regulatory (inhibitor) subunit 14C | crosslinking mass spectrometry | BioGRID | | | Inhibitor of the PP1 regulatory subunit PPP1CA. |
| PAQR6 | Membrane progestin receptor delta | Textmining | STRING | 0.721 | | May be involved in regulating rapid P4 signaling in the nervous system. Also binds dehydroepiandrosterone (DHEA), pregnanolone, pregnenolone and allopregnanolone |
| C2orf66 | | Textmining | STRING | 0.669 | | Uncharacterized |
| NPSR1 | Neuropeptide S receptor | Textmining | STRING | 0.625 | | Promotes mobilization of intracellular Ca(2+) stores. Inhibits cell growth in response to NPS binding. Involved in pathogenesis of asthma and other IgE-mediated diseases |
| EYA3 | Eyes absent homolog 3 | Textmining | STRING | 0.574 | Nucleus | Promotes efficient DNA repair by dephosphorylating H2AX, promoting the recruitment of DNA repair complexes containing MDC1. |
| Transmembrane protein 217 | | Textmining | STRING | 0.556 | | Uncharacterized |
| SOX-14 | Transcription factor SOX-14 | Textmining | STRING | 0.536 | | Acts as a negative regulator of transcription |
| KRTAP5-2 | Keratin-associated protein 5-2 | Textmining | STRING | 0.529 | | In the hair cortex, hair keratin intermediate filaments are embedded in an interfilamentous matrix consisting of hair keratin-associated proteins (KRTAP), which are essential for the formation of a rigid and resistant hair shaft through their extensive disulfide bond cross-linking with abundant cysteine residues of hair keratins. |
| TCEB3C | Elongin-A3 | Textmining | STRING | 0.526 | Nucleus | A general transcription elongation factor that increases the RNA polymerase II transcription elongation past template-encoded arresting sites. |
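To put the STRING scores in the table into context, the sketch below groups them into confidence bands. The cutoffs used here (0.15 low, 0.4 medium, 0.7 high, 0.9 highest) are the commonly cited STRING thresholds and are an assumption of this example rather than something stated in the article; the function name confidence_band is likewise hypothetical.

```python
# STRING text-mining scores copied from the table above.
STRING_SCORES = {
    "PAQR6": 0.721,
    "C2orf66": 0.669,
    "NPSR1": 0.625,
    "EYA3": 0.574,
    "Transmembrane protein 217": 0.556,
    "SOX-14": 0.536,
    "KRTAP5-2": 0.529,
    "TCEB3C": 0.526,
}


def confidence_band(score: float) -> str:
    """Map a STRING combined score to a confidence band (assumed cutoffs)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= 0.9:
        return "highest"
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    if score >= 0.15:
        return "low"
    return "below low cutoff"


bands = {name: confidence_band(score) for name, score in STRING_SCORES.items()}
for name, band in bands.items():
    print(f"{name}: {STRING_SCORES[name]:.3f} ({band})")
```

Under these cutoffs only PAQR6 falls in the high band; the remaining text-mining associations are medium confidence and would need experimental support before being treated as interactions.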

Clinical significance

CCDC190 has been linked to several diseases. [7] For example, sequence kernel association tests found that it contributes significantly to cannabis dependence risk, and microarray data show that it is downregulated in nasopharyngeal carcinoma. [20] [21] The gene has also been identified as a biomarker of short photoperiods in cells of the pars tuberalis. [22] The relationship between CCDC190 and these phenotypes still needs further study. Microarray data in the NCBI GEO database show that depletion of the H3K9me2 methyltransferase G9a increases CCDC190 expression in a breast cancer cell line, suggesting that G9a-dependent methylation in breast cancer cells may inhibit CCDC190 expression. [23] [20] In addition, early depletion of αSMA+ myofibroblasts in a mouse model of pancreatic ductal adenocarcinoma lowers CCDC190 expression in the pancreas, suggesting a potential role for this gene in pancreatic cancer pathology. [24]

References

  1. GRCh38: Ensembl release 89: ENSG00000185860 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000070532 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "CCDC190 - Coiled-coil domain-containing protein 190 - Homo sapiens (Human) - CCDC190 gene & protein". www.uniprot.org. Retrieved 2021-06-17.
  6. "CCDC190 coiled-coil domain containing 190 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-06-17.
  7. "AceView: Gene:C1orf110, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". www.ncbi.nlm.nih.gov. Retrieved 2021-06-17.
  8. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2021-07-31.
  9. "HomoloGene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-07-31.
  10. "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2021-07-31.
  11. "SAPS Results". www.ebi.ac.uk. Retrieved 2021-07-31.
  12. "Phyre 2 Results for hsa_c1orf110". www.sbg.bio.ic.ac.uk. Retrieved 2021-07-31.
  13. "Genomatix Suite – OSTR". Retrieved 2021-07-31.
  14. "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2021-07-31.
  15. "PSORT II Prediction". psort.hgc.jp. Retrieved 2021-07-31.
  16. Truebestein L, Leonard TA (September 2016). "Coiled-coils: The long and short of it". BioEssays. 38 (9): 903–16. doi:10.1002/bies.201600062. PMC 5082667. PMID 27492088.
  17. Surkont J, Diekmann Y, Ryder PV, Pereira-Leal JB (December 2015). "Coiled-coil length: Size does matter". Proteins. 83 (12): 2162–9. doi:10.1002/prot.24932. PMID 26387794. S2CID 33374577.
  18. Jeyaprakash AA, Santamaria A, Jayachandran U, Chan YW, Benda C, Nigg EA, Conti E (May 2012). "Structural and functional organization of the Ska complex, a key component of the kinetochore-microtubule interface". Molecular Cell. 46 (3): 274–86. doi:10.1016/j.molcel.2012.03.005. PMID 22483620.
  19. Changela A, Chen K, Xue Y, Holschen J, Outten CE, O'Halloran TV, Mondragón A (September 2003). "Molecular basis of metal-ion selectivity and zeptomolar sensitivity by CueR". Science. 301 (5638): 1383–7. Bibcode:2003Sci...301.1383C. doi:10.1126/science.1085950. PMID 12958362. S2CID 8988176.
  20. Gizer IR, Bizon C, Gilder DA, Ehlers CL, Wilhelmsen KC (January 2018). "Whole genome sequence study of cannabis dependence in two independent cohorts". Addiction Biology. 23 (1): 461–473. doi:10.1111/adb.12489. PMC 5522771. PMID 28111843.
  21. Ye Z, Wang F, Yan F, Wang L, Li B, Liu T, et al. (April 2019). "Bioinformatic identification of candidate biomarkers and related transcription factors in nasopharyngeal carcinoma". World Journal of Surgical Oncology. 17 (1): 60. doi:10.1186/s12957-019-1605-9. PMC 6444505. PMID 30935420.
  22. Wood SH, Christian HC, Miedzinska K, Saer BR, Johnson M, Paton B, et al. (October 2015). "Binary Switching of Calendar Cells in the Pituitary Defines the Phase of the Circannual Cycle in Mammals". Current Biology. 25 (20): 2651–62. doi:10.1016/j.cub.2015.09.014. PMC 4612467. PMID 26412130. S2CID 5995912.
  23. "GDS4800 / 7921909". www.ncbi.nlm.nih.gov. Retrieved 2021-07-31.
  24. "GDS5820 / ILMN_1229202". www.ncbi.nlm.nih.gov. Retrieved 2021-07-31.
