CCDC121

Identifiers
Aliases: CCDC121, coiled-coil domain containing 121
External IDs: MGI: 2685601; HomoloGene: 136217; GeneCards: CCDC121
Orthologs (human; mouse)
RefSeq (mRNA): human NM_001142682, NM_001142683, NM_024584; mouse NM_207280
RefSeq (protein): human NP_001136155, NP_078860; mouse n/a
Location (UCSC): human Chr 2: 27.63 – 27.63 Mb; mouse Chr 1: 181.34 – 181.34 Mb
PubMed search: human [3]; mouse [4]

Coiled-coil domain containing 121 (CCDC121) is a protein encoded by the CCDC121 gene in humans. CCDC121 is located on the minus strand of chromosome 2 and encodes three protein isoforms. [5] All isoforms of CCDC121 contain a domain of unknown function referred to as DUF4515 or pfam14988.

Gene

Aliases, locus and size

Known aliases of CCDC121 include FLJ43364, FLJ13646, hCG_1988995, LOC79635, and coiled-coil domain containing 121. [6]

CCDC121 is located on the minus strand of chromosome 2 at 2p23.3. It is 3,394 base pairs in length. [5]

Isoforms and alternative splicing

CCDC121 produces four different mRNAs: three alternatively spliced variants and one unspliced form. [7] The three alternatively spliced mRNAs give rise to three known protein isoforms. The transcripts for isoforms 1, 2, and 3 are 2,880, 2,762, and 2,361 base pairs in length, respectively. [8] [9] [10] Each of the mRNA variants contains two exons separated by a gt-ag intron. [5]

Protein

Primary sequence, molecular weight and pI

Protein | Accession Number | Length (amino acids) | Molecular Weight (kDa) | Predicted pI
Isoform 1 [11] | XP_005264617 | 442 | 50.8 | 9.80
Isoform 2 [12] | NP_001136155 | 440 | 50.9 | 9.81
Isoform 3 [13] | NP_078860 | 278 | 33.1 | 9.84

Molecular weight and pI were calculated using ExPASy. [14]
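As a rough illustration of how an average molecular weight like the values above can be derived from a primary sequence, the following minimal sketch sums standard average residue masses and adds one water molecule for the free termini. It is an assumption about the general approach, not the ExPASy implementation, and the peptide used is a hypothetical toy sequence rather than any CCDC121 isoform. The pI calculation is omitted because it depends on a pKa table that differs between tools.

```python
# Average residue masses in daltons (each amino acid minus one water molecule).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini


def molecular_weight_kda(sequence: str) -> float:
    """Return the average molecular weight of an unmodified peptide in kDa."""
    residues = sequence.upper()
    total_da = sum(AVERAGE_RESIDUE_MASS[aa] for aa in residues) + WATER_MASS
    return total_da / 1000.0


# Hypothetical toy peptide for illustration only; NOT the CCDC121 sequence.
toy_peptide = "MKRQLEELKRK"
print(f"{molecular_weight_kda(toy_peptide):.3f} kDa")
```

To approximate the tabulated values, the full isoform sequence from the corresponding NCBI Protein record would be passed in place of the toy peptide. Post-translational modifications are not accounted for.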

Compositional analysis

Compositional analysis of all isoforms shows that they have below-average levels of aspartate (D) and valine (V) and above-average levels of glutamine (Q). In addition, they have above-average levels of lysine (K) and arginine (R) groupings. Isoform 3 also exhibits above-average lysine levels and below-average proline and glycine levels. Chimpanzee, dog, and ferret orthologs also exhibited above-average glutamine levels and lysine and arginine groupings. [15]
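The first step of a compositional analysis, tallying how often each residue occurs, can be sketched as below. This is a simplified assumption about the idea only: the statistical tool cited above additionally compares frequencies against a reference set and evaluates residue clusters, which this sketch does not attempt. The sequence shown is a hypothetical placeholder, not CCDC121.

```python
from collections import Counter


def composition_percent(sequence: str) -> dict[str, float]:
    """Return the percentage of each residue type in a protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: 100.0 * n / total for aa, n in sorted(counts.items())}


# Hypothetical toy sequence for illustration only; NOT the CCDC121 sequence.
toy_sequence = "MKRQQLEEQKRKQ"
for residue, percent in composition_percent(toy_sequence).items():
    print(f"{residue}: {percent:.1f}%")
```

Whether a given percentage counts as above or below average depends on the reference proteome chosen for comparison.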

Secondary and tertiary structure

The secondary structure prediction for CCDC121 was obtained using Ali2D. [16] CCDC121 is predicted to adopt a predominantly alpha-helical secondary structure due to the presence of its coiled-coil motifs. [17]

The tertiary structure of CCDC121 is composed mostly of alpha helices and contains some random coil. [18]

Domains, motifs and post-translational modifications

CCDC121 contains one domain of unknown function, DUF4515 or pfam14988. It also contains three predicted coiled-coil motifs from residues A165 to E192, L264 to E305 and N363 to E397. [19]

CCDC121 is predicted to have post-translational modification sites for acetylation, [20] [21] protein kinase C and casein kinase II phosphorylation, [22] glycation, [23] GalNAc O-glycosylation, [24] SUMOylation, [25] [26] and O-β-GlcNAc attachment. [27]

Subcellular localization

Current evidence suggests that CCDC121 is partially localized in the nucleus. CCDC121 has a predicted nuclear localization signal from amino acids R327 to L337. This sequence has a score of 7, which is consistent with partial nuclear localization. [28] In addition, PSORT II predicted a 56.5% chance that CCDC121 is found in the nucleus. [29]

There is also evidence that CCDC121 is partially located in the cytosol. Cytochemistry studies using the anti-CCDC121 antibody from The Human Protein Atlas indicate that CCDC121 is present in the cytosol and at actin filaments. These results are tentative, and studies with other anti-CCDC121 antibodies are needed to confirm them. [30] Additionally, TargetP did not find a mitochondrial transit peptide, which suggests that CCDC121 is likely not a mitochondrial protein. [31]

Expression and function

CCDC121 is expressed at the highest levels in the testes, ovaries, prostate, and thyroid. [32] Its expression is 40% lower than that of the average gene, so it is considered to have a low level of expression. [7]

The function of the CCDC121 protein is not yet well understood. There is no known phenotype associated with the CCDC121 gene. [7]

Homology

Rate of molecular evolution

Figure: CCDC121 rate of molecular evolution.

Cytochrome c is a highly conserved protein, and fibrinogen is a rapidly evolving protein. CCDC121 has a faster rate of molecular evolution than both of these proteins, suggesting that it evolves very rapidly.

Orthologs

There are 126 confirmed orthologs of CCDC121, [33] and they are most abundant in mammals. Of the 126 orthologs, 122 fall within the Eutheria, Marsupialia, and Monotremata clades, and 119 of these 122 mammalian orthologs are found in eutherian mammals. The four remaining orthologs are found in the two-lined caecilian, the West Indian Ocean coelacanth, the electric eel, and the northern pike. These species represent the Amphibia (caecilian), Sarcopterygii (coelacanth), and Actinopterygii (electric eel and northern pike) clades. The CCDC121 gene likely appeared 433 million years ago in a common ancestor of Actinopterygii and Sarcopterygii.

Paralogs

CCDC166 is the only known paralog of CCDC121. The two proteins share 23% sequence identity. Both CCDC121 and CCDC166 include the domain of unknown function 4515 (DUF4515), or pfam14988, as a highly conserved sequence. [34]
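A percent identity figure of this kind is computed from a pairwise alignment. The minimal sketch below counts identical residues over the columns where neither sequence has a gap. The choice of denominator is an assumption made here for illustration; alignment tools differ on this convention, so the result need not match what a given tool reports. The aligned strings are hypothetical and are not taken from the CCDC121 and CCDC166 alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over gap-free columns of two aligned sequences.

    Both inputs must come from the same alignment, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments for illustration only.
print(percent_identity("MK-LEQ", "MRQLEK"))
```

Identity in the low twenties, as reported for this paralog pair, is generally considered distant similarity, which is why the shared conserved domain is the more informative evidence of relatedness.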

Clinical significance

Mutations in the CCDC121 gene have been found in patients with certain cancers, including endometrial, lung, bladder, gastric, head and neck, and prostate cancers, but no causal relationship has been determined. [35] [36] CCDC121 may also serve as a marker gene for inner ear development. [37]

References

  1. GRCh38: Ensembl release 89: ENSG00000176714 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000050625 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. GeneCards entry on CCDC121
  6. HGNC (HUGO Gene Nomenclature Committee) entry on CCDC121
  7. NCBI-Aceview entry on CCDC121
  8. PREDICTED: Homo sapiens coiled-coil domain containing 121 (CCDC121), transcript variant X1, mRNA. NCBI Nucleotide.
  9. Homo sapiens coiled-coil domain containing 121 (CCDC121), transcript variant 2, mRNA. NCBI Nucleotide.
  10. Homo sapiens coiled-coil domain containing 121 (CCDC121), transcript variant 3, mRNA. NCBI Nucleotide.
  11. coiled-coil domain-containing protein 121 isoform X1 [Homo sapiens]. NCBI Protein.
  12. coiled-coil domain-containing protein 121 isoform 2 [Homo sapiens]. NCBI Protein.
  13. coiled-coil domain-containing protein 121 isoform 3 [Homo sapiens]. NCBI Protein.
  14. ExPASy Compute pI/MW tool
  15. Statistical Analysis of Protein Sequences Tool
  16. Secondary Structure prediction for CCDC121. Ali2D
  17. Secondary Structure Prediction for CCDC121. Chou Fasman Secondary Structure Prediction Server (CFSSP).
  18. The Phyre2 web portal for protein modeling, prediction and analysis. Kelley LA et al. Nature Protocols 10, 845-858 (2015)
  19. Coiled-Coils Prediction. Prediction for CCDC121 isoform 1
  20. NETAcet-1.0 Server
  21. Terminus--N-Terminal PTM prediction. Swiss Institute of Bioinformatics
  22. NetPhos-3.1 Server. Prediction for CCDC121
  23. NetGlycate-1.0 Server. Prediction for CCDC121
  24. NetOGlyc-4.0 Server. Prediction for CCDC121
  25. SUMOplot™ Analysis Program. Prediction for CCDC121
  26. GPS-SUMO: Prediction of SUMOylation Sites & SUMO-binding Motifs. Prediction for CCDC121.
  27. YinOYang 1.2 Server. Prediction for CCDC121
  28. "NLS Mapper. Prediction for CCDC121". Archived from the original on 2021-11-22. Retrieved 2020-05-03.
  29. PSORT II Prediction Tool
  30. Human Protein Atlas entry on CCDC121
  31. TargetP-2.0 Server
  32. NCBI GeoProfile entry on CCDC121. GDS3113 Various normal tissues
  33. NCBI Gene Database entry on CCDC121
  34. Clustal Omega: Multiple Sequence Alignment Tool. Alignment of CCDC166 and CCDC121 isoforms 1 and 2.
  35. Zhang J, Huang JY, Chen YN, Yuan F, Zhang H, Yan FH, et al. (October 2015). "Erratum: Whole genome and transcriptome sequencing of matched primary and peritoneal metastatic gastric carcinoma". Scientific Reports. 5: 15309. Bibcode:2015NatSR...515309Z. doi:10.1038/srep15309. PMC 4613365. PMID 26485306.
  36. PhosphoSitePlus® entry on CCDC121 protein
  37. Liu, Q., Chen, J., Gao, X., Ding, J., Tang, Z., Zhang, C., … Wang, J. (2015). Identification of stage-specific markers during the differentiation of hair cells from mouse inner ear stem cells or progenitor cells in vitro. International Journal of Biochemistry and Cell Biology, 60, 99–111. https://doi.org/10.1016/j.biocel.2014.12.024