CCDC121 identifiers | |
---|---|
Aliases | CCDC121, coiled-coil domain containing 121 |
External IDs | MGI: 2685601, HomoloGene: 136217, GeneCards: CCDC121 |
Coiled-coil domain containing 121 (CCDC121) is a protein encoded by the CCDC121 gene in humans. CCDC121 is located on the minus strand of chromosome 2 and encodes three protein isoforms. [5] All isoforms of CCDC121 contain a domain of unknown function referred to as DUF4515 or pfam14988.
Known aliases of CCDC121 include FLJ43364, FLJ13646, hCG_1988995, LOC79635, and coiled-coil domain containing 121. [6]
CCDC121 is located on the minus strand of chromosome 2 at 2q23.3. It is 3,394 base pairs in length. [5]
CCDC121 produces four different mRNAs: three alternatively spliced variants and one unspliced form. [7] The three alternatively spliced mRNAs give rise to three known protein isoforms. Transcripts for isoforms 1-3 are 2,880, 2,762 and 2,361 base pairs in length respectively. [8] [9] [10] Each of the mRNA variants contains two exons separated by a gt-ag intron. [5]
Protein | Accession Number | Length (amino acids) | Molecular Weight (kDa) | Predicted pI |
---|---|---|---|---|
Isoform 1 [11] | XP_005264617 | 442 | 50.8 | 9.80 |
Isoform 2 [12] | NP_001136155 | 440 | 50.9 | 9.81 |
Isoform 3 [13] | NP_078860 | 278 | 33.1 | 9.84 |
Molecular weight and pI were calculated using ExPASy. [14]
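As a rough illustration of how a molecular weight like those in the table can be derived from a sequence, the sketch below sums average residue masses and adds one water molecule for the free termini. It is not the ExPASy implementation, it does not compute pI, and the function name `molecular_weight_kda` and the toy peptide are hypothetical; the actual CCDC121 isoform sequences are not reproduced here.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight_kda(sequence: str) -> float:
    """Return the average molecular weight of a protein sequence in kDa."""
    seq = sequence.upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return (sum(RESIDUE_MASS[aa] for aa in seq) + WATER) / 1000.0


# Hypothetical toy peptide, used only to show the call.
toy_peptide = "MKQLEDKR"
print(round(molecular_weight_kda(toy_peptide), 3))
```

For scale, the table's 50.8 kDa for a 442-residue protein works out to roughly 115 Da per residue, which is in the range of the per-residue masses listed above.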
Compositional analysis of all isoforms shows that they have below-average levels of aspartate (D) and valine (V) and above-average levels of glutamine (Q). In addition, they have above-average levels of lysine (K) and arginine (R) groupings. Isoform 3 also exhibits above-average lysine levels and below-average proline and glycine levels. Chimpanzee, dog, and ferret orthologs also exhibited above-average glutamine levels and lysine and arginine groupings. [15]
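A compositional analysis of this kind starts from the percentage of each residue in the sequence. The sketch below computes only those raw percentages; it does not reproduce the statistical comparison against an average proteome that the cited analysis performs, and the function name `composition_percent` and the toy sequence are hypothetical.

```python
from collections import Counter


def composition_percent(sequence: str) -> dict[str, float]:
    """Return the percentage of each residue in a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


# Hypothetical toy sequence, not a CCDC121 isoform.
toy_sequence = "MKQQLEKRQQ"
for residue, pct in sorted(composition_percent(toy_sequence).items()):
    print(f"{residue}: {pct:.1f}%")
```

Comparing these percentages between isoforms or orthologs would be one simple way to see whether, for example, glutamine is consistently enriched.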
The secondary structure prediction for CCDC121 was obtained using Ali2D. [16] CCDC121 adopts a predominantly alpha-helical secondary structure due to the presence of the coiled-coil motifs. [17]
The tertiary structure of CCDC121 is composed mostly of alpha helices and contains some random coil. [18]
CCDC121 contains one domain of unknown function, DUF4515 or pfam14988. It also contains three predicted coiled-coil motifs from residues A165 to E192, L264 to E305 and N363 to E397. [19]
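Coiled coils are built from seven-residue (heptad) repeats, so it can be useful to express each predicted motif as a number of heptads. The short sketch below does this arithmetic for the three residue ranges given above; the names `coiled_coils` and `segment_length` are illustrative only.

```python
HEPTAD = 7  # residues per heptad repeat

# Predicted coiled-coil motifs as (first residue, last residue), inclusive.
coiled_coils = [(165, 192), (264, 305), (363, 397)]


def segment_length(start: int, end: int) -> int:
    """Length of an inclusive residue range."""
    return end - start + 1


for start, end in coiled_coils:
    length = segment_length(start, end)
    heptads, remainder = divmod(length, HEPTAD)
    print(f"{start}-{end}: {length} residues, {heptads} heptads, remainder {remainder}")
```

The three motifs span 28, 42 and 35 residues, that is, 4, 6 and 5 complete heptads. This may reflect how the prediction tool reports segments rather than exact structural boundaries.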
CCDC121 is predicted to have post-translational modification sites for: acetylation, [20] [21] Protein Kinase C and Casein Kinase II phosphorylation, [22] glycation, [23] GalNAc O-glycosylation, [24] SUMOylation, [25] [26] and O-β-GlcNAc attachment. [27]
Current evidence suggests that CCDC121 is partially localized in the nucleus. CCDC121 has a predicted nuclear localization signal from amino acids R327 to L337. This sequence has a score of 7, which is consistent with partial nuclear localization. [28] In addition, PSORT II predicted a 56.5% chance that CCDC121 is found in the nucleus. [29]
There is also evidence that CCDC121 is partially located in the cytosol. Immunocytochemistry data for an anti-CCDC121 antibody from The Human Protein Atlas indicate that CCDC121 is localized to the cytosol and actin filaments. These results are tentative, and validation with other anti-CCDC121 antibodies is needed. [30] Additionally, TargetP did not find a mitochondrial targeting peptide, which suggests that CCDC121 is likely not a mitochondrial protein. [31]
CCDC121 is expressed at the highest levels in the testes, ovaries, prostate, and thyroid. [32] It is expressed 40% less than the average gene, so it is considered to have a low level of expression. [7]
The function of CCDC121 protein is not yet well understood by the scientific community. There is no known phenotype associated with the CCDC121 gene. [7]
Cytochrome c is a highly conserved protein, whereas fibrinogen is a rapidly evolving protein. CCDC121 has a faster rate of molecular evolution than both of these proteins, suggesting that CCDC121 evolves very rapidly.
There are 126 confirmed orthologs of CCDC121. [33] CCDC121 orthologs are most abundant in mammals. 122 of the 126 orthologs are within the Eutheria, Marsupialia, and Monotremata clades. 119 of the 122 mammalian orthologs are found within eutherian mammals. The four remaining orthologs are the two-lined caecilian, the West Indian Ocean coelacanth, the electric eel, and the northern pike. These orthologs represent the Amphibia, Sarcopterygii, and Actinopterygii clades respectively. The CCDC121 gene likely appeared 433 million years ago in a common ancestor of Actinopterygii and Sarcopterygii.
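The ortholog counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the three counts stated in the text; the variable names are illustrative.

```python
total_orthologs = 126
mammalian = 122   # Eutheria, Marsupialia, and Monotremata combined
eutherian = 119

non_mammalian = total_orthologs - mammalian       # caecilian, coelacanth, eel, pike
non_eutherian_mammals = mammalian - eutherian     # marsupials plus monotremes

print(f"non-mammalian orthologs: {non_mammalian}")
print(f"marsupial and monotreme orthologs: {non_eutherian_mammals}")
print(f"mammalian share: {100 * mammalian / total_orthologs:.1f}%")
print(f"eutherian share: {100 * eutherian / total_orthologs:.1f}%")
```

The counts are consistent: four orthologs fall outside mammals, matching the four species named, and about 97% of all orthologs are mammalian.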
CCDC166 is the only known paralog of CCDC121. The two proteins share 23% sequence identity. Both CCDC121 and CCDC166 include the Domain of Unknown Function 4515 (DUF4515), or pfam14988, as a highly conserved sequence. [34]
Mutations in the CCDC121 gene have been found in patients with certain cancers, such as endometrial, lung, bladder, gastric/stomach, head/neck, and prostate cancers, but no causal relationship has been determined. [35] [36] CCDC121 may also serve as a marker gene for inner ear development. [37]
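A sequence identity figure such as this one is computed from a pairwise alignment. The sketch below shows one common convention, assuming two pre-aligned sequences of equal length with `-` as the gap character and using the number of aligned columns as the denominator. Alignment tools differ in the denominator they use, so this is an assumption and not necessarily how the cited value was obtained. The function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment, not real CCDC121 or CCDC166 sequence.
print(f"{percent_identity('MKQ-LE', 'MRQALE'):.1f}%")
```

In the toy alignment, four of six columns match, giving 66.7% identity.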