CCDC94

YJU2
Identifiers
Aliases: YJU2, coiled-coil domain containing 94, CCDC94, YJU2 splicing factor homolog
External IDs: MGI: 1920136; HomoloGene: 6350; GeneCards: YJU2
Orthologs (Human / Mouse)
RefSeq (mRNA): NM_018074 / NM_028381
RefSeq (protein): NP_060544 / NP_082657
Location (UCSC): Chr 19: 4.25 – 4.27 Mb / Chr 17: 56.27 – 56.28 Mb
PubMed search: [3] / [4]

Coiled-coil domain containing 94 (CCDC94) is a protein that in humans is encoded by the CCDC94 gene. [5] The CCDC94 protein contains a coiled-coil domain, a domain of unknown function (DUF572), and an uncharacterized conserved protein domain (COG5134); it lacks a transmembrane domain.


Gene

Overview

Genomic location of CCDC94 at 19p13.3

CCDC94 is a 21,975 base pair gene oriented on the plus strand of chromosome 19, spanning positions 4,247,111-4,269,085. [5] The gene product is a 1,441 base pair mRNA, and the human gene has 8 predicted exons. Ensembl predicts one protein-coding alternative splice form. [7] This splice form contains 5 exons, 4 of which are coding exons. Promoter prediction and analysis were carried out using ElDorado. [8] The predicted promoter region spans 714 base pairs, from 4,246,532 to 4,247,245, on the plus strand of chromosome 19.
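The lengths quoted above follow from the coordinates if the intervals are treated as 1-based and inclusive. The short Python sketch below checks that arithmetic and also works out how much of the predicted promoter overlaps the annotated gene; the function names are illustrative, and the inclusive-coordinate convention is an assumption rather than something stated by the cited sources.

```python
# Coordinates quoted in the text (chromosome 19, plus strand).
# Assumption: both intervals are 1-based and inclusive.
GENE = (4_247_111, 4_269_085)
PROMOTER = (4_246_532, 4_247_245)


def span_length(interval):
    """Length in base pairs of a 1-based, inclusive interval."""
    start, end = interval
    return end - start + 1


def overlap_length(a, b):
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


gene_bp = span_length(GENE)
promoter_bp = span_length(PROMOTER)
shared_bp = overlap_length(GENE, PROMOTER)   # promoter bases inside the gene
upstream_bp = promoter_bp - shared_bp        # promoter bases before the gene start

print(f"gene: {gene_bp} bp, promoter: {promoter_bp} bp")
print(f"promoter upstream of gene start: {upstream_bp} bp, overlapping: {shared_bp} bp")
```

Under this convention the gene and promoter lengths match the figures in the text, and most of the predicted promoter lies upstream of the annotated gene start.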

Gene neighborhood

CCDC94 is located directly adjacent to the EBI3 gene (4,229,540-4,237,525) on the positive DNA strand. The SH2 domain gene (4,278,598-4,290,720) lies at higher coordinates than CCDC94 on the positive strand, and is therefore downstream of it. [9]
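The order of the three genes along chromosome 19 and the size of the gaps between them can be read off the quoted coordinates. The sketch below assumes inclusive coordinates on a shared reference assembly; the tuple layout and the `gap_bp` helper are illustrative only.

```python
# (name, start, end) as quoted in the text.
# Assumption: inclusive coordinates on the same chromosome 19 assembly.
genes = [
    ("SH2 domain gene", 4_278_598, 4_290_720),
    ("CCDC94", 4_247_111, 4_269_085),
    ("EBI3", 4_229_540, 4_237_525),
]

# Sort by start coordinate to recover the order along the plus strand.
ordered = sorted(genes, key=lambda gene: gene[1])


def gap_bp(left, right):
    """Bases strictly between the end of `left` and the start of `right`."""
    return right[1] - left[2] - 1


gaps = {
    (left[0], right[0]): gap_bp(left, right)
    for left, right in zip(ordered, ordered[1:])
}

print(" -> ".join(gene[0] for gene in ordered))
for (left_name, right_name), size in gaps.items():
    print(f"{left_name} to {right_name}: {size} bp")
```

Both neighbours sit a little under 10 kb from CCDC94 by this calculation.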

Gene expression

CCDC94 is expressed at low to moderate levels throughout most regions of the body. Slightly elevated levels are found in the thyroid, lung, dendritic cells, and lymphoblasts. Expression data are available at BioGPS, [10] and GEO expression profiles are available from NCBI. [11]

CCDC94 GEO profile expression in normal tissues.

Protein

Properties and characteristics

CCDC94 belongs to the CWC16 family, [12] and its function is not well understood. The human form has 323 amino acid residues, with an isoelectric point of 5.618 and a molecular mass of 37,086 daltons. There are no predicted transmembrane domains. [13] The one alternative splice form of CCDC94 encodes a protein of 161 amino acids. [14] The DUF572 and COG5134 domains are located at residues 1–319 and 7–108, respectively. [15] The coiled-coil region is located at residues 105–206. [16] The intracellular localization of CCDC94 has not yet been determined experimentally, but bioinformatic analysis using PSORT strongly suggests that CCDC94 resides in the nucleus, owing to the presence of nuclear localization signals. [17]
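The residue ranges above can be turned into domain lengths, coverage of the full protein, and pairwise overlaps with a few lines of Python. This is a minimal sketch that assumes the ranges are 1-based and inclusive; the dictionary and helper names are illustrative.

```python
# Values quoted in the text for the human protein.
PROTEIN_LENGTH = 323        # amino acid residues
MOLECULAR_MASS_DA = 37_086  # daltons

# Assumption: residue ranges are 1-based and inclusive.
DOMAINS = {
    "DUF572": (1, 319),
    "COG5134": (7, 108),
    "coiled-coil": (105, 206),
}


def length(region):
    """Number of residues in an inclusive residue range."""
    return region[1] - region[0] + 1


def overlap(a, b):
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


for name, region in DOMAINS.items():
    n = length(region)
    print(f"{name}: {n} residues ({n / PROTEIN_LENGTH:.1%} of the protein)")

shared = overlap(DOMAINS["COG5134"], DOMAINS["coiled-coil"])
print(f"COG5134 / coiled-coil overlap: {shared} residues")

# Average mass per residue, a quick sanity check on the quoted mass.
mean_residue_mass = MOLECULAR_MASS_DA / PROTEIN_LENGTH
print(f"mean residue mass: {mean_residue_mass:.1f} Da")
```

The annotated DUF572 range covers nearly the whole protein, while the COG5134 and coiled-coil ranges overlap by only a few residues.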

The CCDC94 protein construct, including the COG5134, DUF572, and coiled-coil domains.

Protein interactions

Protein interaction analysis for CCDC94 has been carried out using computational tools. No interactions were identified in the MINT database. [18] STRING reports CDC5L, PLRG1, and PRPF19 as the highest-scoring interaction partners of CCDC94, based on an anti-tag coimmunoprecipitation assay. [19] Six additional interacting proteins were found. Closer analysis shows very little potential for these interactions to be real, so none should be considered actual protein-protein interactions. The protein interaction network from the STRING analysis is shown.


Transcription factors

The CCDC94 promoter region contains binding sites for transcription factors. Notable transcription factors predicted by the ElDorado program from Genomatix include: [20]

  • Myeloid zinc finger protein (MZF1)
  • Forkhead box H1 (Foxh1)
  • Polyomavirus enhancer A binding protein 3 (ETV4)
  • E2F-myc activator/cell cycle regulator (E2F)
  • SPI-1 proto-oncogene; hematopoietic transcription factor (PU1)

Post-translational modifications

Bioinformatic analysis of CCDC94 using NetPhos [21] predicted 7 phosphorylation sites at serine residues, 3 at threonine residues, and 3 at tyrosine residues. Two of the predicted threonine sites and all of the predicted tyrosine sites are highly conserved, as supported by their occurrence at the same positions in several analyzed orthologs. High-scoring predicted tyrosine phosphorylation sites occur in the N-terminal half of CCDC94, while the predicted serine sites lie in the C-terminal half. Sulfinator predicted only one tyrosine sulfation site, at amino acid 98. [22] SUMOplot predicted highly probable sumoylation sites at residues 90, 24, and 270. [23]

Tertiary structure

The tertiary structure of CCDC94 is predicted to have several beta sheet regions and only one confidently predicted alpha helix region. PHYRE2 modeled 65 residues of CCDC94, about 20% of the amino acid sequence, with 87.9% confidence. [24]

CCDC94 tertiary structure as predicted by PHYRE2.

Homology

Orthologs

CCDC94 is very well conserved in many species, and the entire protein is conserved throughout all of its orthologs. [25] However, conservation does not extend as far back as bacteria. A phylogenetic tree generated with Biology WorkBench [26] shows the evolutionary relationships between Homo sapiens CCDC94 and its orthologs. The table below shows CCDC94 conservation among orthologs:

| Genus species | Common name | Divergence from humans (MYA) [27] | NCBI protein accession | Sequence similarity [25] | Protein length |
|---|---|---|---|---|---|
| Pan paniscus | Bonobo | 6.3 | XP_003819321.1 | 99% | 323 |
| Gorilla gorilla gorilla | Gorilla | 8.8 | XP_004059817.1 | 98% | 286 |
| Callithrix jacchus | Common marmoset | 42.6 | XP_002761642.1 | 83% | 278 |
| Mus musculus | Mouse | 92.3 | NP_082657.1 | 87% | 314 |
| Rattus norvegicus | Rat | 92.4 | NP_001103143.1 | 87% | 313 |
| Cricetulus griseus | Chinese hamster | 92.4 | XP_003501789.1 | 85% | 321 |
| Bos taurus | Cow | 94.4 | NP_001069159.1 | 89% | 320 |
| Felis catus | Cat | 94.4 | XP_003981794.1 | 73% | 363 |
| Sarcophilus harrisii | Tasmanian devil | 163.9 | XP_003760628.1 | 78% | 326 |
| Monodelphis domestica | Opossum | 163.9 | XP_001374444.1 | 86% | 326 |
| Gallus gallus | Red junglefowl | 296.4 | XP_423475.3 | 84% | 291 |
| Anolis carolinensis | Lizard | 324.5 | XP_003230268.1 | 72% | 311 |
| Xenopus tropicalis | Western clawed frog | 342.7 | NP_001017176.1 | 73% | 345 |
| Xenopus laevis | African clawed frog | 371.2 | NP_001087648.1 | 83% | 280 |
| Takifugu rubripes | Puffer fish | 454.6 | XP_003962830.1 | 64% | 348 |
| Acyrthosiphon pisum | Pea aphid (insect) | 910 | NP_001155925.1 | 49% | 278 |
| Harpegnathos saltator | Ant | 910 | EFN80619.1 | 47% | 351 |
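
One way to summarize the table is to ask how strongly sequence similarity falls off with divergence time. The sketch below copies the divergence and similarity columns from the table and computes a Pearson correlation coefficient by hand. It is only an illustration of the trend, not an analysis performed by the cited sources, and the hand-written `pearson` helper is an assumption made to keep the example dependency-free.

```python
from math import sqrt

# (species, divergence from humans in MYA, sequence similarity in %),
# copied from the ortholog table.
ORTHOLOGS = [
    ("Pan paniscus", 6.3, 99),
    ("Gorilla gorilla gorilla", 8.8, 98),
    ("Callithrix jacchus", 42.6, 83),
    ("Mus musculus", 92.3, 87),
    ("Rattus norvegicus", 92.4, 87),
    ("Cricetulus griseus", 92.4, 85),
    ("Bos taurus", 94.4, 89),
    ("Felis catus", 94.4, 73),
    ("Sarcophilus harrisii", 163.9, 78),
    ("Monodelphis domestica", 163.9, 86),
    ("Gallus gallus", 296.4, 84),
    ("Anolis carolinensis", 324.5, 72),
    ("Xenopus tropicalis", 342.7, 73),
    ("Xenopus laevis", 371.2, 83),
    ("Takifugu rubripes", 454.6, 64),
    ("Acyrthosiphon pisum", 910, 49),
    ("Harpegnathos saltator", 910, 47),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


divergence = [row[1] for row in ORTHOLOGS]
similarity = [row[2] for row in ORTHOLOGS]

r = pearson(divergence, similarity)
least_similar = min(ORTHOLOGS, key=lambda row: row[2])

print(f"Pearson r (divergence vs. similarity): {r:.2f}")
print(f"Least similar ortholog: {least_similar[0]} ({least_similar[2]}%)")
```

A negative coefficient is expected here, since the most distantly related species in the table (the two insects) also show the lowest similarity.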

Paralogs

CCDC94 has only one paralog, CCDC130 (also known as MGC10471). [28] CCDC130 is very similar to CCDC94, as it contains both the DUF572 and COG5134 domains. [29]


References

  1. GRCh38: Ensembl release 89: ENSG00000105248 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000003208 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Coiled-coil domain-containing 94 Homo sapiens". NCBI. Retrieved May 10, 2013.
  6. "Coiled-coil domain-containing 94". GeneCards. Retrieved May 12, 2013.
  7. "Transcript variants". Ensembl. Retrieved May 10, 2013.
  8. "ElDorado:Genomes and Annotation". Genomatix. Archived from the original on May 22, 2021. Retrieved May 11, 2013.
  9. "Coiled-coil domain-containing 94 Homo sapiens". NCBI. Retrieved May 11, 2013.
  10. "Tissue-specific mRNA expression". BioGPS. Retrieved May 11, 2013.
  11. "CCDC94:Multiple Normal Tissues". NCBI. Retrieved May 12, 2013.
  12. "GeneCards:CCDC94". GeneCards. Retrieved May 10, 2013.
  13. "Biology WorkBench SAPS Program". Biology WorkBench. Retrieved May 11, 2013. [permanent dead link]
  14. "Transcript: CCDC94". Ensembl. Retrieved May 11, 2013.
  15. "Coiled-coil domain-containing 94". NCBI. Retrieved May 11, 2013.
  16. "UniProt CCDC94". UniProt. Retrieved May 11, 2013.
  17. "PSORT Prediction". PSORT. Retrieved May 11, 2013.
  18. "MINT Protein Interactions". MINT.
  19. "Relevant datasets in Homo sapiens". STRING. Retrieved May 11, 2013.
  20. "ElDorado:Genome and Annotation". Genomatix. Archived from the original on May 22, 2021. Retrieved May 11, 2013.
  21. "NetPhos 2.0 server". ExPASy. Retrieved May 12, 2013.
  22. "The Sulfinator". ExPASy. Retrieved May 12, 2013.
  23. "SUMOplot Analysis Program". ABGENT. Retrieved May 12, 2013.
  24. "CCDC94 Tertiary Structure Prediction". Retrieved May 11, 2013.
  25. "BLAST". NCBI. Retrieved May 12, 2013.
  26. "Protein Analysis Tools". Biology WorkBench. Retrieved May 12, 2013. [permanent dead link]
  27. "Time Tree".
  28. "coiled-coil domain-containing 94". GeneCards. Retrieved May 11, 2013.
  29. "Coiled-coil domain-containing 130 Homo sapiens". NCBI. Retrieved May 11, 2013.