CCDC74A | |||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Identifiers | |||||||||||||||||||||||||||||||||||||||||||||||||||
Aliases | CCDC74A , coiled-coil domain containing 74A | ||||||||||||||||||||||||||||||||||||||||||||||||||
External IDs | GeneCards: CCDC74A | ||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||
Wikidata | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
Coiled-coil domain containing 74A is a protein that in humans is encoded by the CCDC74A gene. [3] The protein is most highly expressed in the testis and may play a role in developmental pathways. [4] The gene has undergone duplication in the primate lineage within the last 9 million years, and its only true ortholog is found in Pan troglodytes.
The gene locus is located on the long arm of chromosome 2 at 2q21.1, and spans 5991 base pairs. [5] A common alternative alias is LOC90557. [6]
The mRNA encoding the largest peptide product, isoform 6, contains 8 exons and 9 introns. It is 1842bps in length. Altogether, 11 protein isoforms have been characterized as a result of alternative splicing. [7]
The longest CCDC74A peptide product, isoform 6, is 420 amino acids in length. [8] This protein has a predicted molecular weight of 45.9kD and a predicted isoelectric point of 10.65. [9] The entire length of the protein is evenly enriched in lysine and arginine residues. The protein contains 2 eukaryotic coiled-coil domains of unknown function, CCDC92 and CCDC74C. [10] Its predicted localization is to the nucleus, but the protein may shuttle between the nucleus and the cytoplasm due to the presence of both a nuclear localization signal and a nuclear export signal. [11]
Predicted secondary structure for CCDC74A consists of 4 alpha helix regions, which are summarized in the table below and the diagram to the right. [12]
Structure | Start | End |
---|---|---|
Alpha Helix 1 | 47 | 81 |
Alpha Helix 2 | 315 | 330 |
Alpha Helix 3 | 371 | 378 |
Alpha Helix 4 | 384 | 417 |
A threonine residue (T395) which is highly conserved across Animalia orthologs may serve as a phosphorylation site by PKG kinase. [13] Additionally, SUMOylation, methylation, and acetylation sites are predicted within highly conserved regions and may play a part in regulation. [14] [15] These predicted post-translational modifications and conserved domains are summarized in the diagram to the right.
In humans, CCDC74A has one important paralog, CCDC74B. Gene duplication is estimated to have occurred approximately 7 million years ago (MYA). As such, the only true ortholog of CCDC74A is found in Pan troglodytes, and is not found in Gorilla gorilla. However, distant orthologs prior to gene duplication are conserved in species that diverged from humans between 92-797 MYA. This includes species as distant as Cnidaria, but excludes Porifera or species outside of the kingdom Animalia.
CCDC74A localization, expression, and interactions suggest that the protein may play a role in the expression of genes related to developmental and differentiation pathways, particularly during spermatogenesis.
The protein has been found most highly expressed in the testes and trachea. It is also expressed at moderate levels in the lung, brain, prostate, spinal cord, bone marrow, ovary, thymus, and thyroid gland. [16]
Consistent with predicted post-translational methylation, CCDC74A has been shown to interact with the lysine demethylase KDM1A through a yeast 2-hybrid assay. [17] Additionally, through a yeast 2-hybrid assay, CCDC74A has been shown to interact with the lymphocyte activation molecule associated protein SH2D1A. [18]
In a study on androgen-independent prostate cancer, knockout of CCDC74A in androgen-dependent prostate cancer inhibited cell proliferation. [19] Experiments in genital fibroblast cells have shown that CCDC74A expression significantly increases upon exposure to dihydrotestosterone. [20]
MAP11 is a protein that in human is encoded by the gene MAP11. It was previously referred to by the generic name C7orf43. C7orf43 has no other human alias, but in mice can be found as BC037034.
Transmembrane protein 33 is a protein that in humans, is encoded by the TMEM33 gene, also known as SHINC3. Another name for the TMEM33 protein is DB83.
FAM76A is a protein that in Homo sapiens is encoded by the FAM76A gene. Notable structural characteristics of FAM76A include an 83 amino acid coiled coil domain as well as a four amino acid poly-serine compositional bias. FAM76A is conserved in most chordates but it is not found in other deuterostrome phlya such as echinodermata, hemichordata, or xenacoelomorpha—suggesting that FAM76A arose sometime after chordates in the evolutionary lineage. Furthermore, FAM76A is not found in fungi, plants, archaea, or bacteria. FAM76A is predicted to localize to the nucleus and may play a role in regulating transcription.
Chromosome 8 open reading frame 82 is a protein encoded in humans by the C8orf82 gene.
Zinc Finger Protein 800 or ZNF800 is a protein that in humans is encoded by the ZNF800 gene. The specific function of ZNF800 is not yet well understood by the scientific community.
Chromosome 19 open reading frame 44 is a protein that in humans is encoded by the C19orf44 gene. C19orf44 is an uncharacterized protein with an unknown function in humans. C19orf44 is non-limiting implying that the protein exists in other species besides human. The protein contains one domain of unknown function (DUF) that is highly conserved throughout its orthologs. This protein is most highly expressed in the testis and ovary, but also has significant expression in the thyroid and parathyroid. Other names for this protein include: LOC84167.
C2orf16 is a protein that in humans is encoded by the C2orf16 gene. Isoform 2 of this protein is 1,984 amino acids long. The gene contains 1 exon and is located at 2p23.3. Aliases for C2orf16 include Open Reading Frame 16 on Chromosome 2 and P-S-E-R-S-H-H-S Repeats Containing Sequence.
Chromosome X Open Reading Frame 38 (CXorf38) is a protein which, in humans, is encoded by the CXorf38 gene. CXorf38 appears in multiple studies regarding the escape of X chromosome inactivation.
LOC101928193 is a protein which in humans is encoded by the LOC101928193 gene. There are no known aliases for this gene or protein. Similar copies of this gene, called orthologs, are known to exist in several different species across mammals, amphibians, fish, mollusks, cnidarians, fungi, and bacteria. The human LOC101928193 gene is located on the long (q) arm of chromosome 9 with a cytogenic location at 9q34.2. The molecular location of the gene is from base pair 133,189,767 to base pair 133,192,979 on chromosome 9 for an mRNA length of 3213 nucleotides. The gene and protein are not yet well understood by the scientific community, but there is data on its genetic makeup and expression. The LOC101928193 protein is targeted for the cytoplasm and has the highest level of expression in the thyroid, ovary, skin, and testes in humans.
TMEM128, also known as Transmembrane Protein 128, is a protein that in humans is encoded by the TMEM128 gene. TMEM128 has three variants, varying in 5' UTR's and start codon location. TMEM128 contains four transmembrane domains and is localized in the Endoplasmic Reticulum membrane. TMEM128 contains a variety of regulation at the gene, transcript, and protein level. While the function of TMEM128 is poorly understood, it interacts with several proteins associated with the cell cycle, signal transduction, and memory.
WD Repeat and Coiled-coiled containing protein (WDCP) is a protein which in humans is encoded by the WDCP gene. The function of the protein is not completely understood, but WDCP has been identified in a fusion protein with anaplastic lymphoma kinase found in colorectal cancer. WDCP has also been identified in the MRN complex, which processes double-stranded breaks in DNA.
Coiled-coil domain containing 121 (CCDC121) is a protein encoded by the CCDC121 gene in humans. CCDC121 is located on the minus strand of chromosome 2 and encodes three protein isoforms. All isoforms of CCDC121 contain a domain of unknown function referred to as DUF4515 or pfam14988.
Transmembrane protein 39B (TMEM39B) is a protein that in humans is encoded by the gene TMEM39B. TMEM39B is a multi-pass membrane protein with eight transmembrane domains. The protein localizes to the plasma membrane and vesicles. The precise function of TMEM39B is not yet well-understood by the scientific community, but differential expression is associated with survival of B cell lymphoma, and knockdown of TMEM39B is associated with decreased autophagy in cells infected with the Sindbis virus. Furthermore, the TMEM39B protein been found to interact with the SARS-CoV-2 ORF9C protein. TMEM39B is expressed at moderate levels in most tissues, with higher expression in the testis, placenta, white blood cells, adrenal gland, thymus, and fetal brain.
Family with sequence 98, member C or FAM98C is a gene that encodes for FAM98C has two aliases FLJ44669 and hypothetical protein LOC147965. FAM98C has two paralogs in humans FAM98A and FAM98B. FAM98C can be characterized for being a Leucine-rich protein. The function of FAM98C is still not defined. FAM98C has orthologs in mammals, reptiles, and amphibians and has a distant orhtologs in Rhinatrema bivittatum and Nanorana parkeri.
Coiled-Coil Domain Containing 190, also known as C1orf110, the Chromosome 1 Open Reading Frame 110, MGC48998 and CCDC190, is found to be a protein coding gene widely expressed in vertebrates. RNA-seq gene expression profile shows that this gene selectively expressed in different organs of human body like lung brain and heart. The expression product of c1orf110 is often called Coiled-coil domain-containing protein 190 with a size of 302 aa. It may get the name because a coiled-coil domain is found from position 14 to 72. At least 6 spliced variants of its mRNA and 3 isoforms of this protein can be identified, which is caused by alternative splicing in human.
C13orf42 is a protein which, in humans, is encoded by the gene chromosome 13 open reading frame 42 (C13orf42). RNA sequencing data shows low expression of the C13orf42 gene in a variety of tissues. The C13orf42 protein is predicted to be localized in the mitochondria, nucleus, and cytosol. Tertiary structure predictions for C13orf42 indicate multiple alpha helices.
THAP domain-containing protein 3 (THAP3) is a protein that, in Homo sapiens (humans), is encoded by the THAP3 gene. The THAP3 protein is as known as MGC33488, LOC90326, and THAP domain-containing, apoptosis associated protein 3. This protein contains the Thanatos-associated protein (THAP) domain and a host-cell factor 1C binding motif. These domains allow THAP3 to influence a variety of processes, including transcription and neuronal development. THAP3 is ubiquitously expressed in H. sapiens, though expression is highest in the kidneys.
Chromosome 13 Open Reading Frame 46 is a protein which in humans is encoded by the C13orf46 gene. In humans, C13orf46 is ubiquitously expressed at low levels in tissues, including the lungs, stomach, prostate, spleen, and thymus. This gene encodes eight alternatively spliced mRNA transcript, which produce five different protein isoforms.
C10orf53 is a protein that in humans is encoded by the C10orf53 gene. The gene is located on the positive strand of the DNA and is 30,611 nucleotides in length. The protein is 157 amino acids and the gene has 3 exons. C10orf53 orthologs are found in mammals, birds, reptiles, amphibians, fish, and invertebrates. It is primarily expressed in the testes and at very low levels in the cerebellum, liver, placenta, and trachea.
Secernin-3 (SCRN3) is a protein that is encoded by the human SCRN3 gene. SCRN3 belongs to the peptidase C69 family and the secernin subfamily. As a part of this family, the protein is predicted to enable cysteine-type exopeptidase activity and dipeptidase activity, as well as be involved in proteolysis. It is ubiquitously expressed in the brain, thyroid, and 25 other tissues. Additionally, SCRN3 is conserved in a variety of species, including mammals, birds, fish, amphibians, and invertebrates. SCRN3 is predicted to be an integral component of the cytoplasm.