C7orf25 identifiers | Value |
---|---|
Aliases | C7orf25, C7orf25 protein UPF0415, chromosome 7 open reading frame 25 |
External IDs | MGI: 2145422; HomoloGene: 11439; GeneCards: C7orf25 |
C7orf25 protein UPF0415 (UPF0415) is a protein encoded on chromosome 7 by open reading frame 25 (C7orf25) and is associated with domain of unknown function 1308. C7orf25 is located on the minus strand and encodes 12 proteins, one of which is UPF0415. This protein is believed to be active in the proteasome pathway. C7orf25 protein UPF0415 is not a transmembrane protein and has no signal peptide. UPF0415 has two isoforms, Q9BPX7-1 and Q9BPX7-2. Both consist of two exons that are highly conserved among vertebrates.
Gene Size | Protein Size | # of exons | Promoter Sequence | Signal Peptide | Molecular Weight | Chromosome position | Protein Isoelectric point |
---|---|---|---|---|---|---|---|
1844 bp | 421 aa | 2 | ~600 bp | No | 46.45 kDa [5] | 7p14 | 5.76 [5] |
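As a rough sanity check on the table, the reported molecular weight can be compared with a length-based estimate. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue; that constant and the function name are illustrative assumptions, not values taken from the cited source.

```python
# Rule-of-thumb estimate: an average amino acid residue contributes ~110 Da.
# Assumption: generic average, not a value derived from this protein's sequence.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Return an approximate protein mass in kDa from its length in residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


estimate = estimate_mass_kda(421)  # protein length from the table (421 aa)
reported = 46.45                   # kDa, as reported in the table
print(f"estimate {estimate:.2f} kDa, reported {reported:.2f} kDa")
```

The estimate (about 46.3 kDa) lands close to the reported 46.45 kDa, which is what one would expect for a protein of ordinary composition. A sequence-based calculation would be needed for an exact figure.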
The C7orf25 open reading frame encodes 12 proteins, most of which are of unknown function. One of these proteins is UPF0415 and another is PSMA2, which also functions in the proteasome pathway. [6] Other genes located near C7orf25 protein UPF0415 are TCP1P1, HECW1, MIR3943 and MRPL32.
The promoter for the C7orf25 protein UPF0415 gene spans 600 base pairs, from 42,951,804 to 42,952,404, with a predicted transcriptional start site for a sequence of 1844 base pairs. The sequence spans from 42,908,726 to 42,912,090. [7] The promoter region and the beginning of the C7orf25 gene (20,008,263 to 20,009,250) are not conserved beyond primates. This region was used to determine transcription factor interactions.
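The coordinate ranges quoted above can be turned into lengths with simple arithmetic. The following is a minimal sketch; the helper name `span_length` and the choice to treat the end coordinate as exclusive by default are assumptions, since the text does not state which coordinate convention it uses.

```python
def span_length(start: int, end: int, inclusive: bool = False) -> int:
    """Length of a genomic interval; set inclusive=True for closed intervals."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + (1 if inclusive else 0)


# Coordinate pairs exactly as quoted in the text.
spans = {
    "promoter": (42_951_804, 42_952_404),
    "sequence": (42_908_726, 42_912_090),
    "promoter region and gene start": (20_008_263, 20_009_250),
}

for label, (start, end) in spans.items():
    print(f"{label}: {span_length(start, end)} bp")
```

The promoter coordinates give 600 bp, matching the stated promoter size. The other two ranges give 3,364 bp and 987 bp; these are genomic spans and are not the same quantity as the 1844 bp sequence length quoted in the text.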
Some of the main transcription factors that bind to the promoter are listed below. [8]
Transcription factor | Detailed family information | Start (nucleotide) | End (nucleotide) | Strand |
---|---|---|---|---|
XCPE | Activator-, mediator- and TBP-dependent core promoter element for RNA polymerase II transcription from TATA-less promoters | 360 | 370 | - |
MIZ1 | Myc-interacting Zn finger protein 1 | 120 | 130 | - |
PLAG1 | Pleomorphic adenoma gene | 160 | 182 | - |
FOX proteins | Fork head domain factors | 35 | 51 | + |
E2FF | E2F-myc activator/cell cycle regulator | 357 | 373 | - |
MZF1 | Myeloid zinc finger 1 factors | 497 | 507 | + |
HAND | Twist subfamily of class B bHLH transcription factors | 335 | 355 | - |
RU49 | Zinc finger transcription factor RU49, zinc finger proliferation 1 - Zipro1 | 651 | 657 | - |
NF-κB | Nuclear factor kappa B/c-rel | 219 | 233 | - |
E2FF | E2F-myc activator/cell cycle regulator | 492 | 508 | + |
SP1F | GC-Box factors SP1/GC | 490 | 506 | + |
IKRS | Ikaros zinc finger family | 64 | 76 | + |
STAT1 | Signal transducer and activator of transcription | 409 | 427 | - |
RBP2 | Retinoblastoma-binding proteins with demethylase activity | 538 | 546 | - |
YY1F | Activator/repressor binding to transcription initiation site | 553 | 575 | - |
SP1F | GC-Box factors SP1/GC | 359 | 375 | - |
HSF | Heat shock factors | 240 | 264 | - |
BPTF | Bromodomain and PHD domain transcription factors | 423 | 433 | + |
ZNF35 | Zinc finger protein ZNF35 | 309 | 321 | + |
ETSF | Human and murine ETS1 factors | 235 | 255 | + |
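Several of the sites in the table occupy overlapping positions in the promoter. The sketch below groups the listed sites into clusters of overlapping intervals, using only the start, end and strand values from the table. The function name and the decision to ignore strand when testing for overlap are illustrative assumptions, not part of the cited analysis.

```python
from typing import List, Tuple

Site = Tuple[str, int, int, str]  # (name, start, end, strand)

# Values copied from the table above.
SITES: List[Site] = [
    ("XCPE", 360, 370, "-"), ("MIZ1", 120, 130, "-"), ("PLAG1", 160, 182, "-"),
    ("FOX", 35, 51, "+"), ("E2FF", 357, 373, "-"), ("MZF1", 497, 507, "+"),
    ("HAND", 335, 355, "-"), ("RU49", 651, 657, "-"), ("NF-κB", 219, 233, "-"),
    ("E2FF", 492, 508, "+"), ("SP1F", 490, 506, "+"), ("IKRS", 64, 76, "+"),
    ("STAT1", 409, 427, "-"), ("RBP2", 538, 546, "-"), ("YY1F", 553, 575, "-"),
    ("SP1F", 359, 375, "-"), ("HSF", 240, 264, "-"), ("BPTF", 423, 433, "+"),
    ("ZNF35", 309, 321, "+"), ("ETSF", 235, 255, "+"),
]


def overlapping_clusters(sites: List[Site]) -> List[List[Site]]:
    """Group sites whose intervals overlap; return only groups of two or more."""
    clusters: List[List[Site]] = []
    current: List[Site] = []
    current_end = -1
    for site in sorted(sites, key=lambda s: (s[1], s[2])):
        _, start, end, _ = site
        if current and start <= current_end:
            # Site begins inside the running cluster: extend it.
            current.append(site)
            current_end = max(current_end, end)
        else:
            # Gap found: keep the finished cluster only if it has several sites.
            if len(current) > 1:
                clusters.append(current)
            current = [site]
            current_end = end
    if len(current) > 1:
        clusters.append(current)
    return clusters


for cluster in overlapping_clusters(SITES):
    first = cluster[0][1]
    last = max(end for _, _, end, _ in cluster)
    names = ", ".join(f"{name}({strand})" for name, _, _, strand in cluster)
    print(f"{first}-{last}: {names}")
```

With the table values, four clusters come out: around positions 235-264, 357-375, 409-433 and 490-508. Overlap of predicted sites does not by itself show that the factors compete or cooperate; it only marks regions where several predictions coincide.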
C7orf25 protein UPF0415 is believed to be active in ATP-dependent protein breakdown in the proteasome pathway. It is expressed ubiquitously in humans. [9]
UPF0415 protein C7orf25 has one paralog, FLJ18411. [10] UPF0415 is also highly conserved in vertebrates. The following table shows a small selection of orthologs found using BLAST and BLAT, together with their similarity to C7orf25 protein UPF0415.
Genus and species | Accession number | Similarity (aa) |
---|---|---|
Homo sapiens | NM_001099858 | - |
Bos taurus (Cow) | NM_001076140.1 | 95% |
Falco cherrug (Falcon) | XM_005446213.1 | 78% |
Xenopus laevis (Frog) | XM_005014944.1 | 73% |
UPF0415 protein C7orf25 is not a transmembrane protein, as it has no transmembrane domains. It has multiple phosphorylation sites, which are believed to be responsible for protein activation. [11]
Multiple stem loops have been predicted in both the 3′ and 5′ UTRs, and these are believed to function in gene transcription. [12]
Other proteins known to interact with UPF0415 protein C7orf25 are FRA10AC1, FLJ23825 and TUBB (tubulin, beta class I). Of these, only TUBB is associated with proteasome activity.
All post-translational modifications and genetic or proteomic factors mentioned above that are relevant to UPF0415 protein C7orf25 transcription and regulation are annotated in the conceptual translation.
C7orf25 encodes 12 different transcripts, two of which are PSMA2 and UPF0415. No specific phenotypes or polymorphisms have yet been linked to mutations in C7orf25. This suggests that this reading frame is important for survival in vertebrates. The picture below shows all predicted transcripts encoded in C7orf25. [13]