| C5orf22 | |
|---|---|
| Identifiers | |
| Aliases | C5orf22, chromosome 5 open reading frame 22 |
| External IDs | MGI: 1925127; HomoloGene: 10149; GeneCards: C5orf22; OMA: C5orf22 - orthologs |
Chromosome 5 open reading frame 22 (C5orf22) is a protein-coding gene of poorly characterized function in Homo sapiens. [5] Its primary alias is uncharacterized protein family 0489 (UPF0489). [5]
C5orf22 is located on the positive strand of chromosome 5 at 5p13.3, spanning 22,779 nucleotides from base pair 31,532,275 to base pair 31,555,053. [6] The gene contains 9 exons and gives rise to 7 isoforms. [5] The isoforms differ in their exon configuration and untranslated regions. Transcript variant 1 is the canonical isoform, encoding 442 amino acids across 9 exons. [7]
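The stated span can be checked against the quoted coordinates. The sketch below assumes the coordinates are 1-based with both endpoints included, which the text does not state explicitly; the function name is illustrative.

```python
# Coordinates quoted in the text. Assumption: 1-based, both endpoints inclusive.
GENE_START = 31_532_275
GENE_END = 31_555_053


def inclusive_span(start: int, end: int) -> int:
    """Number of bases covered when both endpoints are counted."""
    return end - start + 1


span = inclusive_span(GENE_START, GENE_END)
print(span)  # 22779
```

Under that assumption, the result matches the 22,779 nucleotides given above.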
C5orf22 displays ubiquitous RNA expression across tissue types from all 3 germ layers and across all phases of development in humans, mice, chickens, and zebrafish. [5] RNA expression differs significantly between select tissues, with skeletal muscle showing the greatest abundance (7.8 RPKM). [5] [9]
C5orf22 has 1 predicted promoter directly upstream of the gene (GXP_55076). [8] This promoter is 1,081 base pairs long and partially overlaps the 5’ untranslated region. [8] GXP_55076 is assigned to all transcript variants. [8] Predicted transcription factor binding elements include TATA box binding elements, SMAD transcription factors, MAF/AP1 binding factors, and several others. [8]
The closest neighboring element to C5orf22 is Drosha, which encodes a ribonuclease and lies on the minus strand proximal to C5orf22. [5] [10] Drosha is a double-stranded RNA-specific endoribonuclease that assists with the first step of microRNA biogenesis. [11]
C5orf22 contains 2 globular domains and 3 small disordered regions. [12] Its molecular weight is approximately 50 kDa. [13] Its isoelectric point is 4.7. [13] The amino acid composition of C5orf22 is close to that of most proteins, with no significant outliers in the abundance of individual amino acids. [14] C5orf22 contains several predicted post-translational modification sites and motifs, including phosphorylation sites, ubiquitination sites, glycosylation sites, an SH2 domain, and a myristoylation site. [12]
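As a rough plausibility check on the molecular weight, the 442-residue length of the canonical isoform can be multiplied by an average residue mass. The value of about 110 Da per residue is a common rule of thumb, not a figure from the cited sources, and the function name is illustrative.

```python
# Assumption: ~110 Da average mass per amino acid residue (rule of thumb).
AVG_RESIDUE_MASS_DA = 110.0
CANONICAL_LENGTH_AA = 442  # canonical isoform length given in the article


def estimate_mass_kda(n_residues: int, avg_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Back-of-the-envelope protein mass in kDa from residue count."""
    return n_residues * avg_mass_da / 1000.0


estimate = estimate_mass_kda(CANONICAL_LENGTH_AA)
print(f"{estimate:.1f} kDa")  # 48.6 kDa
```

The estimate falls close to the reported value of roughly 50 kDa; an exact figure would depend on the actual sequence composition.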
C5orf22 is most likely a soluble protein located in the cytoplasm and nucleus. [15] Both amino acid sequence predictions and immunohistochemical staining support this localization. [9] [16] Furthermore, amino acid sequence analysis predicts a partial nuclear localization signal (NLS) spanning amino acids 175-185. [17]
The precise function of C5orf22 remains unknown; however, it is hypothesized to be a component of an RNA splicing complex. [18] Proteomic research implicated the protein product as a novel component of the WBP11/PQBP1 splicing complex, which regulates expression of genes involved in a spectrum of processes ranging from DNA repair to immunomodulation. [18] C5orf22 knockdown was associated with downregulation of alternative splicing events, which led to aberrant expression of select genes and ultimately to cell cycle dysfunction. [18] Cell localization evidence and the presence of an NLS further support this hypothesized function.
Experimental evidence indicates more than 20 interactors of C5orf22. [19] [20] [21] These interactors localize to both the nucleus and cytoplasm. [22] The most likely interactors are WBP11, OSM, SURF2, ELOF1, and DDIT4L. [20]
C5orf22 first appeared in invertebrates approximately 797 million years ago. [23] It is the only member of its gene family. Human UPF0489 C5orf22 is conserved through invertebrates. [23] C5orf22 orthologs show conservation of both globular domains through bony fish and conservation of 1 globular domain in arthropods. [12] Through bony fish, the isoelectric points and molecular weights of C5orf22 orthologs are within ± 0.15 and ± 3 kDa of the human values. [12] There are no paralogs of C5orf22 in humans. [23]
UPF0489 C5orf22 is a slowly evolving protein, based on comparisons of the percent corrected divergence of orthologous proteins. [24]
| Taxonomic class | Common name | Genus species | Date of divergence (MYA) | Sequence identity (%) | Sequence similarity (%) | Sequence length (AA) | Query coverage (%) | Accession number |
|---|---|---|---|---|---|---|---|---|
| Mammal | Human | Homo sapiens | N/A | 100 | 100 | 442 | 100 | NP_060826.2 |
| Mammal | Mouse | Mus musculus | 90 | 78 | 86 | 442 | 100 | NP_084274.1 |
| Mammal | Whale | Balaenoptera musculus | 96 | 89 | 94 | 467 | 100 | XP_036705025.1 |
| Aves | Chicken | Gallus gallus | 312 | 68 | 79 | 446 | 98 | XP_418996.3 |
| Reptile | Tiger rattlesnake | Crotalus tigris | 312 | 65 | 75 | 476 | 98 | XP_039212189.1 |
| Amphibian | African clawed frog | Xenopus laevis | 352 | 67 | 78 | 459 | 95 | XP_018121838.1 |
| Fish | Zebrafish | Danio rerio | 435 | 57 | 71 | 439 | 95 | NP_956625.1 |
| Fish | Sea lamprey | Petromyzon marinus | 615 | 51 | 69 | 589 | 89 | XP_032827184.1 |
| Invertebrate | Fruit fly | Drosophila suzukii | 797 | 33 | 50 | 481 | 95 | XP_036671373.1 |
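The corrected divergence mentioned above can be illustrated from the identity values in the table. The source does not state which correction was used, so the sketch below assumes the common Poisson-style correction m = -ln(q) × 100, where q is the fractional sequence identity; the function and variable names are illustrative.

```python
import math

# (common name, divergence date in MYA, sequence identity in %) from the table above
ORTHOLOGS = [
    ("Mouse", 90, 78),
    ("Whale", 96, 89),
    ("Chicken", 312, 68),
    ("Tiger rattlesnake", 312, 65),
    ("African clawed frog", 352, 67),
    ("Zebrafish", 435, 57),
    ("Sea lamprey", 615, 51),
    ("Fruit fly", 797, 33),
]


def corrected_divergence(identity_pct: float) -> float:
    """Assumed correction: m = -ln(q) * 100, with q the fractional identity."""
    q = identity_pct / 100.0
    return -math.log(q) * 100.0


for name, mya, identity in ORTHOLOGS:
    m = corrected_divergence(identity)
    print(f"{name:20s} {mya:4d} MYA  identity {identity:3d}%  corrected divergence {m:6.1f}")
```

Plotting the corrected divergence against the divergence date, alongside reference proteins of known evolutionary rate, is one way such a comparison is typically made; the reference proteins used in the cited analysis are not given here.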
Recent studies on the role of miRNA in breast cancer pathogenesis have correlated upregulation of C5orf22 with reduced survival of breast cancer patients. [26]
Patients with tibial muscular dystrophy exhibit decreased expression of C5orf22. [27] Patients with non-ischemic cardiomyopathy exhibit increased expression of C5orf22.