C11orf98 | |
---|---|
Aliases | C11orf98, C11orf48, chromosome 11 open reading frame 98 |
External IDs | MGI: 1913526; HomoloGene: 84408; GeneCards: C11orf98 |
C11orf98 is a protein-encoding gene of unknown function located on chromosome 11 in humans. It is also known as C11orf48. [5] The gene spans the chromosomal locus from 62,662,817 to 62,665,210 [6] and contains 4 exons. It covers 2,394 base pairs of DNA [7] and produces an mRNA that is 646 nucleotides long. [8]
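The 2,394 base pair figure follows from the locus coordinates when both end positions are counted. A minimal Python sketch of that arithmetic, assuming the coordinates are 1-based and inclusive (the function name `locus_span` is illustrative and does not come from any cited tool):

```python
def locus_span(start: int, end: int) -> int:
    """Length in base pairs of a locus given inclusive, 1-based coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Both endpoints are part of the locus, hence the +1.
    return end - start + 1


# Coordinates quoted in the text for C11orf98 on chromosome 11.
span_bp = locus_span(62_662_817, 62_665_210)
print(span_bp)  # 2394
```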
This gene is expressed at a very high level, 4.4 times that of the average gene. [9] The C11orf98 protein is expressed in a wide array of tissues. RNA-seq data showed the highest expression in the appendix, lymph node, and thymus. [10]
An analysis with PSORT II predicted that the C11orf98 gene product is localized to the nucleus, with 82.6% reliability. This nuclear localization suggests that the C11orf98 protein may have a function related to the expression and regulation of genes in the nucleus.
Several transcription factors are predicted to regulate the expression of the C11orf98 gene. These predictions are based on DNA sequences found in the gene and were made using Genomatix, which also provided the names and descriptions. [11]
# | Name | TF Description |
1 | V$ZIC3.03 | Zinc finger protein of the cerebellum (Zic3) |
2 | V$SPZ1.01 | Spermatogenic Zip 1 transcription factor |
3 | V$YB1.01 | Y box binding protein 1, has a preference for binding ssDNA |
4 | V$PAX5 | PAX5 paired domain protein |
5 | V$TCF21.02 | Transcription factor 21 |
6 | V$GFI1B.02 | Growth factor independence 1 zinc finger protein Gfi-1B |
7 | V$SPI1.02 | SPI-1 proto-oncogene; hematopoietic transcription factor PU.1 |
8 | V$ZKSCAN3.01 | Zinc finger with KRAB and SCAN domains 3 |
9 | V$EGR2.03 | Early growth response 2 |
10 | V$ZNF300.01 | KRAB-containing zinc finger protein 300 |
11 | V$AP4.03 | Activating enhancer binding protein 4 |
12 | V$AML2.01 | RUNX3 (Runt-related transcription factor 3), AML2 (Acute myeloid leukemia 2) |
13 | V$WHN.01 | Winged helix protein, involved in hair keratinization and thymus epithelium differentiation |
14 | O$DINR.01 | Drosophila initiator motifs |
15 | V$ZTRE.03 | 5' half site of ZTRE motif |
16 | V$ZNF35.01 | Human zinc finger protein ZNF35 |
17 | V$SP1.02 | Stimulating protein 1, ubiquitous zinc finger transcription factor |
18 | V$DMP1.02 | Cyclin D binding myb-like transcription factor |
19 | V$ETV1.02 | Ets variant 1 |
20 | V$WT1.02 | Wilms Tumor Suppressor |
The C11orf98 gene encodes a protein that is 123 amino acids long. [12] The predicted molecular weight of the protein is 14.2 kDa. [13] The basal isoelectric point was determined to be 11.53. [14] The protein's subcellular localization was predicted to be in the nucleus. [15] [16]
The C11orf98 protein contains a domain of unknown function (DUF5564) that spans amino acids 1-98. There are also 2 disordered regions within the protein, spanning amino acids 1-21 and 32-123. [17] C11orf98 contains 4 bipartite nuclear localization signals (NLS_BP), which indicates that the protein is 'tagged' for import into the cell nucleus by nuclear transport. NLS_BP sequences are typically rich in positively charged residues such as arginine, which is consistent with the protein's arginine-rich region (ARG_RICH). [18]
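To illustrate what a bipartite signal looks like at the sequence level, the sketch below scans a sequence for a simplified consensus that is often cited for such signals: two adjacent basic residues (K or R), a spacer of ten residues, then at least three basic residues among the next five. This is an assumption made for illustration, not the profile behind the prediction cited above, and the toy sequence is hypothetical rather than the C11orf98 sequence.

```python
BASIC = set("KR")  # lysine and arginine, the positively charged residues scanned for


def find_bipartite_nls(seq: str, spacer: int = 10, tail: int = 5, min_basic: int = 3):
    """Return 0-based start positions matching the simplified bipartite motif.

    Motif (an assumed simplification): two adjacent basic residues, `spacer`
    residues of any kind, then at least `min_basic` basic residues among the
    following `tail` positions.
    """
    seq = seq.upper()
    window = 2 + spacer + tail
    hits = []
    for i in range(len(seq) - window + 1):
        head_ok = seq[i] in BASIC and seq[i + 1] in BASIC
        tail_region = seq[i + 2 + spacer : i + window]
        if head_ok and sum(aa in BASIC for aa in tail_region) >= min_basic:
            hits.append(i)
    return hits


# Hypothetical toy sequence for demonstration only.
toy = "MAKRPAATSGAGQAKKRKLDE"
print(find_bipartite_nls(toy))  # [2]
```

A simple scan like this tends to over-report matches in arginine-rich proteins, which is one reason dedicated tools use scored profiles instead.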
The secondary structure of the C11orf98 protein was predicted to contain multiple alpha helices as well as beta sheets. [19] The tertiary structure was predicted using AlphaFold. [20]
The C11orf98 protein is predicted to undergo several modifications following translation. An amidation site was predicted, which functions as an active peptide precursor cleavage site. Several phosphorylation sites were also predicted: a cAMP- and cGMP-dependent protein kinase phosphorylation site, a casein kinase II phosphorylation site, and a protein kinase C phosphorylation site. An N-myristoylation site was predicted as well. Phosphorylation, the addition of a phosphoryl group to the site, is significant because it can occur only in the nucleus or the cytosol. Myristoylation, the addition of a myristoyl (fatty acid) group to the site, is significant because it helps anchor a transmembrane or cytosolic protein to the membrane. [21] [22] [16] Twelve O-β-GlcNAc glycosylation sites were predicted. This is significant because this modification is found exclusively on nuclear and cytoplasmic proteins rather than on membrane and secretory proteins. [16] One sumoylation site was predicted. Sumoylation is a post-translational modification involved in nuclear-cytosolic transport, transcriptional regulation, apoptosis, protein stability, response to stress, and progression through the cell cycle. [23]
Abbreviated Name | Name | Basis of ID | Score | Description |
JUN | C-jun | Proximity-dependent biotin identification | Various | c-Jun, in combination with c-Fos, forms the AP-1 early response transcription factor |
FBL | Fibrillarin | Proximity-dependent biotin identification | Various | component of a nucleolar small nuclear ribonucleoprotein (snRNP) particle thought to participate in the first step in processing pre-ribosomal (r)RNA |
ESR1 | Estrogen receptor 1 | Tandem affinity purification | 0.35 | activated by the sex hormone estrogen, is a transcription factor composed of several domains important for hormone binding, DNA binding, and activation of transcription |
SCARB2 | Scavenger Receptor Class B Member 2 | Pull Down | 0.35 | protein is primarily found in the membrane of cellular structures called lysosomes, which are specialized compartments that digest and recycle materials |
OAS3 | 2'-5'-Oligoadenylate Synthetase 3 | Pull Down | 0.35 | This enzyme is induced by interferons and catalyzes the synthesis of 2',5' oligomers of ATP |
The C11orf98 gene has 148 orthologs. [25] The oldest orthologs appeared in invertebrates. Other orthologs were found in mammals, birds, reptiles, amphibians, and fish. [26]
Seq # | Group | Genus, Species | Common Name | Taxonomic Group | Divergence Date (Million Years Ago) [27] | Accession Number [28] | Query Cover (%) | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%) |
0 | MAMMALIA | Homo sapiens | human | Primates | 0 | NP_001273015 | 100 | 123 | 100 | 100 |
1 | MAMMALIA | Pan paniscus | bonobo (pygmy chimpanzee) | Primates | 6.7 | XP_008952146.1 | 100 | 123 | 99.2 | 100 |
2 | MAMMALIA | Mus musculus | house mouse | Rodentia | 90 | NP_079739.1 | 100 | 123 | 82.93 | 91.1 |
3 | AVES | Dromaius novaehollandiae | emu | Aves | 312 | XP_025975290 | 96 | 200 | 35.6 | 43.4 |
4 | AVES | Apteryx mantelli | North Island brown kiwi | Aves | 312 | XP_013806542 | 90 | 155 | 48.4 | 61.9 |
5 | REPTILIA | Chelydra serpentina | common snapping turtle | Reptilia | 312 | KAG6938024.1 | 97 | 127 | 62.2 | 74 |
6 | AMPHIBIAN | Bufo bufo | common toad | Amphibian | 351.8 | XP_040265882.1 | 98 | 135 | 51.5 | 75 |
7 | AMPHIBIAN | Ranitomeya imitator | mimic poison frog | Amphibian | 351.8 | CAF5124592.1 | 59 | 97 | 51.2 | 56.6 |
8 | FISH | Danio rerio | zebrafish | Actinopterygii (bony fish) | 435 | XP_009298201.1 | 65 | 177 | 33.7 | 43.7 |
9 | FISH | Perca flavescens | yellow perch | Actinopterygii (bony fish) | 435 | XP_028427042.1 | 97 | 129 | 58.3 | 71.2 |
10 | FISH | Rhincodon typus | whale shark | Chondrichthyes | 473 | XP_020392632.1 | 95 | 125 | 59.1 | 73.2 |
11 | FISH | Carcharodon carcharias | great white shark | Chondrichthyes | 473 | XP_041069108.1 | 95 | 130 | 57.6 | 74.2 |
12 | FISH | Callorhinchus milii | elephant shark | Chondrichthyes | 473 | XP_007910732.1 | 61 | 134 | 47.1 | 63.6 |
13 | INVERTEBRATES | Styela clava | stalked sea squirt | Chordata | 676 | XP_039269774.1 | 91 | 125 | 36.8 | 51.9 |
14 | INVERTEBRATES | Branchiostoma belcheri | Belcher's lancelet | Chordata | 684 | XP_019641031.1 | 93 | 127 | 39.9 | 55.8 |
15 | INVERTEBRATES | Priapulus caudatus | penis worm | Priapulimorphida | 797 | XP_014677581.1 | 95 | 157 | 28.7 | 40.7 |
16 | INVERTEBRATES | Owenia fusiformis | tubeworm | Polychaeta | 797 | CAC9620481.1 | 92 | 148 | 34.2 | 50 |
17 | INVERTEBRATES | Lingula anatina | lingula | Brachiopoda | 797 | XP_013399665.1 | 69 | 141 | 35.2 | 55.6 |
18 | INVERTEBRATES | Exaiptasia diaphana | sea anemone | Anthozoa | 824 | XP_020906605.1 | 78 | 110 | 30.6 | 52.4 |
19 | INVERTEBRATES | Actinia tenebrosa | Waratah anemone | Anthozoa | 824 | XP_031558418.1 | 70 | 113 | 36.2 | 57.5 |
20 | INVERTEBRATES | Nematostella vectensis | starlet sea anemone | Anthozoa | 824 | XP_001639221.1 | 80 | 112 | 34.9 | 54 |
The relative evolution rate of C11orf98 is slower than that of fibrinogen alpha but faster than that of cytochrome c. [29] This is shown on the graph on the right.
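Rate comparisons of this kind are often made by plotting a corrected sequence divergence against divergence date. The source does not state which correction was used, so the sketch below assumes the Poisson correction, d = -ln(p), where p is the fraction of identical residues. The species, dates, and identity values are taken from the ortholog table; the function name is illustrative.

```python
import math


def poisson_corrected_divergence(identity_percent: float) -> float:
    """Poisson-corrected divergence d = -ln(p), with p the identity fraction.

    Assumed correction for illustration; the source does not name its method.
    """
    if not 0 < identity_percent <= 100:
        raise ValueError("identity must be in the interval (0, 100]")
    return -math.log(identity_percent / 100.0)


# (species, divergence date in million years ago, sequence identity in %)
orthologs = [
    ("Mus musculus", 90, 82.93),
    ("Bufo bufo", 351.8, 51.5),
    ("Rhincodon typus", 473, 59.1),
    ("Nematostella vectensis", 824, 34.9),
]

for species, mya, identity in orthologs:
    d = poisson_corrected_divergence(identity)
    print(f"{species}: {mya} MYA, corrected divergence = {d:.2f}")
```

Plotting these values for several proteins on the same axes gives a slope per protein, and a steeper slope indicates a faster-evolving protein.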
On the right is a phylogenetic tree displaying the evolutionary history of the gene.
Currently, the C11orf98 gene is not associated with any disease or medical condition.