C9orf152 | |
---|---|
Identifiers | |
Aliases | C9orf152, bA470J20.2, chromosome 9 open reading frame 152 |
External IDs | MGI: 2442889; HomoloGene: 52276; GeneCards: C9orf152 |
Chromosome 9 open reading frame 152 is a protein that in humans is encoded by the C9orf152 gene. [5] [6] The exact function of the protein is not completely understood.
The human gene C9orf152 is located on the long (q) arm of chromosome 9. [7] Its cytogenetic location is 9q31.1. It has one known alias: bA470J20.2. [8]
The DNA sequence encoding C9orf152 contains a single intron. [7] The mature mRNA is 2698 nucleotides long. Nucleotides 66-68 encode an upstream in-frame stop codon. [5]
C9orf152 has orthologs in mammals, birds, reptiles and amphibians. No orthologs have been detected in bony fish or in any invertebrates. [7] [9] The following table lists a subset of conserved orthologs.
Scientific name | Common name | Accession number | Sequence length (aa) | Percent identity | Percent similarity |
---|---|---|---|---|---|
Homo sapiens | Human | NP_001013011.2 | 239 | - | - |
Pan troglodytes | Chimpanzee | XP_001145187 | 239 | 98 | 98 |
Tarsius syrichta | Philippine tarsier | XP_008064367 | 237 | 78 | 85 |
Ceratotherium simum simum | Rhinoceros | XP_004423784 | 239 | 78 | 82 |
Sus scrofa | Wild boar | XP_003122117 | 239 | 74 | 83 |
Equus caballus | Horse | XP_001491697 | 239 | 74 | 80 |
Tursiops truncatus | Bottlenose dolphin | XP_004329084 | 234 | 73 | 81 |
Heterocephalus glaber | Naked mole rat | XP_004903816 | 239 | 74 | 84 |
Orcinus orca | Killer whale | XP_004269444 | 231 | 72 | 79 |
Mus musculus | Mouse | NP_848842 | 236 | 62 | 72 |
Rattus norvegicus | Rat | XP_003754080 | 234 | 62 | 70 |
Chelonia mydas | Green sea turtle | XP_007059491 | 267 | 33 | 49 |
Nestor notabilis | Kea | XP_010009525 | 265 | 34 | 49 |
Python bivittatus | Burmese python | XP_007428415 | 234 | 30 | 44 |
Meleagris gallopavo | Wild turkey | XP_010710660 | 267 | 29 | 43 |
Pelodiscus sinensis | Chinese softshell turtle | XP_006120615 | 268 | 29 | 43 |
Haliaeetus albicilla | White-tailed eagle | XP_009911401 | 266 | 33 | 48 |
Xenopus tropicalis | Western clawed frog | XP_004915565 | 226 | 31 | 45 |
Differences among shown orthologs suggest a slow rate of evolution. [10]
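The source does not say which tool or denominator produced the percent identity column. As a minimal sketch of one common definition (identical residues divided by aligned columns in which neither sequence has a gap), the calculation could look like the following. The function name and the two toy sequences are hypothetical and are not real C9orf152 sequences; alignment tools differ in how they count gapped columns, so values from this sketch need not match the table.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes both inputs are rows of the same pairwise alignment, with '-'
    marking gaps. This is one of several definitions in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment for illustration only.
toy_reference = "MKTAYLQ-SG"
toy_ortholog = "MKSAYIQDSG"
print(round(percent_identity(toy_reference, toy_ortholog)))  # 7 of 9 columns match -> 78
```

Percent similarity is computed the same way, except that a column also counts when the two residues are scored as chemically similar by a substitution matrix.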
Chromosome 9 open reading frame 152 contains 239 amino acids and has a molecular weight of 26.3 kilodaltons. The protein is predicted to localize to the nucleus. [11] It likely has no transmembrane regions. [12] One isoform exists, containing 194 amino acids. [9] [13]
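The stated molecular weight is consistent with a quick rule-of-thumb check. The sketch below assumes an average residue mass of about 110 Da, a commonly used approximation that is not taken from the source; an exact value would require summing the masses of the actual residues. The function and constant names are hypothetical.

```python
# Rule-of-thumb assumption: an average amino acid residue weighs about 110 Da.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int, avg_residue_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Rough protein mass in kilodaltons from the residue count alone."""
    if n_residues <= 0:
        raise ValueError("residue count must be positive")
    return n_residues * avg_residue_da / 1000.0


# Full-length protein: 239 residues.
print(f"{estimate_mass_kda(239):.1f} kDa")  # 26.3 kDa
```

Applied to the 194-residue isoform, the same approximation gives roughly 21 kDa; the source does not report a mass for the isoform.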
Within the coding sequence, there are two sumoylation sites [14] [15] [16] and a single serine phosphorylation site. [17]
Three regions of the protein are predicted to form alpha helices. [18] [19]
C9orf152 is expressed in the bladder, intestine, mammary gland, and trachea and in smaller amounts in the lungs, liver, prostate, uterus, and brain. [20] Within the brain, expression of C9orf152 is limited to the olfactory bulb. [21] Gene expression was found to increase in the presence of stress, including disease and heat stress. [22]
A wide variety of transcription factors interact with the promoter of C9orf152, most notably two olfactory-related factors (a neuron-specific olfactory factor and an olfactory-associated zinc finger protein) and a negative glucocorticoid response element. [23]
C11orf49 is a protein-coding gene that in humans encodes the C11orf49 protein. It is heavily expressed in brain tissue and in peripheral blood mononuclear cells, an important component of the immune system. The C11orf49 protein is predicted to act as a kinase and has been shown to interact with HTT and APOE2.
Chromosome 11 open reading frame 16 (C11orf16) is a protein that in humans is encoded by the C11orf16 gene. The gene has 7 exons, and the protein is 467 amino acids long.
Uncharacterized protein C1orf21, also known as Proliferation-Inducing Protein 13, is a protein that in humans is encoded by the C1orf21 gene. C1orf21 is an intracellular protein that moves between the nucleus and the cytoplasm. It has been linked with cell growth and reproduction, and there are strong links with various types of cancer. The gene has no paralogs; however, many conserved orthologs have been found in all invertebrates. C1orf21 has a low to moderate level of expression in most human tissues, with the highest expression in the skin, lung and prostate.
C9orf64 is a gene located on chromosome 9 that in humans encodes the queuosine salvage protein. The function and biological process of the queuosine salvage protein are not well understood, but some evidence from orthologs indicates it may be involved in tRNA processing. The most common mRNA contains 4 coding exons, and there are 2 additional alternatively spliced exons. Five splice variants of C9orf64 have been found.
Chromosome 20 open reading frame 111, or C20orf111, is the hypothetical protein that in humans is encoded by the C20orf111 gene. C20orf111 is also known as Perit1, HSPC207, and dJ1183I21.1. It was originally located using genomic sequencing of chromosome 20. The National Center for Biotechnology Information (NCBI) shows that it is located at q13.11 on chromosome 20; however, the genome browser at the University of California-Santa Cruz (UCSC) website shows that it is at location q13.12, within a million base pairs of the adenosine deaminase locus. Its expression was also found to increase in cells undergoing hydrogen peroxide (H₂O₂)-induced apoptosis. Analysis of the amino acid content of C20orf111 showed it to be rich in serine residues.
Shortage In Chiasmata 1, also known as SHOC1, is a protein that in humans is encoded by the SHOC1 gene.
Uncharacterized protein C2orf73 is a protein that in humans is encoded by the C2orf73 gene. The protein is predicted to be localized to the nucleus.
Chromosome 8 open reading frame 58 is an uncharacterised protein that in humans is encoded by the C8orf58 gene. The protein is predicted to be localized in the nucleus.
Uncharacterized protein C17orf50 is a protein which in humans is encoded by the C17orf50 gene.
Chromosome 18 open reading frame 63 is a protein which in humans is encoded by the C18orf63 gene. This protein is not yet well understood by the scientific community. Research has been conducted suggesting that C18orf63 could be a potential biomarker for early stage pancreatic cancer and breast cancer.
Chromosome 3 open reading frame 67 or C3orf67 is a protein that in humans is encoded by the gene C3orf67. The function of C3orf67 is not yet fully understood.
Chromosome 9 open reading frame 43 is a protein that in humans is encoded by the C9orf43 gene. The gene is also known as MGC17358 and LOC257169. C9orf43 contains DUF 4647 and a polyglutamine repeat region although protein function is not well understood.
Chromosome 9 open reading frame 25 (C9orf25) is a domain that encodes the FAM219A gene. The terms FAM219A and C9orf25 are aliases and can be used interchangeably. The function of this gene is not yet completely understood.
Chromosome 19 open reading frame 44 is a protein that in humans is encoded by the C19orf44 gene. C19orf44 is an uncharacterized protein with an unknown function in humans. C19orf44 is not limited to humans: the protein also exists in other species. The protein contains one domain of unknown function (DUF) that is highly conserved throughout its orthologs. This protein is most highly expressed in the testis and ovary, but also has significant expression in the thyroid and parathyroid. The protein is also known as LOC84167.
Chromosome 4 open reading frame 51 (C4orf51) is a protein which in humans is encoded by the C4orf51 gene.
Chromosome 9 open reading frame 50 is a protein that in humans is encoded by the C9orf50 gene. C9orf50 has one other known alias, FLJ35803. In humans the gene coding sequence is 10,051 base pairs long, transcribing an mRNA of 1,624 bases that encodes a 431 amino acid protein.
C22orf23 is a protein which in humans is encoded by the C22orf23 gene. Its predicted secondary structure consists of alpha helices and disordered/coil regions. It is expressed in many tissues, with the highest expression in the testes, and it is conserved across many orthologs.
Uncharacterized protein C17orf78 is a protein encoded by the C17orf78 gene in humans. The name denotes the location of the parent gene, being at the 78th open reading frame, on the 17th human chromosome. The protein is highly expressed in the small intestine, especially the duodenum. The function of C17orf78 is not well defined.
C11orf98 is a protein-encoding gene of unknown function on chromosome 11 in humans. It is also known as c11orf48. The gene spans the chromosomal locus from position 62,662,817 to position 62,665,210, covering 2,394 base pairs of DNA. It has 4 exons and produces an mRNA that is 646 nucleotides long.
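The stated span of 2,394 base pairs follows from the two coordinates if both endpoints are counted, which is the usual convention for 1-based inclusive genomic coordinates (an assumption here, since the source does not name the coordinate system). A minimal check, with a hypothetical helper name:

```python
def inclusive_span(start: int, end: int) -> int:
    """Length of a locus when both the start and end positions are included."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text for C11orf98.
print(inclusive_span(62_662_817, 62_665_210))  # 2394
```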
C4orf19 is a protein which in humans is encoded by the C4orf19 gene.