Uncharacterized protein C17orf78 is a protein encoded by the C17orf78 gene in humans. [1] The name denotes the location of the parent gene: the 78th open reading frame on human chromosome 17. The protein is highly expressed in the small intestine, especially the duodenum. [2] The function of C17orf78 is not well defined.
C17orf78 | |
---|---|
Identifiers | |
Aliases | C17orf78, chromosome 17 open reading frame 78 |
External IDs | MGI: 3650287; HomoloGene: 82346; GeneCards: C17orf78; OMA: C17orf78 - orthologs |
C17orf78 (Chromosome 17 Open Reading Frame 78) is found on the long arm cytogenetic band 17q12. [7] The genomic sequence spans from base pair position 37,375,985 to 37,392,708 on the forward strand, and constitutes a length of 16,723 base pairs. [8] The neighboring genes include TADA2A, DUSP14, and ACACA. [9]
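The stated length can be checked directly from the two coordinates. The following is a minimal sketch, assuming the positions are 1-based coordinates on chromosome 17; the function name `span_length` is illustrative and does not come from any of the cited tools.

```python
def span_length(start: int, end: int, inclusive: bool = False) -> int:
    """Length of a genomic interval given its start and end positions."""
    if end < start:
        raise ValueError("end must not precede start")
    # The plain difference excludes one endpoint; add 1 to count both.
    return end - start + (1 if inclusive else 0)


# Coordinates quoted in the text above.
GENE_START = 37_375_985
GENE_END = 37_392_708

print(span_length(GENE_START, GENE_END))                  # 16723
print(span_length(GENE_START, GENE_END, inclusive=True))  # 16724
```

The figure quoted above corresponds to the plain difference between the two positions; counting both end positions as part of the gene gives one base pair more.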
C17orf78 contains 7 exons [1] and 6 introns. [10] [7]
C17orf78 has two splice variant isoforms. [11] Isoform 1 is encoded by an mRNA sequence that is 1920 base pairs in length. [12] Isoform 2 derives from an mRNA sequence of 1678 base pairs. [13]
The C17orf78 protein has a predicted molecular weight of 30.55 kDa and a predicted isoelectric point of 9.62. [14]
Uncharacterized protein C17orf78 isoform 1 (C17orf78-204) is 275 amino acids long and includes all 7 exons. [15] [16] C17orf78 isoform 1 is the principal protein.
Uncharacterized protein C17orf78 isoform 2 (C17orf78-203) is 159 amino acids long and is built from 5 exons: the 1st, 2nd, 3rd, 6th, and 7th exons of the principal protein. [17] [16]
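The exon composition of the two isoforms can be compared as sets. This is a small illustrative sketch: the dictionary and the helper `skipped_exons` are hypothetical names, and the exon numbers are taken only from the descriptions above.

```python
from typing import List, Set

# Exon usage as described in the text; exon numbering follows isoform 1.
ISOFORM_EXONS = {
    "isoform 1 (C17orf78-204)": {1, 2, 3, 4, 5, 6, 7},
    "isoform 2 (C17orf78-203)": {1, 2, 3, 6, 7},
}


def skipped_exons(reference: Set[int], variant: Set[int]) -> List[int]:
    """Exons present in the reference isoform but absent from the variant."""
    return sorted(reference - variant)


ref = ISOFORM_EXONS["isoform 1 (C17orf78-204)"]
var = ISOFORM_EXONS["isoform 2 (C17orf78-203)"]
print(skipped_exons(ref, var))  # [4, 5]
```

On this description, isoform 2 differs from the principal protein by omitting exons 4 and 5.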
C17orf78 is highly expressed in the human small intestine, particularly the duodenum, [2] [18] and has been detected at low levels in the testes and other tissues. [19] During fetal development, expression decreases over time in all tissues except the intestines, which show increasing expression. [20] [2]
Predictive analysis of C17orf78 by Psort2 [21] places the primary location in the nucleus because of a nuclear localization signal. C17orf78 is also potentially a transmembrane protein due to the presence of a transmembrane region. [22] [12]
C17orf78 secondary structure has been predicted to have several alpha helices and strands as well as beta sheets. [23] [24]
The Genomatix [25] tool Gene2Promoter found one viable promoter region, spanning base pairs 37,374,332 to 37,376,025.
The mRNA secondary structure of C17orf78, as predicted by the online tool RNAfold, [26] shows a moderate tendency to form stem-loop (hairpin) structures.
Phosphorylation is predicted to occur at a number of sites on C17orf78. [27] PKC and CK2 phosphorylation sites are predicted at several positions on C17orf78 with high confidence. [28]
N-linked glycosylation is predicted to occur at three locations on C17orf78. [28] Asparagine-linked glycosylation was also predicted with high confidence on C17orf78 orthologs.
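Candidate N-linked glycosylation sites are commonly located by searching for the sequon N-X-S/T, where X is any residue except proline. The sketch below illustrates that general idea only; it is not the method of the cited prediction tool, which may apply a stricter pattern or additional scoring. The peptide is a hypothetical toy sequence, not the C17orf78 sequence, and `find_sequons` is an illustrative name.

```python
import re

# N-X-[S/T] sequon, with X being any residue except proline.
# The lookahead allows overlapping candidates to be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str):
    """Return (1-based position, motif) for each candidate sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide, NOT the C17orf78 sequence.
toy_peptide = "MNASKLNPTQNGTWNKS"
print(find_sequons(toy_peptide))
# [(2, 'NAS'), (11, 'NGT'), (15, 'NKS')]
```

In the toy peptide the asparagine at position 7 is skipped because it is followed by proline. A sequence match marks only a candidate site; whether a site is actually glycosylated depends on further factors that a pattern search does not capture.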
Myristoylation has been predicted to occur on C17orf78 by the ExPASy tool Motif Scan. [28]
C17orf78 orthologs have been identified in mammals, birds, and reptiles. [1] It is a rapidly evolving gene, with around 40 base pairs mutating every 100 million years. [1] There are no known paralogs of this gene in humans. [29]
Scientific Name | Common Name | Taxonomic Group | Date of Divergence (MYA) | Accession Number | Protein Length (aa) | Identity (%) | Similarity (%) |
---|---|---|---|---|---|---|---|
Homo sapiens | Human | Primates | 0 | NP_775896.3 | 275 | 100 | 100 |
Mus musculus | Mouse | Rodentia | 89 | NP_001033021.2 | 290 | 59.7 | 72.0 |
Sus scrofa | Wild Boar | Artiodactyla | 94 | XP_013845397.2 | 287 | 69.7 | 80.8 |
Ovis aries | Sheep | Artiodactyla | 94 | XP_027831111.1 | 280 | 68.3 | 79.4 |
Eptesicus fuscus | Big Brown Bat | Chiroptera | 94 | XP_028013352.1 | 287 | 66.7 | 74.7 |
Mustela erminea | Stoat | Carnivora | 94 | XP_032175349.1 | 300 | 58.8 | 67.7 |
Orycteropus afer afer | Aardvark | Tubulidentata | 102 | XP_007946081.1 | 286 | 71.2 | 80.9 |
Loxodonta africana | African Bush Elephant | Proboscidea | 102 | XP_003414613.1 | 282 | 67.6 | 77.8 |
Elephantulus edwardii | Cape Elephant Shrew | Macroscelidea | 102 | XP_006881725.1 | 312 | 57.6 | 67.5 |
Phascolarctos cinereus | Koala | Diprotodontia | 160 | XP_020847101.1 | 286 | 53.3 | 66.2 |
Vombatus ursinus | Common Wombat | Diprotodontia | 160 | XP_027726443.1 | 245 | 45.6 | 59.8 |
Trachemys scripta elegans | Red-Eared Slider Turtle | Testudines | 318 | XP_034608235.1 | 257 | 31.6 | 46.2 |
Apteryx rowi | Okarito Kiwi | Apterygiformes | 318 | XP_025937346.1 | 270 | 31.3 | 50.2 |
Chelonia mydas | Green Sea Turtle | Testudines | 318 | XP_027675993.1 | 270 | 28.9 | 39.7 |
Terrapene carolina triunguis | Three-Toed Box Turtle | Testudines | 318 | XP_029768387.1 | 299 | 28.4 | 41.0 |
Pelodiscus sinensis | Chinese Softshell Turtle | Testudines | 318 | XP_025035086.1 | 258 | 28.2 | 40.8 |
Melopsittacus undulatus | Budgerigar | Psittaciformes | 318 | XP_012985804.2 | 255 | 26.7 | 41.1 |
Oxyura jamaicensis | Ruddy Duck | Anseriformes | 318 | XP_035199171.1 | 270 | 25.8 | 41.2 |
Chelonoidis abingdonii | Pinta Island Tortoise | Testudines | 318 | XP_032635246.1 | 262 | 24.2 | 36.3 |
Anas platyrhynchos | Mallard | Anseriformes | 318 | XP_021123240.1 | 232 | 23.3 | 34.3 |
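The relationship between divergence date and sequence identity in the table can be summarized by averaging identity within each divergence date. The sketch below is illustrative only: the values are copied from the table above, and the function name `mean_identity_by_divergence` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (common name, divergence from human in MYA, identity to human protein in %)
# Values copied from the ortholog table above.
ORTHOLOGS = [
    ("Human", 0, 100.0),
    ("Mouse", 89, 59.7),
    ("Wild Boar", 94, 69.7),
    ("Sheep", 94, 68.3),
    ("Big Brown Bat", 94, 66.7),
    ("Stoat", 94, 58.8),
    ("Aardvark", 102, 71.2),
    ("African Bush Elephant", 102, 67.6),
    ("Cape Elephant Shrew", 102, 57.6),
    ("Koala", 160, 53.3),
    ("Common Wombat", 160, 45.6),
    ("Red-Eared Slider Turtle", 318, 31.6),
    ("Okarito Kiwi", 318, 31.3),
    ("Green Sea Turtle", 318, 28.9),
    ("Three-Toed Box Turtle", 318, 28.4),
    ("Chinese Softshell Turtle", 318, 28.2),
    ("Budgerigar", 318, 26.7),
    ("Ruddy Duck", 318, 25.8),
    ("Pinta Island Tortoise", 318, 24.2),
    ("Mallard", 318, 23.3),
]


def mean_identity_by_divergence(rows):
    """Average percent identity for each divergence date, oldest last."""
    groups = defaultdict(list)
    for _name, mya, identity in rows:
        groups[mya].append(identity)
    return {mya: mean(values) for mya, values in sorted(groups.items())}


for mya, avg in mean_identity_by_divergence(ORTHOLOGS).items():
    print(f"{mya:>3} MYA: mean identity {avg:.1f}%")
```

On these figures the mean identity is roughly 65 to 66% for the placental orders that diverged 94 to 102 MYA, about 49% for the two marsupials, and about 28% for the birds and turtles. The single mouse entry (59.7% at 89 MYA) falls below the other placental groups, so the trend is broad rather than strictly monotonic.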
Uncharacterized protein C2orf73 is a protein that in humans is encoded by the C2orf73 gene. The protein is predicted to be localized to the nucleus.
Chromosome 6 open reading frame 62 (C6orf62), also known as X-trans-activated protein 12 (XTP12), is a gene that encodes a protein of the same name. The encoded protein is predicted to have a subcellular location within the cytosol.
Chromosome 16 open reading frame 46 is a protein of yet-to-be-determined function in Homo sapiens. It is encoded by the C16orf46 gene, with NCBI accession number NM_001100873. It is a protein-coding gene with an overlapping locus.
C15orf39 is a protein that in humans is encoded by the chromosome 15 open reading frame 39 (C15orf39) gene.
TMEM44 is a protein that in humans is encoded by the TMEM44 gene. DKFZp686O18124 is a synonym of TMEM44.
Chromosome 19 open reading frame 44 is a protein that in humans is encoded by the C19orf44 gene. C19orf44 is an uncharacterized protein with an unknown function in humans. C19orf44 is not limited to humans: the protein also exists in other species. The protein contains one domain of unknown function (DUF) that is highly conserved throughout its orthologs. This protein is most highly expressed in the testis and ovary, but also has significant expression in the thyroid and parathyroid. Another name for this protein is LOC84167.
Uncharacterized protein C16orf86 is a protein in humans that is encoded by the C16orf86 gene. It is mostly made of alpha helices and is expressed in the testes, but also in other tissues such as the kidney, colon, brain, fat, spleen, and liver. The function of C16orf86 is not well understood; however, DNA microarray data suggest that it could be a transcription factor in the nucleus that regulates G0/G1 in the cell cycle in tissues such as the kidney, brain, and skeletal muscle.
Cilia- and flagella-associated protein 299 (CFAP299) is a protein that in humans is encoded by the CFAP299 gene. CFAP299 is predicted to play a role in spermatogenesis and cell apoptosis.
C11orf42 is an uncharacterized protein in Homo sapiens that is encoded by the C11orf42 gene. It is also known as chromosome 11 open reading frame 42 and uncharacterized protein C11orf42, with no other aliases. The gene is mostly conserved in mammals, but it has also been found in rodents, reptiles, fish and worms.
Chromosome 9 open reading frame 50 is a protein that in humans is encoded by the C9orf50 gene. C9orf50 has one other known alias, FLJ35803. In humans the gene coding sequence is 10,051 base pairs long, transcribing an mRNA of 1,624 bases that encodes a 431 amino acid protein.
Chromosome 1 open reading frame 167 (C1orf167) is a protein which in humans is encoded by the C1orf167 gene. The NCBI accession number is NP_001010881. The protein is 1468 amino acids in length with a molecular weight of 162.42 kDa. The mRNA sequence was found to be 4689 base pairs in length.
C16orf90, or chromosome 16 open reading frame 90, produces uncharacterized protein C16orf90 in Homo sapiens. The C16orf90 protein has four predicted alpha-helix domains; it is mildly expressed in the testes and expressed at low levels throughout the rest of the body. While the function of C16orf90 is not yet well understood, it is suspected to be involved in the biological stress response and apoptosis, based on expression data from microarrays and post-translational modification data.
C12orf24 is a gene in humans that encodes a protein known as FAM216A. This gene is primarily expressed in the testis and brain, but has constitutive expression in 25 other tissues. FAM216A is an intracellular protein that has been predicted to reside within the nucleus of cells. The exact function of C12orf24 is unknown. FAM216A is highly expressed in Sertoli cells of the testis as well as in spermatids at different stages.
Leucine rich single-pass membrane protein 2 is a single-pass membrane protein rich in leucine, that in humans is encoded by the LSMEM2 gene. The LSMEM2 protein is conserved in mammals, birds, and reptiles. In humans, LSMEM2 is found to be highly expressed in the heart, skeletal muscle and tongue.
TMEM275 is a protein that in humans is encoded by the TMEM275 gene. TMEM275 has two highly conserved helical transmembrane regions. It is predicted to reside within the plasma membrane or the membrane of the endoplasmic reticulum.
Chromosome 9 open reading frame 85, commonly known as C9orf85, is a protein in Homo sapiens encoded by the C9orf85 gene. The gene is located at 9q21.13. When spliced, four different isoforms are formed. C9orf85 has a predicted molecular weight of 20.17 kDa and an isoelectric point of 9.54. The function of the gene has not yet been confirmed; however, it has been found to show high levels of expression in highly differentiated cells.
FAM214B, also known as protein family with sequence similarity 214, B, is a protein that in humans is encoded by the FAM214B gene, located on chromosome 9. The protein has 538 amino acids, and the gene contains 9 exons. Studies have reported low expression of this gene in patients with major depressive disorder. In most organisms, such as mammals, amphibians, reptiles, and birds, the gene is highly expressed in the bone marrow and blood. During human fetal development, FAM214B is mostly expressed in the brain and bone marrow.
FAM120AOS, or family with sequence similarity 120A opposite strand, codes for uncharacterized protein FAM120AOS, which currently has no known function. Its gene ontology annotation is protein binding. Across a majority of datasets, the thyroid and the placenta are the two tissues with the highest expression levels of FAM120AOS.
C11orf98 is a protein-encoding gene on chromosome 11 in humans of unknown function. It is otherwise known as c11orf48. The gene spans the chromosomal locus from 62,662,817-62,665,210. There are 4 exons. It spans across 2,394 base pairs of DNA and produces an mRNA that is 646 base pairs long.
C4orf36 is a protein that in humans is encoded by the C4orf36 gene.