C3orf38 identifiers | Value |
---|---|
Aliases | C3orf38, chromosome 3 open reading frame 38 |
External IDs | MGI: 1914859, HomoloGene: 27867, GeneCards: C3orf38 |
Chromosome 3 open reading frame 38 (C3orf38) is a protein which in humans is encoded by the C3orf38 gene.
The C3orf38 gene is located on the forward strand of chromosome 3 at 3p11.1. [5] It spans 18,771 bases (chr3:88,149,959-88,168,729). [5] It contains 3 exons. [6] Common aliases for this gene are MGC26717, LOC285237, and FLJ54270. [7] Genes neighboring C3orf38 include ZNF654, CGGBP1, and LOC105377202. [8]
Protein Name | Gene ID | Transcript Accession | Length (nt) | Length (aa) |
---|---|---|---|---|
uncharacterized protein C3orf38 | 285237 | NM_173824.4 | 2414 | 329 |
uncharacterized protein C3orf38 isoform X1 | 285237 | XM_005264745.5 | 2356 | 328 |
The C3orf38 protein is 329 amino acids in length. [9] A large domain of unknown function, DUF4518, encompasses the majority of the C3orf38 protein. [9] This domain is part of the protein family pfam15008, which is thought to be involved in apoptosis regulation. [10] pfam15008 is the only member of the cl20886 superfamily. [10] While the C3orf38 protein as a whole shows no abnormal amino acid abundance, compositional analysis indicates that DUF4518 has a high abundance of histidines and a low abundance of serines. [11] The predicted molecular weight of the entire C3orf38 protein is 37.0 kDa and its isoelectric point is 6.01. [12] The DUF4518 domain within the C3orf38 protein has a predicted molecular weight of 31 kDa and an isoelectric point of 6.49. [12]
Several potential promoters have been identified for the C3orf38 gene; they are described in the table below. [13]
Promoter | Start | End | Length (bp) | Transcripts |
---|---|---|---|---|
GXP_203118 | 88148634 | 88150046 | 1413 | GXT_23216585, GXT_22791246, GXT_2803824, GXT_26239186 |
GXP_9795962 | 88148768 | 88149807 | 1040 | no transcript assigned; promoter based on comparative genomics |
GXP_9795963 | 88148794 | 88150027 | 1234 | no transcript assigned; promoter based on comparative genomics |
GXP_3194836 | 88149604 | 88150643 | 1040 | GXT_24485561 |
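The lengths in the promoter table follow from the coordinates as end - start + 1. The Python sketch below, which assumes the coordinates are 1-based and inclusive, re-derives each length and checks whether a promoter interval contains the first base of the gene span stated earlier (88,149,959). The names `PROMOTERS`, `GENE_START`, `interval_length`, and `contains` are illustrative and do not come from any promoter database API.

```python
# Promoter coordinates copied from the table (assumed 1-based, inclusive).
PROMOTERS = {
    "GXP_203118": (88148634, 88150046),
    "GXP_9795962": (88148768, 88149807),
    "GXP_9795963": (88148794, 88150027),
    "GXP_3194836": (88149604, 88150643),
}

# First base of the gene span chr3:88,149,959-88,168,729.
GENE_START = 88149959


def interval_length(start, end):
    """Length in base pairs of an inclusive interval."""
    return end - start + 1


def contains(start, end, position):
    """True if position lies inside the inclusive interval."""
    return start <= position <= end


if __name__ == "__main__":
    for name, (start, end) in PROMOTERS.items():
        print(name, interval_length(start, end), contains(start, end, GENE_START))
```

The same `interval_length` helper reproduces the 18,771-base gene span from its start and end coordinates.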
The C3orf38 gene exhibits ubiquitous expression in human tissues. [14]
The C3orf38 protein is predicted, with the highest confidence, to localize to the cytoplasm. [15] This prediction is supported by examination of an array of C3orf38 orthologs. [15]
Several well-conserved post-translational modification (PTM) sites are shared by the human C3orf38 protein and its orthologs; they are listed in the table below. [16] The majority of these PTMs are PKC phosphorylation sites. [16] Additionally, two confirmed active sites are located in the C3orf38 protein. The first is an aldehyde dehydrogenases glutamic acid active site spanning amino acids 1-8. [16] The second is a eukaryotic thiol (cysteine) proteases histidine active site spanning amino acids 227-237. [16]
PTM | Protein Location (aa) |
---|---|
Myristyl site | 235-240 |
PKC phosphorylation site | 34-36 |
PKC phosphorylation site | 86-88 |
PKC phosphorylation site | 199-201 |
PKC phosphorylation site | 265-267 |
Orthologs for the C3orf38 protein can be found in mammals, reptiles, birds, amphibians, fish, and invertebrates using BLAST searches. [17] A selection of these orthologs can be found in the ortholog table below. There are no paralogs. [17] Additionally, by comparing sequences of C3orf38 protein with cytochrome C and fibrinogen alpha proteins, a moderate rate of evolution was determined for the C3orf38 protein.
Group | Genus, species | Common Name | Taxonomic Group | Divergence Date (MYA) | Accession Number | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%) |
---|---|---|---|---|---|---|---|---|
Mammals | Homo sapiens | Human | Primates | 0 | NP_776185.2 | 329 | 100 | 100 |
Mammals | Pan paniscus | Bonobo | Primates | 6.7 | XP_003831564.1 | 329 | 99.4 | 99.7 |
Mammals | Puma concolor | Puma | Carnivora | 96 | XP_025769652.1 | 348 | 79.8 | 86.6 |
Reptiles | Mauremys reevesii | Reeve's Turtle | Testudines | 312 | XP_039379932.1 | 315 | 55.7 | 70.5 |
Reptiles | Chelonoidis abingdonii | Abingdon Island Giant Tortoise | Testudines | 312 | XP_032650981.1 | 304 | 55.4 | 69.9 |
Birds | Strigops habroptila | Kakapo | Psittaciformes | 312 | XP_030327387.1 | 309 | 52.1 | 66.3 |
Birds | Taeniopygia guttata | Zebra Finch | Passeriformes | 312 | XP_002190058.5 | 306 | 51 | 63.9 |
Birds | Gallus gallus | Chicken | Galliformes | 312 | XP_004938363.2 | 312 | 44.2 | 59.9 |
Amphibians | Rhinatrema bivittatum | Two-Lined Caecilian | Gymnophiona | 351.8 | XP_029434832.1 | 289 | 49.7 | 64.5 |
Amphibians | Bufo bufo | Common Toad | Anura | 351.8 | XP_040279187.1 | 289 | 43.9 | 62.1 |
Amphibians | Xenopus tropicalis | Tropical Clawed Frog | Anura | 351.8 | XP_017946806.1 | 261 | 38.6 | 54.8 |
Fish | Chelmon rostratus | Copperband Butterflyfish | Perciformes | 435 | XP_041807133.1 | 302 | 42.7 | 58.2 |
Fish | Coregonus clupeaformis | Lake Whitefish | Salmoniformes | 435 | XP_041700482.1 | 308 | 42.4 | 60.6 |
Fish | Carcharodon carcharias | Great White Shark | Lamniformes | 473 | XP_041066710.1 | 308 | 45 | 59.8 |
Fish | Amblyraja radiata | Thorny Skate | Rajiformes | 473 | XP_032888490.1 | 382 | 32.5 | 46.5 |
Invertebrates | Lytechinus variegatus | Sea Urchin | Temnopleuroida | 684 | XP_041465399.1 | 312 | 36.4 | 48.3 |
Invertebrates | Patiria miniata | Bat Star | Valvatida | 684 | XP_038067113.1 | 294 | 34.1 | 46.2 |
Invertebrates | Cryptotermes secundus | Termite | Blattodea | 797 | XP_023724689.1 | 296 | 30.1 | 48 |
Invertebrates | Crassostrea virginica | Eastern Oyster | Ostreidae | 797 | XP_022335568.1 | 340 | 29.6 | 46.5 |
Invertebrates | Diabrotica virgifera | Western Corn Rootworm | Coleoptera | 797 | XP_028133096.1 | 284 | 26.9 | 43.6 |
Invertebrates | Acropora millepora | Branching Stony Coral | Scleractinia | 824 | XP_029194133.1 | 288 | 32.6 | 50.9 |
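One common way to turn identity values such as those in the ortholog table into a rough rate estimate is to convert percent identity into a Poisson-corrected distance, d = -ln(1 - p), where p is the fraction of differing sites, and then relate d to divergence time. The Python sketch below applies this to a few rows of the table. The Poisson correction and the through-origin fit are assumptions chosen for illustration and are not necessarily the method used for the comparison with cytochrome C and fibrinogen alpha; the function names are hypothetical.

```python
import math

# (species, divergence date in MYA, sequence identity in %) from the table.
ORTHOLOGS = [
    ("Homo sapiens", 0, 100.0),
    ("Pan paniscus", 6.7, 99.4),
    ("Puma concolor", 96, 79.8),
    ("Gallus gallus", 312, 44.2),
    ("Xenopus tropicalis", 351.8, 38.6),
    ("Carcharodon carcharias", 473, 45.0),
    ("Acropora millepora", 824, 32.6),
]


def poisson_distance(identity_percent):
    """Poisson-corrected distance d = -ln(1 - p), p = fraction of differing sites."""
    p = 1.0 - identity_percent / 100.0
    return -math.log(1.0 - p)


def slope_through_origin(points):
    """Least-squares slope k of d = k * t, with the line forced through (0, 0)."""
    numerator = sum(t * d for t, d in points)
    denominator = sum(t * t for t, _ in points)
    return numerator / denominator


if __name__ == "__main__":
    points = [(t, poisson_distance(identity)) for _, t, identity in ORTHOLOGS]
    rate = slope_through_origin(points)
    print(f"approximate substitutions per site per million years: {rate:.5f}")
```

Judging a rate as slow, moderate, or fast would require running the same calculation on reference proteins, which this sketch does not include.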
Although investigation into the function of the C3orf38 gene is ongoing, a couple of studies have provided insight into its role. One study identified C3orf38 as a candidate proapoptotic gene. [20] Another identified C3orf38 as a top candidate tumor suppressor gene (TSG). [21]
Two of the proteins that interact with the C3orf38 protein are of particular interest, given that C3orf38 is a candidate proapoptotic and tumor suppressor gene. First, BAG family molecular chaperone regulator 4 (BAG4) is an anti-apoptotic protein known to interact with a number of apoptosis- and growth-related proteins. [22] Second, DnaJ heat shock protein family member B4 (DNAJB4) is a molecular chaperone of the heat shock protein 40 (Hsp40) family and a tumor suppressor (specifically in colorectal carcinoma). [23]