C20orf144 |
---|---|
Identifiers |
Aliases | C20orf144, dJ63M2.6, chromosome 20 open reading frame 144 |
External IDs | HomoloGene: 76810; GeneCards: C20orf144 |
Chromosome 20 open reading frame 144 (C20orf144) is a human protein-coding gene. [3] The human C20orf144 protein consists of 153 amino acids, the first 150 of which are characterized as part of the Bcl-2-like protein of testis (Bclt) family (Pfam 15318). [4]
The C20orf144 gene is located on the plus strand at 20q11.22 and spans 3,293 base pairs. [5] The gene contains two exons. [3] On the plus strand, 572 nucleotides of the gene are antisense to parts of the human genes PXMP4 and NECAB3. [6] Other neighboring genes include ACTL10 and CBFA2T2. [7]
The encoded mRNA is 522 nucleotides long (Accession: NM_080825), and no alternative splice variants have been identified. [8] Human C20orf144 mRNA expression is enriched in the testis, specifically in early and late spermatids. [9]
The human C20orf144 gene encodes a protein 153 amino acids long (Accession: NP_543015.1) that contains three disordered regions. [4] Amino acids 1-150 belong to the Bclt protein family, which is predicted to be involved in apoptosis. [10] The molecular weight is 17.2 kDa and the theoretical isoelectric point is 11.47. [11] The protein contains 21 more positively charged residues (lysines and arginines) than negatively charged residues (aspartates and glutamates), consistent with its high isoelectric point.
The tertiary structure of human C20orf144 predicted by AlphaFold [12] contains three α helices and no β sheets.
Localization analysis of human C20orf144 and many mammalian orthologs predicts that the protein resides in the nucleus, with 78.3% confidence for the human protein. [13]
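The charge imbalance described above is a simple residue count. The following is a minimal sketch of that count; the function name `net_basic_excess` and the toy peptide are illustrative assumptions, and the toy peptide is not the C20orf144 sequence.

```python
from collections import Counter


def net_basic_excess(sequence: str) -> int:
    """Return (K + R) - (D + E) for a protein sequence in one-letter code."""
    counts = Counter(sequence.upper())
    basic = counts["K"] + counts["R"]    # positively charged side chains
    acidic = counts["D"] + counts["E"]   # negatively charged side chains
    return basic - acidic


# Hypothetical toy peptide for illustration only (NOT the real protein).
toy_peptide = "MGKRRSLDEKR"
print(net_basic_excess(toy_peptide))
```

Applied to the sequence under accession NP_543015.1, a count of this kind would be expected to reproduce the excess of 21 basic residues reported in the text.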
Modification | Modification Site in Human C20orf144 |
---|---|
N-Myristoylation [13] [14] | 2G |
Protein Kinase C Phosphorylation [15] | 6S |
Casein Kinase 2 Phosphorylation [15] | 87S |
Non-Specific Phosphorylation [15] | 117S |
O-Glycosylation [16] | 117S |
Protein Kinase C Phosphorylation [15] | 123S |
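Site predictions like those in the table (for example, 6S meaning the serine at position 6) are often derived from short sequence motifs. The sketch below scans a sequence for two widely used PROSITE-style consensus patterns, `[ST]-x-[RK]` for protein kinase C and `[ST]-x(2)-[DE]` for casein kinase 2. It is an assumption that motif scanning of this kind underlies the cited predictions; the source does not name the method, and the function name and toy sequence are hypothetical.

```python
import re

# PROSITE-style consensus patterns written as regular expressions.
# Zero-width lookaheads let overlapping candidate sites be reported.
PATTERNS = {
    "PKC phosphorylation": re.compile(r"(?=([ST].[RK]))"),
    "CK2 phosphorylation": re.compile(r"(?=([ST].{2}[DE]))"),
}


def find_sites(sequence: str) -> dict[str, list[int]]:
    """Return 1-based positions of the candidate S/T residue for each pattern."""
    seq = sequence.upper()
    return {
        name: [match.start() + 1 for match in pattern.finditer(seq)]
        for name, pattern in PATTERNS.items()
    }


# Hypothetical toy sequence for illustration only (NOT the real protein).
toy_sequence = "MGSAKTPRDSLLE"
for name, positions in find_sites(toy_sequence).items():
    print(name, positions)
```

Short motifs like these match frequently by chance, so hits are candidates rather than confirmed modification sites.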
C20orf144 evolves at a rate comparable to that of the fibrinogen alpha chain, a fast-evolving protein, suggesting that C20orf144 is itself evolving quickly.
Orthologs of human C20orf144 are found in many mammals, but not in monotremes. [17] As shown in Table 2, marsupials are the organisms most distantly related to humans in which C20orf144 orthologs are found, suggesting that C20orf144 first appeared approximately 160 million years ago.
Genus and Species | Common Name | Order | Protein Accession Number | Median Date of Divergence (MYA) [18] | Sequence Length (amino acids) | Sequence Identity (%) | Sequence Similarity (%) |
---|---|---|---|---|---|---|---|
Homo sapiens | Human | Primates | NP_543015.1 | 0 | 153 | 100 | 100 |
Macaca mulatta | Rhesus Monkey | Primates | XP_001105397.1 | 28.9 | 153 | 86.3 | 90.8 |
Piliocolobus tephrosceles | Ugandan Red Colobus | Primates | XP_023076213.1 | 28.9 | 141 | 63.7 | 66.1 |
Jaculus jaculus | Lesser Egyptian Jerboa | Rodentia | XP_045011648.1 | 87 | 176 | 46.4 | 55.8 |
Myodes glareolus | Bank Vole | Rodentia | XP_048287479.1 | 87 | 197 | 42.1 | 51.8 |
Mus musculus | House Mouse | Rodentia | NP_083581.1 | 87 | 197 | 41.4 | 49.8 |
Camelus ferus | Wild Bactrian Camel | Artiodactyla | XP_032318023.1 | 94 | 174 | 54 | 64.4 |
Equus caballus | Domestic Horse | Perissodactyla | XP_023482143.1 | 94 | 178 | 45.7 | 56 |
Monodon monoceros | Narwhal | Artiodactyla | XP_029075207.1 | 94 | 181 | 42.9 | 50.5 |
Physeter catodon | Sperm Whale | Artiodactyla | XP_023984368.1 | 94 | 148 | 40.8 | 48.4 |
Prionailurus bengalensis | Leopard Cat | Carnivora | XP_043458511.1 | 94 | 179 | 52 | 60.9 |
Ursus arctos | Brown Bear | Carnivora | XP_026358671.1 | 94 | 184 | 51.6 | 61.4 |
Eumetopias jubatus | Steller Sea Lion | Carnivora | XP_027974622.1 | 94 | 184 | 47.3 | 58.1 |
Rousettus aegyptiacus | Egyptian Fruit Bat | Chiroptera | XP_016017694.2 | 94 | 175 | 51.4 | 62.7 |
Rhinolophus ferrumequinum | Greater Horseshoe Bat | Chiroptera | XP_032951343.1 | 94 | 191 | 40.2 | 51.5 |
Pteropus vampyrus | Large Flying Fox | Chiroptera | XP_023377960.1 | 94 | 209 | 40 | 50.5 |
Choloepus didactylus | Southern Two-Toed Sloth | Pilosa | XP_037668100.1 | 99 | 188 | 47.9 | 57.4 |
Gracilinanus agilis | Agile Gracile Mouse Opossum | Didelphimorphia | XP_044517537.1 | 160 | 169 | 37.9 | 49.7 |
Dromiciops gliroides | Monito del Monte | Microbiotheria | XP_043845608.1 | 160 | 170 | 37 | 50.8 |
Sarcophilus harrisii | Tasmanian Devil | Dasyuromorphia | XP_031809718.1 | 160 | 160 | 36.4 | 50 |
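The source does not state how the evolutionary rate was estimated. One common approach, assumed here, converts percent identity into a corrected divergence m = -100 ln(1 - n/100), where n is the observed percent difference (100 minus identity), and compares m against the divergence date. The sketch below applies that formula to three rows of the table; the function name `corrected_divergence` is illustrative.

```python
import math


def corrected_divergence(identity_percent: float) -> float:
    """Corrected divergence m = -100 * ln(1 - n/100), with n = 100 - identity."""
    if not 0 < identity_percent <= 100:
        raise ValueError("identity_percent must be in (0, 100]")
    n = 100.0 - identity_percent              # observed percent difference
    return abs(-100.0 * math.log(1.0 - n / 100.0))  # abs() avoids -0.0 at 100%


# (species, median divergence date in MYA, sequence identity in %) from the table
orthologs = [
    ("Macaca mulatta", 28.9, 86.3),
    ("Mus musculus", 87, 41.4),
    ("Sarcophilus harrisii", 160, 36.4),
]

for species, mya, identity in orthologs:
    m = corrected_divergence(identity)
    print(f"{species}: m = {m:.1f} at {mya} MYA")
```

Plotting m against divergence date for C20orf144 alongside a reference protein such as fibrinogen alpha chain is one way to make the rate comparison in the text, since a steeper slope indicates faster change.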
In a study of 28 breast cancer patients, missense mutations in C20orf144 were found in approximately 33% of patients, suggesting a potential role for C20orf144 in the development of breast cancer. [19] C20orf144 was also listed as a top candidate hit in an siRNA screen, a technique that silences targeted genes, performed in primary renal proximal tubule epithelial cells. [20] Silencing C20orf144 in cells exposed to Shiga toxin resulted in metabolic activity of at least 90% of that in a typical cell.