FAM114A1 | |
---|---|
Identifiers | |
Aliases | FAM114A1, Noxp20, family with sequence similarity 114 member A1 |
External IDs | MGI: 1915553, HomoloGene: 12259, GeneCards: FAM114A1 |
Protein FAM114A1, also known as nervous system overexpressed protein 20 (NOXP20), is a protein that in humans is encoded by the FAM114A1 gene. [4] Orthologs of FAM114A1 are found in organisms as taxonomically distant from Homo sapiens as Drosophila. As expected, however, human FAM114A1 is more similar to its primate orthologs than to any others. FAM114A1 has one paralog, FAM114A2, which also encodes a protein of unknown function.
FAM114A1 is located on the short arm of chromosome 4 (4p14) in humans, on the forward strand. It starts at base pair 38869354 and ends at 38947365. [4] Its mRNA is 4138 bp long. The gene has the following neighbors on the same chromosome:
Orthologs of FAM114A1 compared with the human protein:
Genus and species | Common name | Accession number | Seq. length | Seq. identity | Seq. similarity |
---|---|---|---|---|---|
Pan troglodytes | Chimpanzee | XP_517149.2 | 563 aa | 99% | 99% |
Macaca mulatta | Rhesus macaque | EHH25809.1 | 563 aa | 97% | 98% |
Nomascus leucogenys | Northern white-cheeked gibbon | XP_003258632.1 | 562 aa | 97% | 98% |
Macaca fascicularis | Crab-eating macaque | EHH53615.1 | 563 aa | 96% | 98% |
Pongo abelii | Sumatran orangutan | XP_002814719.1 | 563 aa | 98% | 96% |
Equus caballus | Horse | XP_001498667.1 | 562 aa | 88% | 93% |
Loxodonta africana | African bush elephant | XP_003411349.1 | 558 aa | 85% | 91% |
Heterocephalus glaber | Naked mole rat | EHB16215.1 | 561 aa | 86% | 90% |
Cavia porcellus | Guinea pig | XP_003471670.1 | 569 aa | 86% | 90% |
Bos taurus | Cow | XP_588946.3 | 563 aa | 84% | 90% |
Sus scrofa | Wild boar | XP_003128969.1 | 562 aa | 84% | 90% |
Ailuropoda melanoleuca | Giant panda | XP_002928170.1 | 570 aa | 83% | 89% |
Canis lupus familiaris | Dog | XP_536261.3 | 560 aa | 83% | 88% |
Rattus norvegicus | Rat | XP_573600.2 | 567 aa | 78% | 83% |
Mus musculus | Mouse | BAB30694.1 | 569 aa | 77% | 83% |
Monodelphis domestica | Gray short-tailed opossum | XP_001374188.1 | 562 aa | 72% | 81% |
Meleagris gallopavo | Wild turkey | XP_003205919.1 | 563 aa | 64% | 76% |
Gallus gallus | Chicken | XP_423859.3 | 561 aa | 64% | 75% |
Taeniopygia guttata | Zebra finch | XP_002189709.1 | 565 aa | 61% | 75% |
Anolis carolinensis | Carolina anole | XP_003226293.1 | 539 aa | 61% | 75% |
Danio rerio | Zebrafish | NP_001082947.1 | 546 aa | 57% | 72% |
Xenopus (Silurana) tropicalis | Western clawed frog | XP_002938704.1 | 424 aa | 66% | 79% |
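The identity and similarity columns above are the kind of figures that pairwise alignment tools report. The following is a minimal sketch of how such percentages can be computed from two sequences that have already been aligned. The function name `identity_and_similarity`, the toy sequences, and the small set of "similar" residue groups are all illustrative assumptions; they are not the tool or scoring matrix used to produce the table.

```python
# Hypothetical groups of residues treated as "similar", for illustration only.
# Real alignment tools usually derive similarity from a scoring matrix
# such as BLOSUM62 rather than from fixed groups.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both strings must have the same length; '-' marks a gap.
    Percentages are taken over all alignment columns, gap columns included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")

    identical = 0
    similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns add to the length but never match
        if a == b:
            identical += 1
            similar += 1  # an identical pair is also counted as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1

    columns = len(aligned_a)
    return 100.0 * identical / columns, 100.0 * similar / columns


# Toy alignment, not real FAM114A1 sequence.
pid, psim = identity_and_similarity("MKTLLDE-SA", "MRTLIDEQSA")
print(f"identity {pid:.0f}%, similarity {psim:.0f}%")
```

Because every identical position is also counted as similar, a similarity value computed this way can never be lower than the corresponding identity value.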
NOXP20 is overexpressed in the brain, [6] and microarray data [7] from the Allen Brain Atlas provide evidence of that expression. Data from NCBI GEO Profiles [8] show that although FAM114A1 is expressed in the brain, its expression extends beyond nervous tissue to most tissue types in the human body. GEO Profiles also show that FAM114A1 is expressed more highly in mesenchymal stem cells than in undifferentiated stem cells. Further experiments [8] have shown that certain factors affect the expression of FAM114A1. One example is the direct relationship between overexpression of CLDN-1 and overexpression of FAM114A1.
NOXP20 is made up of 563 amino acids and has a molecular weight of 60742 Da, with an isoelectric point of about 4.416. Little is known about the details of this protein; however, a number of predictions have been made about its structure and function. Like other proteins, it undergoes post-translational modification. The modification confirmed experimentally so far is phosphorylation at amino acids 196 and 199.
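As a quick sanity check on the figures above, dividing the quoted molecular weight by the number of residues gives the average mass per amino acid. The sketch below uses only those two numbers from the text; the function and variable names are illustrative.

```python
# Figures quoted in the text for NOXP20.
RESIDUE_COUNT = 563          # amino acids
MOLECULAR_WEIGHT_DA = 60742  # daltons


def average_residue_mass(weight_da: float, residues: int) -> float:
    """Return the average mass per residue, in daltons."""
    if residues <= 0:
        raise ValueError("residues must be positive")
    return weight_da / residues


avg = average_residue_mass(MOLECULAR_WEIGHT_DA, RESIDUE_COUNT)
print(f"{avg:.1f} Da per residue")  # prints "107.9 Da per residue"
```

The result is close to the commonly used rule of thumb of roughly 110 Da per residue, so the quoted length and weight are consistent with each other.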
Several tools are available to predict the secondary structure of a protein. One tool that combines the results of several of them is PELE on SDSC Biology WorkBench. [9] According to this tool, the protein's secondary structure is mostly alpha helices and coils, with some beta strands.
There is no evidence of any interaction between the FAM114A1 protein and other proteins in the human body. However, an interaction between FAM114A1 and CDGSH iron sulfur domain 2 was detected in mice. [10] The amino acid sequences of NOXP20 in the two species (Homo sapiens and Mus musculus) share 77% identity and 83% similarity. Given this similarity, NOXP20 may have the same interaction in humans, although this has not been shown directly.
The exact function of NOXP20 is still not well understood. However, there is evidence that the protein contains a caspase recruitment domain. [6] Because caspases are involved in apoptosis, this suggests that NOXP20 could have a role in apoptosis and in the regulation of cell proliferation.
GPR113 is a gene that encodes the Probable G-protein coupled receptor 113 protein.
Protein KIAA1958 is a protein that in humans is encoded by the KIAA1958 gene. Orthologs of KIAA1958 are found as far back in evolution as chordates, although the human protein is more similar to its primate orthologs than to any others. KIAA1958 has no known paralogs.
NBEAL1 is a protein that in humans is encoded by the NBEAL1 gene. It is found on chromosome 2q33.2 of Homo sapiens.
EVI5L is a protein that in humans is encoded by the EVI5L gene. EVI5L is a member of the Ras superfamily of monomeric guanine nucleotide-binding (G) proteins, and functions as a GTPase-activating protein (GAP) with a broad specificity. Measurement of in vitro Rab-GAP activity has shown that EVI5L has significant Rab2A- and Rab10-GAP activity.
PROSER2, also known as proline and serine rich 2, is a protein that in humans is encoded by the PROSER2 gene. PROSER2, or c10orf47 (chromosome 10 open reading frame 47), is found in band 14 of the short arm of chromosome 10 (10p14) and contains a highly conserved SARG domain. It is a fast-evolving gene with two paralogs, c1orf116 and specifically androgen-regulated gene protein isoform 1. The function of the PROSER2 protein is currently uncharacterized; however, in humans it may play a role in cell cycle regulation and reproductive functioning, and it is a potential biomarker of cancer.
FAM76A is a protein that in Homo sapiens is encoded by the FAM76A gene. Notable structural characteristics of FAM76A include an 83 amino acid coiled coil domain as well as a four amino acid poly-serine compositional bias. FAM76A is conserved in most chordates but is not found in other deuterostome phyla such as echinodermata, hemichordata, or xenacoelomorpha, suggesting that FAM76A arose sometime after chordates in the evolutionary lineage. Furthermore, FAM76A is not found in fungi, plants, archaea, or bacteria. FAM76A is predicted to localize to the nucleus and may play a role in regulating transcription.
Chromosome 16 open reading frame 95 (C16orf95) is a gene which in humans encodes the protein C16orf95. It has orthologs in mammals, and is expressed at a low level in many tissues. C16orf95 evolves quickly compared to other proteins.
Coiled-coil domain containing 142 (CCDC142) is a gene which in humans encodes the CCDC142 protein. The CCDC142 gene is located on chromosome 2, spans 4339 base pairs and contains 9 exons. The gene codes for the coiled-coil domain containing protein 142 (CCDC142), whose function is not yet well understood. There are two known isoforms of CCDC142. CCDC142 proteins produced from these transcripts range in size from 665 to 743 amino acids and contain signals suggesting protein movement between the cytosol and nucleus. Homologous CCDC142 genes are found in many animals, including vertebrates and invertebrates, but not in fungi, plants, protists, archaea, or bacteria. The protein contains a coiled-coil domain and a RINT1_TIP1 motif located within the coiled-coil domain.
PRR29 is a protein encoded by the PRR29 gene located in humans on chromosome 17 at 17q23.
OCC-1 is a protein that in humans is encoded by the gene C12orf75. The gene is approximately 40,882 bp long and encodes 63 amino acids. OCC-1 is ubiquitously expressed throughout the human body. OCC-1 has been shown to be overexpressed in various colon carcinomas. A novel splice variant of this gene was also detected in various human cancer types; in addition to encoding a novel smaller protein, the OCC-1 gene produces a non-protein-coding lncRNA splice variant.
C14orf93 is a protein that is encoded in humans by the C14orf93 gene. It is a globular protein with a conserved C-terminus that is localized to the nucleus. While expressed relatively highly in all tissues except nervous tissue, it is expressed particularly highly in T cells and other immune tissues.
Uncharacterized protein Chromosome 16 Open Reading Frame 71 is a protein in humans, encoded by the C16orf71 gene. The gene is expressed in epithelial tissue of the respiratory system, adipose tissue, and the testes. Predicted associated biological processes of the gene include regulation of the cell cycle, cell proliferation, apoptosis, and cell differentiation in those tissue types. 1357 bp of the gene are antisense to spliced genes ZNF500 and ANKS3, indicating the possibility of regulated alternate expression.
CRACD-like protein, previously known as KIAA1211L, is a protein that in humans is encoded by the CRACDL gene. It is highly expressed in the cerebral cortex of the brain. Furthermore, it is localized to the microtubules and the centrosomes and is subcellularly located in the nucleus. Finally, CRACDL is associated with certain mental disorders and various cancers.
Chromosome 3 Open Reading Frame 62 (C3orf62) is a protein that in humans is encoded by the C3orf62 gene. C3orf62 is a glycine-depleted protein relative to the amount of glycine in proteins in the rest of the genome. C3orf62 has a KKXX-like motif and is predicted to be localized in the nucleus. Expression of C3orf62 is highest in whole blood.
Chromosome 6 open reading frame 62 (C6orf62), also known as X-trans-activated protein 12 (XTP12), is a gene that encodes a protein of the same name. The encoded protein is predicted to have a subcellular location within the cytosol.
C17orf98 is a protein that in humans is encoded by the C17orf98 gene. The gene is located on Homo sapiens chromosome 17 and consists of a 6,302-base sequence. Its mRNA has three exons and no alternative splice sites. The protein has 154 amino acids, with no abnormal amino acid levels. C17orf98 has a domain of unknown function (DUF4542) and weighs 17.6 kDa. C17orf98 does not belong to any other families, nor does it have any isoforms. The protein has orthologs with high percent similarity in mammals and reptiles. The protein has additional distantly related orthologs across the metazoan kingdom, culminating with the sponge family.
Chromosome 16 open reading frame 46 is a protein of yet to be determined function in Homo sapiens. It is encoded by the C16orf46 gene with NCBI accession number of NM_001100873. It is a protein-coding gene with an overlapping locus.
Uncharacterized protein C16orf86 is a protein in humans that is encoded by the C16orf86 gene. It is mostly made of alpha helices and is expressed in the testes, but also in other tissues such as the kidney, colon, brain, fat, spleen, and liver. The function of C16orf86 is not well understood; however, DNA microarray data suggest that it could be a transcription factor in the nucleus that regulates G0/G1 in the cell cycle in tissues such as the kidney, brain, and skeletal muscle.
C12orf29 is a protein that in humans is encoded by chromosome 12 open reading frame 29. The gene is ubiquitously expressed in various tissues. The protein has 325 amino acids. The biological process of C12orf29 has been annotated as hematopoietic progenitor cell differentiation. The molecular and cellular functions of the C12orf29 gene are not yet well understood.
Secernin-3 (SCRN3) is a protein that is encoded by the human SCRN3 gene. SCRN3 belongs to the peptidase C69 family and the secernin subfamily. As a part of this family, the protein is predicted to enable cysteine-type exopeptidase activity and dipeptidase activity, as well as be involved in proteolysis. It is ubiquitously expressed in the brain, thyroid, and 25 other tissues. Additionally, SCRN3 is conserved in a variety of species, including mammals, birds, fish, amphibians, and invertebrates. SCRN3 is predicted to be an integral component of the cytoplasm.