FAM163A identifiers | Value |
---|---|
Aliases | FAM163A, C1orf76, NDSP, family with sequence similarity 163 member A |
External IDs | OMIM: 611727; MGI: 3618859; HomoloGene: 18306; GeneCards: FAM163A |
FAM163A, also known as cebelin and neuroblastoma-derived secretory protein (NDSP), is a protein that in humans is encoded by the FAM163A gene. [4] This protein has been implicated in promoting proliferation and anchorage-independent growth of neuroblastoma cancer cells. [5] [6] In addition, it has been found to be up-regulated in the lung tissue of chronic smokers. [7] The FAM163A gene is found on human chromosome 1q25.2, and its protein product is 167 amino acids long. The protein contains a highly conserved signal peptide sequence spanning approximately its first 37 amino acids; this conservation is limited to eukaryotes, the most distant of which is the Japanese rice fish.
FAM163A is approximately 2,927 base pairs long and contains five exons. No domains of unknown function have been documented. The coding region of the gene is very short (~500 base pairs) and is followed by an exceptionally long, as yet uncharacterized 3' untranslated region (UTR). FAM163A is located on the positive strand of chromosome 1, at locus 126860, near three other genes: TOR1AIP1, TOR1AIP2, and TDRD5. [8]
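The length figures given for the gene and protein can be checked against one another with a little arithmetic. The sketch below assumes that the 2,927 base pair figure refers to the mature transcript and that the coding sequence is one codon per residue plus a stop codon; neither assumption is stated in the sources cited here, and the function name is illustrative.

```python
def cds_length_bp(protein_length_aa: int) -> int:
    """Coding sequence length in base pairs: 3 bases per residue plus a stop codon."""
    return 3 * protein_length_aa + 3


TRANSCRIPT_BP = 2927  # assumption: this is the mature transcript length
PROTEIN_AA = 167      # protein length stated in the article

cds_bp = cds_length_bp(PROTEIN_AA)
noncoding_fraction = (TRANSCRIPT_BP - cds_bp) / TRANSCRIPT_BP

print(cds_bp)
print(f"{noncoding_fraction:.1%}")
```

Under these assumptions the coding sequence is 504 base pairs, consistent with the ~500 base pair figure, and roughly 83% of the transcript is untranslated (5' and 3' UTRs combined).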
mRNA levels were measured in 45 neuroblastoma tumor samples, and elevated levels of NDSP were found in 43 of them, as well as in five bone marrow samples. NDSP is associated with an increased risk of cancer metastasis in bone marrow as well as neural tissue. [5] RNA inhibition techniques applied against NDSP decreased cellular proliferation and cancer cell colony formation. Further, this protein has been determined to act as a growth factor through an ERK-mediated pathway. [6]
Several programs can be used to generate possible splice variants of the FAM163A mRNA. The Ensembl database yields one possible splice variant, which codes for the FAM163A protein. [10] NCBI's AceView yields 23 possible splice variants, but no experimental evidence is associated with these. [11]
The human protein has a molecular weight of 17.6 kilodaltons (kDa) and an isoelectric point of 5.56. [12] These values are well conserved across orthologs. Lastly, the PSORT II program, accessed through ExPASy, predicts a 39.1% probability that the protein localizes to the nucleus, the highest probability for any location. [13]
Predicted localization | Probability |
---|---|
Nucleus | 39.1% |
Cytoplasm | 21.7% |
Extracellular Matrix | 17.4% |
Mitochondria | 17.4% |
Cytoskeleton | 4.3% |
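Each percentage in the table is a multiple of 1/23, which is what one would expect if the values are vote shares among 23 nearest neighbors in a k-nearest-neighbor classifier, the approach PSORT II is generally described as using. The sketch below reproduces the table from neighbor counts; the counts are inferred from the percentages and are an assumption, not part of the cited output.

```python
# Hypothetical neighbor counts inferred from the table (they sum to 23).
neighbor_votes = {
    "Nucleus": 9,
    "Cytoplasm": 5,
    "Extracellular Matrix": 4,
    "Mitochondria": 4,
    "Cytoskeleton": 1,
}


def vote_shares(votes: dict[str, int]) -> dict[str, float]:
    """Convert neighbor counts to percentages rounded to one decimal place."""
    total = sum(votes.values())
    return {site: round(100 * n / total, 1) for site, n in votes.items()}


shares = vote_shares(neighbor_votes)
top_site = max(shares, key=shares.get)

print(shares)
print(top_site)
```

Read this way, the nuclear prediction rests on 9 of 23 neighbors, a plurality rather than a majority.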
The following data were generated using the NCBI BLAST program. [14] A notable feature of all of these sequences is the exceptional conservation of the signal peptide sequence. The studies by Vasudevan et al. included bioinformatic analysis that compared a paralogous protein (FAM163B) in humans and the FAM163A ortholog in mice. [5] Their results agree with the analysis of the orthologs presented below. While many more orthologs exist for FAM163A in species not listed, the Japanese rice fish is the most distant orthologous species that shares the signal peptide sequence; the next closest result has a percent identity of less than 30% and no putative conserved domains.
Genus and species | Common name | Evolutionary time to human divergence (MYA) | Accession # | Protein sequence length | Sequence identity to human protein (%) | Sequence similarity to human protein (%) |
---|---|---|---|---|---|---|
Homo sapiens | Human | - | NP_775780.1 | 167aa | - | - |
Homo sapiens | Human (FAM163B - Paralog) | - | NP_001073984 | 166aa | 42% | 52% |
Gorilla gorilla gorilla | Gorilla | 8.8 | XP_004028035 | 167aa | 99% | 98% |
Felis catus | Cat | 94.2 | XP_003999284 | 166aa | 92% | 92% |
Pteropus alecto | Black Flying Fox | 94.2 | XP_006907838 | 167aa | 89% | 90% |
Odobenus rosmarus divergens | Pacific Walrus | 94.2 | XP_004398165 | 166aa | 88% | 89% |
Dasypus novemcinctus | 9-Banded Armadillo | 104.2 | XP_004461936 | 165aa | 87% | 88% |
Ochotona princeps | American Pika | 92.3 | XP_004598689 | 165aa | 86% | 89% |
Mus musculus | Mouse | 92.3 | Q8CAA5 | 168aa | 85% | 87% |
Alligator mississippiensis | American Alligator | 296 | XP_006276882 | 161aa | 66% | 74% |
Pelodiscus sinensis | Chinese Soft-Shelled Turtle | 296 | XP_004461936 | 164aa | 64% | 73% |
Gallus gallus | Chicken | 296 | XP_001234382 | 159aa | 61% | 67% |
Ophiophagus hannah | King Cobra | 296 | ETE64717 | 166aa | 53% | 65% |
Danio rerio | Zebrafish | 400.1 | XP_002660900 | 150aa | 50% | 63% |
Xiphophorus maculatus | Southern Platyfish | 400.1 | XP_005800930 | 163aa | 48% | 60% |
Oryzias latipes | Japanese Rice Fish | 400.1 | XP_004067975 | 163aa | 46% | 60% |
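The trend in the table can be summarized by averaging sequence identity within each divergence-time group. This is a minimal sketch that uses only the values listed in the table (the paralog row is excluded); the variable and function names are illustrative.

```python
from collections import defaultdict

# (common name, divergence from human in MYA, identity to human protein in %)
orthologs = [
    ("Gorilla", 8.8, 99),
    ("Cat", 94.2, 92),
    ("Black Flying Fox", 94.2, 89),
    ("Pacific Walrus", 94.2, 88),
    ("9-Banded Armadillo", 104.2, 87),
    ("American Pika", 92.3, 86),
    ("Mouse", 92.3, 85),
    ("American Alligator", 296, 66),
    ("Chinese Soft-Shelled Turtle", 296, 64),
    ("Chicken", 296, 61),
    ("King Cobra", 296, 53),
    ("Zebrafish", 400.1, 50),
    ("Southern Platyfish", 400.1, 48),
    ("Japanese Rice Fish", 400.1, 46),
]


def mean_identity_by_divergence(rows):
    """Group rows by divergence time and return the mean identity per group."""
    groups = defaultdict(list)
    for _name, mya, identity in rows:
        groups[mya].append(identity)
    return {mya: sum(vals) / len(vals) for mya, vals in sorted(groups.items())}


summary = mean_identity_by_divergence(orthologs)
for mya, mean_id in summary.items():
    print(f"{mya:>6} MYA: {mean_id:.1f}% mean identity")
```

Mean identity falls from 99% in the gorilla to 61% in the reptiles and birds listed and 48% in the bony fishes.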
FAM163A has only one paralog, FAM163B, located on chromosome 9q34.2. Comparison of the two proteins reveals that their signal peptide sequences are identical; the alignment was visualized using the CLUSTALW program through SDSC's Biology Workbench. [15]
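Percent identity figures of the kind reported for the paralog and orthologs are computed from an alignment by counting matching positions. The sketch below shows one common convention (matches divided by aligned columns, with single-gap columns counted in the denominator); the two sequences are hypothetical placeholders rather than real FAM163A or FAM163B sequences, and alignment tools may normalize differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must have the same length and use '-' for gaps.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100 * matches / len(columns)


# Hypothetical toy alignment, not real protein sequence.
seq_a = "MTAGTVVITGG-ILA"
seq_b = "MTAGTVVITGGSVLA"

print(f"{percent_identity(seq_a, seq_b):.1f}%")
```

Restricting the same calculation to the columns covering a signal peptide would give the identity of that region alone.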
FAM163A is expressed at very low levels in most tissues of the body; expression is higher in juveniles and, as noted above, in the lungs of chronic smokers and in neuroblastoma cells. [16]