FAM63B is a protein that in humans is encoded by the FAM63B gene. The gene is highly expressed in humans and highly conserved throughout evolutionary history. The only function described for FAM63B so far is an interaction with the kinesin-1 light chain, through which it can support transport of vaccinia virus from the nucleus to the cell periphery.
FAM63B is located at 15q21.3-q22.1, [1] spanning 90,707 base pairs on chromosome 15. [2]
The full name of FAM63B is family with sequence similarity 63, member B. [3] FAM63B is also listed by its alias, KIAA1164, in some publications. [4]
The FAM63B gene encodes a primary transcript that can be alternatively spliced into 9 protein variants. FAM63B variant a is the most common isoform found in humans. [2]
Variant | Length (amino acids) | Exon Count | Molecular Weight (kDa) | Isoelectric Point |
---|---|---|---|---|
a | 621 | 9 | 67.1 | 4.24 |
b | 620 | 9 | 67.0 | 4.24 |
x1 | 639 | 10 | 69.2 | 4.40 |
x2 | 638 | 10 | 69.1 | 4.40 |
x3 | 605 | 10 | 65.2 | 4.41 |
x4 | 587 | 9 | 63.1 | 4.25 |
x5 | 364 | 6 | 38.0 | 4.50 |
x6 | 351 | 8 | 40.2 | 4.72 |
x7 | 342 | 8 | 39.1 | 4.45 |
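As a rough sanity check on the molecular weights in the table, a protein's mass can be approximated from its length. The sketch below assumes an average residue mass of about 110 Da, which is a common rule of thumb and not the method used by the cited source; the function name and constant are illustrative only. Individual variants can deviate from the estimate because their amino acid compositions differ.

```python
# Rule-of-thumb assumption: an average amino acid residue weighs ~110 Da.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(length_aa: int) -> float:
    """Approximate protein mass in kDa from its length in amino acids."""
    return length_aa * AVERAGE_RESIDUE_MASS_DA / 1000.0


# (length in amino acids, reported molecular weight in kDa) from the table above
variants = {
    "a": (621, 67.1),
    "b": (620, 67.0),
    "x1": (639, 69.2),
    "x5": (364, 38.0),
}

for name, (length_aa, reported_kda) in variants.items():
    estimate = estimate_mass_kda(length_aa)
    print(f"variant {name}: estimated {estimate:.1f} kDa, reported {reported_kda} kDa")
```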
FAM63B contains a domain of unknown function (DUF544), listed in Pfam, that is homologous within the protein family. [2] Variant a of the protein also contains a bipartite tryptophan binding motif from W476 to W533. [6] In addition, variant a contains a hydrophobic stretch of alanine residues from 567 to 574 and a mixed-charge sequence from residues 598 to 617. [5] Residues 607 to 621 of variant a may contain a signal sequence (KDEL) specifying return to the endoplasmic reticulum.
The secondary structure of FAM63B is predicted to be mostly coil, with some α-helices and few β-sheets. [5] [7] [8] The Phyre2 program predicts α-helices in 23% of the protein, β-strands in 9%, and disorder in 59%. [7] The disordered regions coincide with the coiled regions predicted by other programs, which produces a long stretch of coil beginning at the N-terminus. According to the SOSUI program, a 16-amino-acid span from residues 265 to 280 of FAM63B could be a transmembrane sequence. [9] However, transmembrane sequences generally need to be at least 20 amino acids long to be stable in the membrane, so this span is unlikely to be one. FAM63B is therefore probably not anchored in the membrane of any organelle and is free to move through the cell and between organelles.
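The length argument against a transmembrane sequence can be written out as a short check. This is a minimal sketch: the 20-residue threshold and the residue range 265 to 280 come from the text above, while the function names are hypothetical and the check is not part of any prediction program.

```python
# Minimum length for a stable transmembrane sequence, as stated in the text.
MIN_TM_LENGTH = 20


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive residue range."""
    return end - start + 1


def is_plausible_tm_span(start: int, end: int, min_length: int = MIN_TM_LENGTH) -> bool:
    """True if the span is long enough to be a stable transmembrane sequence."""
    return span_length(start, end) >= min_length


# Candidate span reported for FAM63B (residues 265 to 280)
candidate = (265, 280)
print(span_length(*candidate))           # 16 residues
print(is_plausible_tm_span(*candidate))  # False: shorter than the threshold
```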
Not much is known about the tertiary structure of FAM63B. A predicted folding is shown.
Post-translational modifications of the FAM63B protein. [9]
Post-translational Modification | Site(s) | Impact on Protein |
---|---|---|
Acetylation | Ser3 | Stability, localization, metabolism, apoptosis, ribosome recognition for synthesis |
Lysine Glycation | Lys88, Lys251, Lys280, Lys282, Lys332, Lys393, Lys398, Lys454, Lys547 | Impaired function, changed characteristics |
Phosphorylation | Ser7, Ser21, Ser25, Ser26, Ser62, Ser66, Ser68, Ser72, Ser90, Ser94, Ser111, Ser148, Ser153, Ser158, Ser160, Ser165, Ser170, Ser175, Ser188, Ser193, Ser233, Ser396, Ser440, Ser499, Ser541, Ser558, Ser587, Ser589, Ser590, Ser594, Ser597, Thr48, Thr255, Thr344, Thr453, Tyr505 | Conformation change, turn enzymatic activity on/off |
Picornaviral Cleavage | Glu195, Gln535 | Cleavage of polyprotein, degradation |
O-GlcNAc | Ser3, Ser21, Ser49, Ser62, Ser66, Ser68, Ser80, Ser152, Ser153, Ser158, Ser170, Ser499, Ser575, Ser587, Ser589, Ser590, Thr144, Thr576 | Nucleocytoplasmic location |
FAM63B has predicted nuclear export signals (NES) at Val274 and Leu277. [10] A nuclear localization signal (NLS), RKRK, is also predicted at residue 599. [9] In agreement, Reinhardt's method for cytoplasmic/nuclear discrimination predicts FAM63B to be located in the nucleus with a reliability of 76.7%. The presence of both NLS and NES signals, together with O-GlcNAc post-translational modification, is consistent with the protein's location in both the nucleus and the cytoplasm and with its described function as a shuttle for vaccinia virus between the nucleus and the cell periphery.
FAM63B is constitutively expressed at moderately high to high levels and is likely ubiquitously expressed in humans. [11]
Expression of FAM63B is high in embryonic stem cells and differentiated tissues but low or absent in embryoid bodies and other progenitor cells, such as multipotent mesenchymal stem cells. FAM63B is therefore likely expressed during pluripotency and unipotency but is not important for the differentiation under way in embryoid bodies, mesenchymal stem cells, and other progenitor cells.
The promoter of FAM63B is GXP_5885, located on the positive strand of chromosome 15 from position 58,770,692 to 58,771,462, a span of 771 base pairs. [12]
FAM63B has been shown to interact with one protein, KLC-1. [13] KLC-1 (kinesin light chain 1) is the cargo-binding light chain of kinesin-1 and recognizes a bipartite tryptophan binding motif in cargo proteins. [13] This motif is present in a vaccinia virus integral membrane protein, A36, that is required for transport of the virus from the perinuclear space to the cell periphery. [13] In the absence of A36, other proteins with a bipartite tryptophan binding motif can interact with the kinesin light chain, recruit kinesin-1, and promote virus transport to the cell periphery. [13]
The described function of the FAM63B protein is as a transporter of vaccinia virus in human cells. FAM63B contains a bipartite tryptophan binding motif between W476 and W533. [13] The motif also contains a Q residue at the +2 position, which was found to occur frequently in proteins that bind KLC-1 or KLC-2. [13] FAM63B is among the proteins studied that can rescue virus transport to the cell periphery when expressed in A36-deficient cells, successfully replacing the cytoplasmic domain of vaccinia A36. [13]
The specific pathology of FAM63B is unknown.
FAM63B is part of four networks regulated by miRNA, three of which are linked to neuronal differentiation and dopaminergic gene expression. [14] These findings suggest that FAM63B could be useful as a biomarker in the detection and treatment of schizophrenia. [14] Furthermore, aberrant methylation of FAM63B may play a role in the development of schizophrenia. [14] FAM63B has also been ranked 13th of 25 on a list of associated genes relevant to arthritis. [15]
FAM63B has one paralog, FAM63A, a gene of unknown function. The FAM63A gene encodes a protein that is 469 amino acids long and 76% similar to FAM63B. [16]
FAM63B has been found broadly across eukaryotes, including plants, but not in protists or fungi. The gene has also been found in archaea but not in bacteria. [17]
Genus & species | Common Name | Date of Divergence from Humans (MYA) | Accession Number | Sequence Length (amino acids) | Sequence Similarity to Human Protein (%) | Clade |
---|---|---|---|---|---|---|
Pan troglodytes | Chimpanzee | 6.2 | XP_510443.2 | 621 | 100 | Mammalia |
Microtus ochrogaster | Prairie vole | 90.1 | XP_005347720.1 | 597 | 84 | Mammalia |
Ursus maritimus | Polar bear | 95 | XP_008704293.1 | 591 | 97 | Mammalia |
Acinonyx jubatus | Cheetah | 95 | XP_014926357.1 | 488 | 96 | Mammalia |
Pelodiscus sinensis | Chinese softshell turtle | 320.5 | XP_014434471.1 | 408 | 95 | Reptilia |
Zonotrichia albicollis | White-throated sparrow | 320.5 | XP_014123064.1 | 326 | 91 | Aves |
Columba livia | Rock dove | 320.5 | XP_005511195.1 | 340 | 91 | Aves |
Chrysemys picta bellii | Western painted turtle | 320.5 | XP_008162326.1 | 566 | 86 | Reptilia |
Melopsittacus undulatus | Budgerigar | 320.5 | XP_005145999.1 | 472 | 86 | Aves |
Xenopus tropicalis | Western clawed frog | 354.4 | XP_002937714.1 | 354 | 87 | Amphibia |
Latimeria chalumnae | Western Indian Ocean coelacanth | 413.69 | XP_005998789.1 | 652 | 83 | Sarcopterygii |
Lepisosteus oculatus | Spotted gar | 436.8 | XP_015198676.1 | 642 | 82 | Actinopterygii |
Hydra vulgaris | Hydra | 902 | XP_012556960.1 | 507 | 65 | Hydrozoa |
Octopus bimaculoides | California two-spot octopus | 903 | XP_014779548.1 | 981 | 58 | Cephalopoda |
Crassostrea gigas | Pacific oyster | 903 | XP_011440367.1 | 569 | 71 | Bivalvia |
Haemonchus contortus | Barber's pole worm | 903 | CDJ97151.1 | 437 | 55 | Secernentea |
Trichoplax adhaerens | Trichoplax | 936 | XP_002108532.1 | 308 | 75 | Placozoa |
Solanum pennellii | Tomato | 1570.5 | XP_015085752.1 | 686 | 65 | Angiosperms |
Sesamum indicum | Sesame | 1570.5 | XP_011071984.1 | 738 | 65 | Angiosperms |
Thermoplasmatales archaeon | BRNA1 | 4250 | WP_048164282.1 | 1063 | 39 | Archaea |
The most distant homolog of FAM63B is found in Thermoplasmatales archaeon, an archaeon whose lineage diverged from that of humans an estimated 4.25 billion years ago. [17] [18]
FAM63B contains a domain of unknown function (DUF544), listed in Pfam, that is homologous within the protein family. [17] This region of the protein is highly conserved among FAM63B homologs, as are the bipartite tryptophan binding motif and the C-terminal signal sequence.
The phylogenetic tree below shows a time calibration for the evolution of FAM63B.