FAM76A is a protein that in Homo sapiens is encoded by the FAM76A gene. [5] Notable structural characteristics of FAM76A include an 83 amino acid coiled coil domain as well as a four amino acid poly-serine compositional bias. [6] FAM76A is conserved in most chordates but is not found in other deuterostome phyla such as Echinodermata, Hemichordata, or Xenacoelomorpha, suggesting that FAM76A arose after chordates diverged from the other deuterostomes. Furthermore, FAM76A is not found in fungi, plants, archaea, or bacteria. [7] FAM76A is predicted to localize to the nucleus and may play a role in regulating transcription. [8]
FAM76A is located on the (+) strand of the short arm of chromosome 1 (1p35.3), with the genomic sequence starting at 27725979 and ending at 27762915. The coding region is made up of 3462 base pairs and is translated into 341 amino acids. [5] [10]
Genes that flank FAM76A on the telomeric side include IFI6, CHMP1AP1, and RPEP3, while genes that flank FAM76A on the centromeric side include STX12, PPP1R8, and LOC105376894. [5]
In Caenorhabditis elegans, FAM76A is referred to as K04F10.7. [11] Apart from this, FAM76A does not have any significant alternative names.
In Homo sapiens, the FAM76A gene produces 9 different mRNAs, 7 of which are alternatively spliced and 2 of which are unspliced. Of the alternatively spliced mRNAs, isoform 1 is the longest variant of the gene and is the subject of this article. [5]
The molecular weight of FAM76A is 38.4 kDa, making it possible for this protein to diffuse through nuclear pores. [12] The isoelectric point is 9.28. FAM76A does not have any significant positive, negative, or mixed charge clusters. In addition, FAM76A does not have any predicted hydrophobic or transmembrane segments, suggesting that this protein is not found within the cell membrane. [13]
The amino acid composition of the FAM76A protein has frequencies within 1.5% of those of a normal human protein for all residues except cysteine, valine, and lysine. Cysteine and lysine occur at higher frequencies than in a normal Homo sapiens protein, while valine occurs at a lower frequency. The same frequency differences are seen in FAM76A orthologs such as Gallus gallus (H. sapiens sequence identity 84%), Serinus canaria (H. sapiens sequence identity 77%), and Crassostrea gigas (H. sapiens sequence identity 57%).
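The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool used for the cited analysis: the function names, the toy sequence, and the toy reference frequencies are all hypothetical, and the 1.5% cutoff is assumed to mean 1.5 percentage points.

```python
from collections import Counter


def composition_percent(sequence):
    """Return the percentage of each residue in a protein sequence."""
    sequence = sequence.upper()
    total = len(sequence)
    return {aa: 100.0 * n / total for aa, n in Counter(sequence).items()}


def flag_unusual(sequence, reference_percent, threshold=1.5):
    """Return residues whose frequency differs from the reference by more
    than `threshold` percentage points (positive = enriched)."""
    observed = composition_percent(sequence)
    flagged = {}
    for aa, ref in reference_percent.items():
        diff = observed.get(aa, 0.0) - ref
        if abs(diff) > threshold:
            flagged[aa] = round(diff, 2)
    return flagged


# Toy data only: NOT the FAM76A sequence and NOT real human averages.
toy_sequence = "CCKKKAAAAL"
toy_reference = {"C": 10.0, "K": 10.0, "A": 40.0, "L": 20.0, "V": 10.0}
print(flag_unusual(toy_sequence, toy_reference))
```

With a real protein sequence and a real table of average human residue frequencies, residues such as cysteine, lysine, and valine would be reported only if they fall outside the chosen threshold.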
An NCBI conserved domain search identified an uncharacterized conserved protein (YqiK) that contains the Band7/PHB/SPFH domain, whose function is unknown and which is conserved in species ranging from humans to bacteria. [10] In Homo sapiens, the Band7/PHB/SPFH domain spans amino acids 252-326. The molecular weight of this domain is 8.9 kDa, and it has an isoelectric point of 9.23. The amino acid composition of the Band7/PHB/SPFH domain does not differ notably from that of a normal Homo sapiens protein. [13] This domain has yet to be assigned to any domain superfamily.
FAM76A is predicted to contain only alpha helices. In total, 17 alpha helices are predicted, the longest of which contains the Band7/PHB/SPFH domain. [14] Of these, only 8 alpha helices are located within conserved regions of FAM76A (see conceptual translation).
FAM76A contains a coiled-coil domain, which is located within the Band7/PHB/SPFH domain. No significant ligand-binding sites or active sites were predicted from I-TASSER. [15] There is no evidence to suggest that FAM76A interacts with other proteins to form a quaternary structure.
The protein subcellular localization prediction tool, PSORT II, predicts FAM76A to be located within the nucleus. This prediction is observed in orthologs such as Gallus gallus and Callorhinchus milii. [16] Further evidence for FAM76A localizing to the nucleus is provided by the presence of a nuclear localization signal. [8]
According to NCBI GEO Profiles, FAM76A is expressed in Homo sapiens parathyroid, lymph node, esophagus, and bone marrow tissue. Developmental stages where FAM76A expression is detected include the embryoid body, fetus, and adult. [17]
Allen Human Brain Atlas predictions for FAM76A expression are summarized below. FAM76A appears to have higher expression within the cerebral cortex and lower expression in parts of the reptilian brain such as the pontine tegmentum (see expression table for further details). [18]
Brain Area | Function | FAM76A Expression Level |
---|---|---|
Frontal lobe | Planning, organizing, problem solving, selective attention, personality, and higher cognitive functions | High |
Occipital lobe | Visual processing | High |
Temporal lobe | Auditory processing | High |
Parietal lobe | Sensation, perception, and integration of sensory input | High |
Cerebellum | Coordination of voluntary movements | High |
Hippocampal formation | Memory/spatial coding | Low |
Pontine tegmentum | Sensory and motor function, sleep stage control, and arousal | Low |
Myelencephalon cuneate nucleus | Receive fine touch and proprioceptive information from upper body | Low |
Select data from three experiments involving FAM76A are shown below. In one experiment, CLDN1 over-expression in lung adenocarcinoma cells decreased FAM76A expression. [19] In another experiment, androgen-insensitive prostate cancer cells were shown to have reduced expression of FAM76A compared to androgen-sensitive cells. [20] A third experiment demonstrated that metaphase II oocytes have higher expression of FAM76A than control cells. [21]
FAM76A is predicted to undergo a variety of post-translational modifications. Post-translational modifications found within conserved regions include 7 phosphorylation sites, 2 sumoylation sites, and 1 nuclear localization signal. [22] These modifications are consistent with FAM76A being localized to the nucleus. Refer to conceptual translation for a visual representation of the aforementioned modifications.
Genomatix's ElDorado program predicts a promoter for FAM76A that is named GXP_71042 and is 679 base pairs long. It is located on chromosome 1, starting at 27725479 and ending at 27726157. GXP_71042 overlaps with the start of the coding sequence of FAM76A. [23] Several transcription factors are predicted to bind to this promoter. Many of them have functions related to blood cells, the immune system, and leukocytes, perhaps suggesting that FAM76A is involved in immune function. The most common matrix families include C2H2 zinc fingers and myeloid zinc fingers, suggesting that these matrix families may be heavily involved in FAM76A transcription.
Common RNA binding proteins within the 3’ UTR of FAM76A include PABPC1, ELAVL1, and PUM2, with predicted binding frequencies of 32, 18, and 16 times, respectively. [24]
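The promoter coordinates can be checked against the gene coordinates with simple interval arithmetic. The sketch below is illustrative only: it assumes that both sets of coordinates are 1-based, inclusive, and taken from the same genome assembly, and the helper names are hypothetical.

```python
def span_length(start, end):
    """Length in base pairs of a 1-based, inclusive interval."""
    return end - start + 1


def overlap_length(a_start, a_end, b_start, b_end):
    """Number of base pairs shared by two inclusive intervals (0 if none)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


# Coordinates as quoted in this article (chromosome 1, plus strand).
GENE = (27725979, 27762915)       # FAM76A genomic sequence
PROMOTER = (27725479, 27726157)   # GXP_71042

promoter_bp = span_length(*PROMOTER)        # length of the promoter
gene_bp = span_length(*GENE)                # length of the genomic sequence
upstream_bp = GENE[0] - PROMOTER[0]         # promoter bases before the gene start
shared_bp = overlap_length(*GENE, *PROMOTER)

print(promoter_bp, gene_bp, upstream_bp, shared_bp)
```

Under these assumptions the promoter length agrees with the stated 679 base pairs, with 500 base pairs lying upstream of the gene start and 179 base pairs overlapping the start of the genomic sequence.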
FAM76A was found to have a physical interaction with ELAVL1. The interaction was detected by immunoprecipitation by Abdelmohsen et al., 2009. [25] ELAVL1 is involved in regulating gene expression.
FAM76B is a paralog of FAM76A. It is estimated that FAM76A and FAM76B diverged from each other around 17.5 MYA. [5] Structural similarities that are conserved between FAM76A/B include a coiled coil domain as well as a poly serine compositional bias. [10] FAM76A and FAM76B both exhibit high expression in tissues such as lymph node, whole blood, testis, ovary, brain, kidney, liver, and lung. [26] FAM76B has about 62% sequence identity with FAM76A. [10]
Genus and Species | Common Name | Date of Divergence (MYA) | Amino Acid Sequence Identity (%) |
---|---|---|---|
Homo sapiens | humans | 0 | 100 |
Macaca fascicularis | crab-eating macaque | 29.1 | 95 |
Tarsius syrichta | Philippine tarsier | 67.6 | 85 |
Dipodomys ordii | Ord's kangaroo rat | 90.9 | 85 |
Nannospalax galili | blind mole rat | 90.9 | 88 |
Gallus gallus | red junglefowl | 320.5 | 84 |
Nipponia nippon | crested ibis | 320.5 | 83 |
Egretta garzetta | little egret | 320.5 | 75 |
Anolis carolinensis | Carolina anole | 320.5 | 73 |
Oryzias latipes | Japanese rice fish | 429.6 | 59 |
Callorhinchus milii | Australian ghostshark | 482.9 | 64 |
Crassostrea gigas | pacific oyster | 847 | 57 |
Cryptosporidium parvum Iowa II | N/A | 1724.7 | 27 |
Cryptosporidium hominis | N/A | 1724.7 | 26 |
Shown here is a table of a select number of orthologs for Homo sapiens FAM76A. The table includes closely, intermediately, and distantly related orthologs. Mammals show greater similarity, while aquatic vertebrates such as Actinopterygii and Chondrichthyes show lesser similarity. Orthologs of Homo sapiens protein FAM76A are listed above in order of increasing date of divergence and then by sequence identity.
FAM76A appears to have a moderate rate of mutation when compared to fibrinogen (fast mutating) and cytochrome c (slow mutating). [7] [27] This suggests that FAM76A has been at least somewhat resistant to mutation during the course of evolution.
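One common way to make such rate comparisons is to convert sequence identity into a corrected divergence and relate it to the date of divergence. The sketch below assumes the correction m = -100 ln(identity fraction), which is not stated in the source and may differ from the method behind the cited result. It uses only a few rows from the ortholog table above, and the function name is hypothetical. No values for fibrinogen or cytochrome c are included because the article does not give them.

```python
import math


def corrected_divergence(identity_percent):
    """Corrected divergence (changes per 100 residues), assuming
    m = -100 * ln(identity fraction). This formula is an assumption."""
    return -100.0 * math.log(identity_percent / 100.0)


# (species, date of divergence in MYA, % identity) from the ortholog table.
ORTHOLOGS = [
    ("Macaca fascicularis", 29.1, 95),
    ("Gallus gallus", 320.5, 84),
    ("Callorhinchus milii", 482.9, 64),
    ("Crassostrea gigas", 847.0, 57),
]

for species, mya, identity in ORTHOLOGS:
    m = corrected_divergence(identity)
    # Divergence per million years gives a rough, comparable rate.
    print(f"{species}: m = {m:.1f}, rate = {m / mya:.3f} per MY")
```

Plotting m against the date of divergence for FAM76A alongside a fast-evolving and a slow-evolving protein would give the kind of comparison described in the text.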
FAM76A expression is highest in adrenal tumors, esophageal tumors, and soft tissue/muscle tissue tumors. [5] [28] Copy number gain or loss of FAM76A, along with neighboring genes, has been shown to produce detrimental phenotypes. In one case report, a patient with a copy number gain from 1p36.11-34.2 was shown to have developmental delays. [29] Another patient, who had a copy number gain from 1p36.1-35, showed similar delays. [30] In another case report, a patient with a copy number loss of 1p35.3, the exact location of FAM76A, developed macrocephaly. [30]
The multiple sequence alignment (MSA), shown below and generated with Biology Workbench CLUSTALW, arranges orthologs by the first letter of genus and then the first two letters of species. [13] There are 3 domains that are highly conserved across orthologs. Two of these domains have an unknown function, while the third domain is a coiled-coil domain. Conservation of these regions was traced back to Cryptosporidium parvum Iowa II, which diverged from Homo sapiens 1724.7 MYA. Conserved region 1 contains mostly polar amino acids; conserved region 2 contains both polar and non-polar amino acids; and the coiled-coil domain contains mostly polar amino acids.