FAM71E2, also known as Family With Sequence Similarity 71 Member E2, is a protein that, in humans, is encoded by the FAM71E2 gene. [1] Aliases include C19orf16, Protein FAM71E2, Chromosome 19 open reading frame 16, and Putative Protein FAM71E2. The gene is primarily conserved in mammals, but it is also conserved in two reptile species. [2]
FAM71E2 is located on the minus strand of chromosome 19 at 19q13.42 and extends from 55,354,908 bp to 55,363,252 bp. The gene is 8,353 bp long and has 11 exons. [3] [1]
These genes are closest to FAM71E2 on the human genome: [3]
Two alternatively spliced mRNA variants are produced during transcription: aAUG10 and bAUG10. Both have validated alternative polyadenylation sites. [8] However, there are no known isoforms of FAM71E2.
Conserved stem loop regions were found in both the 5' and 3' UTRs of closely related orthologs. [9] [10] No conserved stem loops were found in distantly related orthologs.
FAM71E2 is 922 amino acids long, with an isoelectric point (pI) of about 10 and a molecular weight (Mw) of about 100,000 Da. The protein has four different domains: DUF3699, PRK14951, PHA03247, and BASP1. [2] The structure consists of 8 alpha helices and 1 beta sheet. [11]
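As a rough plausibility check on the molecular weight, a common rule of thumb takes about 110 Da as the average mass of one amino acid residue. The sketch below applies that approximation to the 922-residue length given above. The constant and the function name `estimate_mass_kda` are assumptions introduced for illustration; this is not the method used by the cited source, and the true mass depends on the actual sequence.

```python
# Rule-of-thumb average residue mass; an assumption, not a measured value.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from the residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# 922 residues is the protein length quoted in the text.
estimate = estimate_mass_kda(922)
print(f"{estimate:.1f} kDa")  # 101.4 kDa
```

The estimate is on the order of 100 kDa, which agrees with a molecular weight of roughly 100,000 Da.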
This protein is localized in the nucleus. [12] Localization in the nucleus is conserved in all orthologs.
The promoter of FAM71E2 is located between 55,363,152 and 55,364,260 on the minus strand and is 1,109 bp long. [13] This promoter was selected because its expression is mainly in the testes and it has high CAGE values.
Multiple transcription factor binding sites were found for FAM71E2. Those highlighted here, such as SOX11 and estrogen response elements, were selected for their relevance to the gene's potential function.
FAM71E2 is primarily expressed in male tissues, particularly the testis. [14] [3] There is also lower expression in the brain, mammary gland, prostate, and thymus. [15] Expression of FAM71E2 has also been detected in both tumor and normal breast (mammary gland) tissue.
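The promoter length follows from its coordinates when both endpoints are counted. A minimal sketch of that arithmetic, where `inclusive_length` is a hypothetical helper introduced only for illustration:

```python
def inclusive_length(start: int, end: int) -> int:
    """Length in bp of a genomic interval that includes both endpoints."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Promoter coordinates quoted in the text (minus strand).
promoter_length = inclusive_length(55_363_152, 55_364_260)
print(promoter_length)  # 1109
```

Counting both endpoints is what distinguishes this from a plain subtraction, which would give 1,108.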
One set of expression data comes from a study analyzing metaphase II oocytes matured in vivo. The goal of this study was to identify genes and deduced pathways in the human oocyte that can help explain oogenesis, folliculogenesis, fertilization, and embryonic development. [16] The control consisted of RNA from 10 different normal human tissues: skeletal muscle, kidney, lung, colon, liver, spleen, breast, brain, heart, and stomach. The results indicate that expression of FAM71E2 in oocytes is very low compared to that in normal adult tissue from various parts of the body. The Human Protein Atlas supports these observations, since it reports no expression during the earliest phase of development (embryoid body). However, the Human Protein Atlas also shows very minimal expression in the fetus.
Another study indicates a very slight decrease in FAM71E2 expression in estrogen receptor knockdown samples. [17] This study may also support the Human Protein Atlas information stating that FAM71E2 has slight expression in breast (mammary gland) tumors.
A further study looked at mantle cell lymphoma cells depleted of the transcription factor SOX11. Interestingly, FAM71E2 expression is higher in the SOX11-depleted cells than in the control, even though there are SOX11 transcription factor binding sites for FAM71E2. It may be that these binding sites exist but are simply not used. Further research on this topic should be conducted.
Paralogs
FAM71E2 has many paralogs, most of them from the FAM71 family. The paralogs are sorted by similarity. The paralogs in the table were selected based on their E-value and relevance to the FAM71 family. E-value range: 0 to 3e-11. Similarity range: 100% to 51%.
Select Paralogs of FAM71E2 | ||
Protein | E-value | Similarity (%) |
FAM71C | 8.00E-22 | 56 |
FAM71D | 3.00E-21 | 52 |
HSD-51 | 5.00E-21 | 55 |
FAM71B | 5.00E-21 | 55 |
FAM71A | 2.00E-20 | 55 |
FAM71F1 | 1.00E-11 | 51 |
FAM71F2 | 3.00E-11 | 51 |
FAM71E1 | 2.00E-11 | 51 |
Orthologs
Select Orthologs of FAM71E2 | ||||||
Genus and species | Common name | Accession | Sequence length (aa) | Percent identity | Percent similarity | Date of divergence (MYA) |
Homo sapiens | Human | NP_001138874.1 | 922 | 100 | 100 | 0 |
Papio anubis | Olive baboon | XP_003916175.2 | 865 | 79.54 | 82 | 28.1 |
Galeopterus variegatus | Sunda flying lemur | XP_008589520.1 | 876 | 60.74 | 70 | 82 |
Ictidomys tridecemlineatus | Thirteen-lined ground squirrel | XP_021576662.1 | 1028 | 55.11 | 70 | 88 |
Vulpes vulpes | Red fox | XP_004777071.1 | 826 | 54.23 | 65 | 94 |
Pteropus vampyrus | Large flying fox | XP_023378500.1 | 937 | 52.92 | 64 | 94 |
Vicugna pacos | Alpaca | XP_006216382.1 | 865 | 53.7 | 64 | 94 |
Chrysemys picta bellii | Western painted turtle | XP_008174220.1 | 217 | 50 | 67 | 320 |
Pogona vitticeps | Central bearded dragon | XP_020634337.1 | 689 | 33.67 | 55 | 320 |
Several proteins are predicted to interact with FAM71E2. One protein interaction program predicted that NOTCH2NL, P60369, ALB, and MTUS2 interact with FAM71E2. [18] NOTCH2NL might have a role in the Notch signaling pathway as well as in regulating neutrophil differentiation. P60369 is a hair keratin-associated protein. ALB functions as a regulator of the colloidal osmotic pressure of blood and as a major zinc transporter. The main function of MTUS2 is to bind microtubules.
Another protein interaction program predicted that BOD1L2, FAM200A, CCT8L2, OR9G1, and AMPD3 interact with FAM71E2. [19] BOD1L2 may have a role in biorientation via mitotic spindles. CCT8L2 assists protein folding upon ATP hydrolysis. OR9G1 functions as an odorant receptor. AMPD3 functions in energy metabolism. FAM200A has no known function.
Based on expression data, there are several topics that can be explored to learn more about the exact function of FAM71E2.