NALF2 |
---|---
Identifiers |
Aliases | NALF2, CXorf63, TED, TMEM28, bB57D9.1, family with sequence similarity 155 member B, FAM155B, NALCN channel auxiliary factor 2
External IDs | MGI: 3648377; HomoloGene: 83276; GeneCards: NALF2
Family with Sequence Similarity 155 Member B is a protein that in humans is encoded by the FAM155B gene. It belongs to a family of proteins whose function is not yet well understood. It is a transmembrane protein that is highly expressed in the heart, thyroid, and brain.
FAM155B is located on the X chromosome at position Xq13.1. [5] It lies on the positive strand, spanning nucleotides 69504326-69532508. Its genetic neighbors are the gene EDA (ectodysplasin A) downstream and a long intergenic non-protein coding RNA upstream. [6] The full transcript of this gene is 3528 bp long, with a coding sequence of 1418 bp. [7] This transcript includes 3 exons and 2 introns. [8]
Aliases of FAM155B are TMEM28, cXorf63, and TED. [6]
The FAM155B gene has 2 known isoforms.
The transcript of isoform 1 is 4685 bp long with a coding sequence of 1022 bp. It has a 5' UTR consisting of 79 bp and a 3' UTR with 3583 bp. [9]
The transcript of isoform 2 is much shorter at 975 bp, with a coding sequence of 878 bp. It also has a 5' UTR of 79 bp, but its 3' UTR is only 17 bp. [10]
The protein sequence of FAM155B is 472 amino acids long. [7] The molecular weight is predicted to be 52.5 kDa and the isoelectric point is estimated to be 8.2. [11] The most abundant amino acids are leucine (L) and proline (P), which account for 11.4% and 10% of the protein respectively. In terms of amino acid grouping, the most common groups are AGP, LVIFM, and KRED at 24.8%, 22.2%, and 21% respectively. This protein lacks charged segments or charge clusters. [12]
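The per-residue and per-group percentages described above come from simple counting over the protein sequence. The following is a minimal sketch of that calculation, not the tool used by the cited analysis. The sequence `TOY_SEQUENCE` is a made-up 20-residue string for illustration, not the FAM155B sequence, and the function names are hypothetical.

```python
from collections import Counter

# Residue groups named in the text (small/flexible, hydrophobic, charged).
GROUPS = {"AGP": "AGP", "LVIFM": "LVIFM", "KRED": "KRED"}

# Hypothetical toy sequence, NOT the real FAM155B protein.
TOY_SEQUENCE = "MLLPPAGKREDVIFLPGASH"


def residue_percentages(seq):
    """Return the percentage of the sequence made up by each residue."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100 * n / len(seq) for aa, n in counts.items()}


def group_percentages(seq, groups=GROUPS):
    """Return the percentage of the sequence falling in each residue group."""
    per_residue = residue_percentages(seq)
    return {
        name: sum(per_residue.get(aa, 0.0) for aa in members)
        for name, members in groups.items()
    }


if __name__ == "__main__":
    print(residue_percentages(TOY_SEQUENCE)["L"])
    print(group_percentages(TOY_SEQUENCE))
```

Applied to the real 472-residue sequence, the same counting would be expected to give figures comparable to those quoted, although published tools may apply their own rounding.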
There are 2 protein isoforms for FAM155B. Isoform 1 is 340 amino acids long [9] while isoform 2 is composed of 292 amino acids. [10]
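The transcript and protein lengths quoted for the two isoforms can be cross-checked with basic arithmetic: the 5' UTR, coding sequence, and 3' UTR should sum to the transcript length, and a coding sequence that includes the stop codon should be three times the protein length plus three. The sketch below uses only the figures given in the text. The class and field names are hypothetical, and counting the stop codon in the coding sequence is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    name: str
    total_bp: int    # full transcript length as quoted
    utr5_bp: int     # 5' UTR length
    cds_bp: int      # coding sequence length as quoted
    utr3_bp: int     # 3' UTR length
    protein_aa: int  # protein length in amino acids

    def parts_sum(self):
        """Sum of 5' UTR, coding sequence, and 3' UTR."""
        return self.utr5_bp + self.cds_bp + self.utr3_bp

    def expected_cds_bp(self):
        """CDS length implied by the protein length (assumes stop codon counted)."""
        return 3 * (self.protein_aa + 1)


ISOFORMS = [
    Transcript("isoform 1", 4685, 79, 1022, 3583, 340),
    Transcript("isoform 2", 975, 79, 878, 17, 292),
]

if __name__ == "__main__":
    for t in ISOFORMS:
        # Differences between quoted and derived values, in base pairs.
        print(t.name, t.total_bp - t.parts_sum(), t.expected_cds_bp() - t.cds_bp)
```

With the quoted figures, both checks come out 1 bp apart for each isoform. That pattern would be consistent with the coding-sequence lengths being reported under a different counting convention, but the text alone does not settle which value is intended.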
FAM155B has two transmembrane domains, which suggests that it is a multi-pass membrane protein. [6] It also has a cytoplasmic domain located between the two transmembrane domains and three compositionally biased regions: a cysteine-rich region at the beginning of the sequence, followed by a proline-rich region, and a histidine-rich region much later in the sequence. [13]
This protein is predicted to have 9 beta sheets, 16 alpha helices, and 2 transmembrane helices. [14]
In terms of membrane topology, the N- and C-termini appear to be located extracellularly while the protein sequence between the transmembrane domains appears to be located in the cytoplasmic region. [14]
The only regulatory element is a promoter region located about 2303 bp upstream of the transcriptional start site. One DNase hypersensitivity cluster and one CpG island are associated with the promoter region. Signal strength is low across cell lines in both the H3K4me3 and H3K27Ac tracks. The H3K4me1 track exhibits relatively higher signals in the NHEK and H1-hESC cell lines. [15] Multiple transcription factor binding sites are associated with the promoter, such as ERE, C2H2, and TFIIB. [16]
There are no known miRNA target sites in FAM155B. Many stem-loop structures are predicted from the 3' UTR of human FAM155B and close orthologs. [17] This indicates some conservation of secondary mRNA structures among different species.
Human FAM155B, along with its closely related orthologs, is most likely to be localized in the endoplasmic reticulum, with a prediction of 34.8%. The next most likely location is the plasma membrane, with a prediction of 21.7%. [18]
FAM155B has two proposed N-glycosylation sites at asparagine 120 and 193. [19] There do not appear to be any significant O-glycosylation sites. [20] There are two CKII phosphorylation sites at threonine 302 and 469, along with two PKC phosphorylation sites at positions 283 and 349. [21] A sumoylation analysis identified two motifs with low probability scores of 0.5 and 0.13. [22] Because these scores are low, FAM155B is unlikely to have any true sumoylation motifs.
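N-glycosylation site prediction generally starts from the consensus sequon N-X-S/T, where X is any residue except proline. The sketch below scans a sequence for that pattern and reports 1-based asparagine positions. It illustrates the sequon rule only and is not the method of the cited predictor, which may apply additional scoring. `TOY_SEQUENCE` is a made-up string rather than the FAM155B sequence, and `find_sequons` is a hypothetical name.

```python
import re

# N followed by any residue except P, then S or T.
# The lookahead keeps matches zero-width past the N, so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")

# Hypothetical toy sequence, NOT the real FAM155B protein.
TOY_SEQUENCE = "MANGTLLNPSAANKSQ"


def find_sequons(seq):
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_sequons(TOY_SEQUENCE))
```

In the toy sequence, the asparagine followed by proline is skipped, which is the reason for the `[^P]` exclusion. A sequon match is a necessary starting point rather than proof that a site is glycosylated.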
RNA sequencing of human tissues shows the highest expression in the heart, thyroid, and brain. Specifically, fetal brain tissue and the cerebellum show higher expression than the brain as a whole. In fetal tissue, high expression was observed in the heart at 11 weeks and in the stomach and intestine at 20 weeks. Expression in the kidney is also relatively high at 10 weeks. [6] This suggests that expression in tissues such as the stomach, intestine, and kidneys decreases as development continues.
FAM155B has many orthologs, all of them within Metazoa (animals). This indicates that FAM155B is not conserved in life forms such as plants, protists, fungi, bacteria, and archaea. Within Metazoa, very few orthologs are found outside the subphylum Vertebrata, so the majority of FAM155B orthologs occur in vertebrates. The most distant ortholog detected is in Actinia tenebrosa (Australian red waratah sea anemone). Closely related orthologs are found in primates, rodents, and African mammals, while moderately related orthologs are found in birds, reptiles, and amphibians. The distantly related orthologs are found in bony fish and the sea anemone.
There is one human paralog of this gene, FAM155A. The amino acid identity between the two paralogs is 46.29%. Like FAM155B, FAM155A is conserved only in animals (Metazoa). However, FAM155A is conserved in more invertebrates than FAM155B, which suggests that the ancestral gene may have duplicated in an invertebrate ancestor about 824 MYA.
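An amino acid identity figure such as the one quoted for the two paralogs is computed from a pairwise alignment by counting columns where both sequences carry the same residue. The sketch below shows one common convention, dividing matches by the number of alignment columns. The text does not state which denominator was used, so this choice is an assumption, and the aligned strings are made-up examples rather than FAM155A or FAM155B sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    # A column matches only if both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100 * matches / len(columns)


# Hypothetical toy alignment, NOT real paralog sequences.
TOY_A = "MKT-LLPA"
TOY_B = "MRTALL-A"

if __name__ == "__main__":
    print(percent_identity(TOY_A, TOY_B))
```

Other conventions divide by the length of the shorter sequence or exclude gapped columns, and they can give noticeably different percentages for the same alignment.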
This gene seems to have first appeared in Cnidaria about 824 MYA. It then appeared in bony fish about 435 MYA. It is first noted in tetrapods (specifically amphibians) around 351.8 MYA and in mammals about 159 MYA.
The function of FAM155B is poorly understood by the scientific community. However, it may be involved in immune function as it has been found to interact with elements of the immune system.
Predicted functional partners of FAM155B include C3, SH2B3, and C1R, all of which are associated with immune functions. C3 and C1R are involved in the complement system, while SH2B3 is a protein that links the T-cell receptor signal to phospholipase C-gamma-1, GRB2, and PI3K. FAM155B may also interact with AGBL4 and AGBL5, which are metallocarboxypeptidases that mediate protein deglutamylation. [23]
FAM155B is differentially expressed under many circumstances, which suggests that it may be associated with various diseases. Significantly decreased expression of FAM155B has been noted in MCF7 human breast cancer cell lines when the estrogen receptor is silenced. Another study observed that this gene was overexpressed in breast cancer cell lines under normal circumstances, which reinforced the association with breast cancer. [24] In addition, FAM155B was found to be one of the key candidate genes distinctive to the B-type Raf (BRAF) kinase mutation in papillary thyroid cancer. It was differentially expressed in the wild-type compared to the mutant form. [25]
Transmembrane protein 98 is a single-pass membrane protein that in humans is encoded by the TMEM98 gene. The function of this protein is currently unknown. TMEM98 is also known as UNQ536/PRO1079.
E3 ubiquitin-protein ligase RNF128 is an enzyme that in humans is encoded by the RNF128 gene.
Family with Sequence Similarity 78-Member B (FAM78B) is a protein of unknown function in humans that is encoded by the FAM78B gene (1q24.1). It has orthologous genes and predicted proteins in vertebrates and several invertebrates, but not in arthropods. It has a nuclear localization signal in the protein sequence and a miRNA target region in the mRNA sequence.
Transmembrane protein 251, also known as C14orf109 or UPF0694, is a protein that in humans is encoded by the TMEM251 gene. One notable feature of this protein is the presence of proline residues in one of its predicted transmembrane domains, which is a determinant of the intramitochondrial sorting of inner membrane proteins.
TMEM156 is a gene that encodes the transmembrane protein 156 (TMEM156) in Homo sapiens. It has the clone name of FLJ23235.
C9orf135 is a gene that encodes a 229 amino acid protein. It is located on Chromosome 9 of the Homo sapiens genome at 9q12.21. The protein has a transmembrane domain from amino acids 124-140 and a glycosylation site at amino acid 75. C9orf135 is part of the GRCh37 gene on Chromosome 9 and is contained within the domain of unknown function superfamily 4572. Also, c9orf135 is known by the name of LOC138255 which is a description of the gene location on Chromosome 9.1.
Fanconi Anemia Opposite Strand Transcript protein is a predicted protein that in humans is encoded by the FANCD2OS gene. The name is derived from mRNA transcribed from the strand complementary to the FANCD2 gene.
CRACD-like protein, previously known as KIAA1211L, is a protein that in humans is encoded by the CRACDL gene. It is highly expressed in the cerebral cortex of the brain. Furthermore, it is localized to the microtubules and the centrosomes and is subcellularly located in the nucleus. Finally, CRACDL is associated with certain mental disorders and various cancers.
WD repeat and coiled-coil containing protein (WDCP) is a protein which in humans is encoded by the WDCP gene. The function of the protein is not completely understood, but WDCP has been identified in a fusion protein with anaplastic lymphoma kinase found in colorectal cancer. WDCP has also been identified in the MRN complex, which processes double-strand breaks in DNA.
TMEM275 is a protein that in humans is encoded by the TMEM275 gene. TMEM275 has two highly conserved helical transmembrane regions. It is predicted to reside within the plasma membrane or the membrane of the endoplasmic reticulum.
SMIM19, also known as Small Integral Membrane Protein 19, encodes the SMIM19 protein. SMIM19 is a confirmed single-pass transmembrane protein passing from outside to inside, 5' to 3' respectively. SMIM19 shows ubiquitous medium-to-high expression across varied tissues and organs. The function of SMIM19 remains under review because its subcellular localization is uncertain. However, all proteins reported to interact with SMIM19 are associated with the endoplasmic reticulum (ER), which suggests that SMIM19 is also ER-associated.
FAM120AOS, or family with sequence similarity 120A opposite strand, codes for uncharacterized protein FAM120AOS, which currently has no known function. The gene ontology describes the gene to be protein binding. Overall, it appears that the thyroid and the placenta are the two tissues with the highest expression levels of FAM120AOS across a majority of datasets.
Transmembrane protein 101 (TMEM101) is a protein that in humans is encoded by the TMEM101 gene. The TMEM101 protein has been demonstrated to activate the NF-κB signaling pathway. High levels of expression of TMEM101 have been linked to breast cancer.
Family with Sequence Similarity 166, member C (FAM166C) is a protein encoded by the FAM166C gene. The protein FAM166C is localized in the nucleus. It has a calculated molecular weight of 23.29 kDa. It also contains DUF2475, a domain of unknown function spanning amino acids 19–85. The FAM166C protein is nominally expressed in the testis, stomach, and thyroid.
Major facilitator superfamily domain containing 6 like (MFSD6L) is a protein encoded by the MFSD6L gene in humans. The MFSD6L protein is a transmembrane protein that is part of the major facilitator superfamily (MFS) that uses chemiosmotic gradients to facilitate the transport of small solutes across cell membranes.
GPATCH2L is a protein that is encoded by the human GPATCH2L gene located at 14q24.3. In humans, the GPATCH2L mRNA (NM_017926) is 14,021 base pairs long and the gene spans 62,422 nt at chr14: 76,151,922 - 76,214,343. GPATCH2L is on the positive strand. IFT43 is the gene directly before GPATCH2L on the positive strand and LOC105370575 is the uncharacterized gene on the negative strand, which is approximately one and a half times the size of GPATCH2L. Known aliases for GPATCH2L include C14orf118, FLJ20689, FLJ10033, and KIAA1152. GPATCH2L produces 28 distinct introns, 17 different mRNAs, 14 alternatively spliced variants, and 3 unspliced forms. It has 5 probable alternative promoters, 7 validated polyadenylation sites, and 6 predicted promoters of varying lengths.
Transmembrane protein 212 is a protein that in humans is encoded by the TMEM212 gene. The protein consists of 5 transmembrane domains and localizes in the plasma membrane and endoplasmic reticulum. TMEM212 has orthologs in vertebrates but not invertebrates. TMEM212 has been associated with sporadic Parkinson's disease, facial processing, and adiposity in African Americans.
Transmembrane protein 248, also known as C7orf42, is a gene that in humans encodes the TMEM248 protein. This gene contains multiple transmembrane domains and is composed of seven exons. TMEM248 is predicted to be a component of the plasma membrane and to be involved in vesicular trafficking. It has low tissue specificity, meaning it is ubiquitously expressed in tissues throughout the human body. Orthology analyses determined that TMEM248 is highly conserved, having homologs in vertebrates and invertebrates. TMEM248 may play a role in cancer development. It was shown to be more highly expressed in cases of colon, breast, lung, ovarian, brain, and renal cancers.
Maestro heat-like repeat-containing protein family member 9 (MROH9) is a protein which in humans is encoded by the MROH9 gene. The word ‘maestro’ itself is an acronym, standing for male-specific transcription in the developing reproductive organs (MRO). MRO genes belong to the MROH family, which includes MROH9.
Leucine-rich repeat-containing protein 74A (LRRC74A) is a protein encoded by the LRRC74A gene. The protein LRRC74A is localized in the cytoplasm. It has a calculated molecular weight of approximately 55 kDa. The LRRC74A protein is nominally expressed in the testis, salivary gland, and pancreas.