NBEAL1 is a protein that in humans is encoded by the NBEAL1 gene. [1] It is found on chromosome 2q33.2 of Homo sapiens.
Neurobeachin-like protein 1 | |
---|---|
Identifiers | |
Symbol | NBEAL1 |
Alt. names | ALS2CR16, ALS2CR17 |
NCBI gene | 65065 |
HGNC | 20681 |
OMIM | 609816 |
UniProt | Q6ZS30 |
Other data | |
Locus | Chr. 2 q33.2 |
Based on its domains, NBEAL1 is predicted to be involved in the following cellular mechanisms: vesicle trafficking, membrane dynamics, receptor signaling, pre-mRNA processing, signal transduction and cytoskeleton assembly. [2] [3] [4] NBEAL1 is also known as Amyotrophic Lateral Sclerosis 2 Chromosomal Region, ALS2CR16 and ALS2CR17. [1]
The mRNA for this protein consists of 9058 base pairs in a linear sequence, with the coding sequence beginning at base pair 334 and extending to base pair 8418. [5] The transcript comprises 56 exons, and the translated protein has a final length of 2694 amino acids. [6] There are currently 9 known isoforms in humans. [3]
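The coordinates quoted above can be checked against the stated protein length with a little arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the final codon of the coding sequence is an untranslated stop codon; the function and variable names are illustrative only.

```python
# Coordinates quoted in the text (assumed 1-based, inclusive).
MRNA_LENGTH = 9058
CDS_START = 334
CDS_END = 8418


def cds_summary(start, end):
    """Return (nucleotides, codons, amino_acids) for an inclusive CDS range."""
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = nucleotides // 3
    # Assumption: the last codon is a stop codon and is not translated.
    return nucleotides, codons, codons - 1


nucleotides, codons, amino_acids = cds_summary(CDS_START, CDS_END)
five_prime_utr = CDS_START - 1          # bases before the coding sequence
three_prime_utr = MRNA_LENGTH - CDS_END  # bases after the coding sequence

print(nucleotides, codons, amino_acids)
print(five_prime_utr, three_prime_utr)
```

Under these assumptions, the coding sequence length agrees with the 2694 amino acids reported for the protein.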
Neurobeachin-like 1 contains five domains: DUF4704, DUF4800, PH_BEACH, BEACH, and WD40 repeats. [6]
DUF4704 is a domain of unknown function. Although its function is unknown, it is conserved within neurobeachin proteins in eukaryotes. [4] It spans amino acids 859 to 1115. [3]
DUF4800 is a domain of unknown function. It begins at amino acid 1580, spanning until 1833. [3] While it is uncharacterized in function, it is found within eukaryotes. [7]
Spanning amino acids 1886 to 1983, this domain is referred to as a Pleckstrin Homology (PH) domain associated with the BEACH domain. [8] It carries the PH label because its fold is similar to that of the PH domain, although its sequence differs from that of canonical PH domains. The PH_BEACH domain is not able to bind phospholipids. [9]
The Beige and Chediak-Higashi (BEACH) domain is one of the most significant domains within this protein. It is a highly conserved domain of roughly 280 amino acids, present in nine different human BEACH domain-containing proteins (BDCPs). [10] While the exact function of BDCPs is not well understood, they are known to serve many purposes within cellular mechanisms: vesicular transport, apoptosis, membrane dynamics and receptor signaling. [10] This protein family is currently of great clinical importance because mutations in this domain have been identified in multiple human disorders. For example, neurobeachin-like 1 is reported to be upregulated in glioma, although its expression decreases as the pathological grade of the glioma increases. [2] In NBEAL1, the BEACH domain follows the PH_BEACH domain, beginning at amino acid 2005 and ending at amino acid 2284. [3]
NBEAL1 has one WD40 domain, which spans amino acids 2409 to 2682. A WD40 repeat structural motif is annotated from amino acid 2406 to 2439. The WD40 domain is found in a number of eukaryotic proteins that have multiple functions. These include, but are not limited to, adaptor/regulatory modules in signal transduction, pre-mRNA processing, and cytoskeleton assembly. [3]
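The domain coordinates given in this section can be collected in one place to derive each domain's length and confirm that the domains appear in the stated order without overlapping. This is only an illustrative sketch; it assumes inclusive amino acid coordinates, and the names `DOMAINS`, `domain_lengths` and `is_ordered_and_within` are hypothetical helpers, not part of any annotation tool.

```python
PROTEIN_LENGTH = 2694  # amino acids, as stated in the text

# (name, first residue, last residue), coordinates as quoted in the text.
DOMAINS = [
    ("DUF4704", 859, 1115),
    ("DUF4800", 1580, 1833),
    ("PH_BEACH", 1886, 1983),
    ("BEACH", 2005, 2284),
    ("WD40", 2409, 2682),
]


def domain_lengths(domains):
    """Map each domain name to its length, treating coordinates as inclusive."""
    return {name: end - start + 1 for name, start, end in domains}


def is_ordered_and_within(domains, protein_length):
    """Check that domains are in N- to C-terminal order, do not overlap,
    and fit inside the protein."""
    previous_end = 0
    for _, start, end in domains:
        if start <= previous_end or start > end or end > protein_length:
            return False
        previous_end = end
    return True


lengths = domain_lengths(DOMAINS)
for name, length in lengths.items():
    print(f"{name}: {length} aa")
print("consistent layout:", is_ordered_and_within(DOMAINS, PROTEIN_LENGTH))
```

With these coordinates the BEACH domain comes out at 280 residues, matching the "roughly 280 amino acid" description of the domain.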
The secondary structure of NBEAL1 is predicted to be a combination of alpha helices, beta sheets and random coils. [13]
I-TASSER was used to predict a 3D structure of NBEAL1. [14] Because NBEAL1 is longer than the maximum sequence length allowed for input, it was split in half and each half was submitted separately to predict the structure of the whole protein.
The following document illustrates the different post-translational modifications.
Using the EST abundance profile in UniGene, NBEAL1 expression was examined by body site and by health state. [15] NBEAL1 is expressed in the brain, embryonic tissue, eye, intestine, kidney, liver, lung, mammary glands, ovaries, pancreas, pharynx, placenta, prostate, skin, stomach, testis, thyroid, and trachea. Expression is highest in the stomach at 62 transcripts per million, followed by the trachea and pancreas at 38 and 37 transcripts per million, respectively. Expression is lowest in the brain, eye, placenta and testis, all at 4 transcripts per million. By health state, NBEAL1 is highly expressed in multiple tumors. [15] Abundance was again highest in gastrointestinal tumors, consistent with the high expression of NBEAL1 in the stomach. However, NBEAL1 expression is not seen in pancreatic tumors, despite its relatively high expression in the pancreas, which may reflect something about its function there. Abundance also differs by developmental stage: it is highest in the fetal stage at 21 transcripts per million, compared with 14 transcripts per million in the adult.
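The expression values quoted above can be ranked and compared directly. The sketch below is a small illustration that uses only the figures stated in the text; tissues for which no value is given are omitted, and the names `TPM_BY_TISSUE` and `rank_by_tpm` are hypothetical.

```python
# Transcripts per million (TPM) for the tissues whose values are quoted in the text.
TPM_BY_TISSUE = {
    "stomach": 62,
    "trachea": 38,
    "pancreas": 37,
    "brain": 4,
    "eye": 4,
    "placenta": 4,
    "testis": 4,
}

# TPM by developmental stage, as quoted in the text.
TPM_BY_STAGE = {"fetus": 21, "adult": 14}


def rank_by_tpm(tpm):
    """Sort tissues from highest to lowest TPM, breaking ties alphabetically."""
    return sorted(tpm.items(), key=lambda item: (-item[1], item[0]))


ranked = rank_by_tpm(TPM_BY_TISSUE)
fold_stomach_vs_lowest = ranked[0][1] / ranked[-1][1]
fold_fetal_vs_adult = TPM_BY_STAGE["fetus"] / TPM_BY_STAGE["adult"]

for tissue, tpm in ranked:
    print(f"{tissue}: {tpm} TPM")
print(fold_stomach_vs_lowest, fold_fetal_vs_adult)
```

EST-based counts are coarse, so ratios derived this way are best read as rough indications rather than precise fold changes.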
The function of NBEAL1 is not yet well understood by the scientific community. However, given the function of the different domains and disease associations, it is predicted that the NBEAL1 protein may be involved in a variety of functions. As of now they include, but are not limited to, protein-protein interactions, vesicle trafficking, membrane dynamics, receptor signaling, apoptosis, adaptor/regulatory modules in signal transduction, pre-mRNA processing, and cytoskeleton assembly. [3] [2]
Diseases that have been associated with NBEAL1 include juvenile amyotrophic lateral sclerosis and adenocarcinoma, [1] although the function of the protein in these diseases has not yet been identified.
Neurobeachin-like 1 is a highly conserved protein. It has orthologs in many life forms, including but not limited to reptiles, birds, amphibians, mammals, fish, and a few invertebrates. The following table presents some of the orthologs found using searches in BLAST [16] and BLAT. [17]
Scientific Name | Common Name | Accession Number | Sequence Length | Percent Identity |
---|---|---|---|---|
Homo sapiens | Human | NP_001107604.1 | 2694 | - |
Pan troglodytes | Chimpanzee | XP_525997.3 | 2694 | 99 |
Gorilla gorilla gorilla | Western Lowland Gorilla | XP_018878299.1 | 2694 | 99 |
Mus musculus | Mouse | NP_77560 | 2688 | 98 |
Cercocebus atys | Sooty mangabey | XP_011903312 | 2678 | 97 |
Canis lupus familiaris | Dog | XP_545603.3 | 2693 | 93 |
Ailuropoda melanoleuca | Giant Panda | XP_019655126.1 | 2693 | 93 |
Trichechus manatus latirostris | West Indian Manatee | XP_004378299 | 2682 | 93 |
Tursiops truncatus | Common Bottlenose Dolphin | XP_019794654.1 | 2682 | 92.7 |
Eptesicus fuscus | Big Brown Bat | XP_008144758.1 | 2722 | 92 |
Zonotrichia albicollis | White Throated Sparrow | XP_014120514.1 | 2707 | 80 |
Gallus gallus | Chicken | XP_004942730.1 | 2725 | 78.7 |
Python bivittatus | Burmese Python | XP_007422078.1 | 2687 | 79 |
Xenopus tropicalis | Western Clawed Frog | XP_012826463.1 | 2687 | 74 |
Callorhinchus milii | Australian Ghostshark | XP_007888887.1 | 2749 | 70.2 |
Danio rerio | Zebrafish | XP_009300392 | 2723 | 66.3 |
Octopus bimaculoides | California two-spot octopus | XP_014777916.1 | 2584 | 41.2 |
Daphnia magna | Planktonic crustacean | KZS03729 | 2734 | 34.4 |
Drosophila busckii | Fruit fly | XP_017842328.1 | 2722 | 34 |
According to GeneCards, NBEAL1 has a few paralogs: NBEAL2, NBEA, LRBA, WDFY3, and lysosomal trafficking regulator (LYST). [19] The table below summarizes the paralogs of NBEAL1.
Gene Name | Species | Accession Number | Sequence Length | Percent Identity |
---|---|---|---|---|
NBEAL2 | Homo sapiens | NP_055990 | 2754 | 46 |
NBEA | Homo sapiens | NP_056483.3 | 2946 | 22.8 |
LRBA | Homo sapiens | NP_996717.2 | 2863 | 22.5 |
WDFY3 | Homo sapiens | XP_016863397.1 | 3544 | 21.8 |
LYST | Homo sapiens | NP_001288294.1 | 3801 | 19.3 |
WD repeat-containing protein 90 is a protein that, in humans, is encoded by the WDR90 gene (16p13.3). This human protein is 1750 amino acids, and has a molecular weight of 187.7 kDa. It contains multiple WD40 repeat domains and one domain of unknown function. This protein is conserved all the way back to invertebrates. Proteins containing WD transducin repeating domains have been found to play a role in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control, autophagy and apoptosis.
Ankyrin repeat domain-containing protein 24 is a protein in humans that is coded for by the ANKRD24 gene. The gene is also known as KIAA1981. The protein's function in humans is currently unknown. ANKRD24 is in the protein family that contains ankyrin-repeat domains.
Zinc finger protein 684 is a protein that in humans is encoded by the ZNF684 gene.
The coiled-coil domain containing 142 (CCDC142) is a gene which in humans encodes the CCDC142 protein. The CCDC142 gene is located on chromosome 2, spans 4339 base pairs and contains 9 exons. The gene codes for the coiled-coil domain containing protein 142 (CCDC142), whose function is not yet well understood. There are two known isoforms of CCDC142. CCDC142 proteins produced from these transcripts range in size from 665 to 743 amino acids and contain signals suggesting protein movement between the cytosol and nucleus. Homologous CCDC142 genes are found in many animals, including vertebrates and invertebrates, but not in fungi, plants, protists, archaea, or bacteria. Although the function of this protein is not well understood, it contains a coiled-coil domain and a RINT1_TIP1 motif located within the coiled-coil domain.
PRR29 is a protein encoded by the PRR29 gene located in humans on chromosome 17 at 17q23.
OCC-1 is a protein which in humans is encoded by the gene C12orf75. The gene is approximately 40,882 bp long and encodes 63 amino acids. OCC-1 is ubiquitously expressed throughout the human body. OCC-1 has been shown to be overexpressed in various colon carcinomas. A novel splice variant of this gene has also been detected in various human cancer types; in addition to encoding a novel smaller protein, the OCC-1 gene produces a non-protein-coding lncRNA splice variant.
BEND2 is a protein that in humans is encoded by the BEND2 gene. It is also found in other vertebrates, including mammals, birds, and reptiles. The expression of BEND2 in Homo sapiens is regulated and occurs at high levels in the skeletal muscle tissue of the male testis and in the bone marrow. The presence of the BEN domains in the BEND2 protein indicates that this protein may be involved in chromatin modification and regulation.
Chromosome 9 open reading frame 25 (C9orf25) is a gene that encodes the FAM219A protein. The terms FAM219A and C9orf25 are aliases and can be used interchangeably. The function of this gene is not yet completely understood.
C1orf122 is a gene in the human genome that encodes the cytosolic protein ALAESM. ALAESM is present in all tissue cells and highly up-regulated in the brain, spinal cord, adrenal gland and kidney. In the tissues where it is highly expressed, this gene can be expressed at up to 2.5 times the level of the average gene. Although the function of C1orf122 is unknown, it is predicted to be used for mitochondria localization.
C7orf50 is a gene in humans that encodes a protein known as C7orf50. This gene is ubiquitously expressed in the kidneys, brain, fat, prostate, spleen, among 22 other tissues and demonstrates low tissue specificity. C7orf50 is conserved in chimpanzees, Rhesus monkeys, dogs, cows, mice, rats, and chickens, along with 307 other organisms from mammals to fungi. This protein is predicted to be involved with the import of ribosomal proteins into the nucleus to be assembled into ribosomal subunits as a part of rRNA processing. Additionally, this gene is predicted to be a microRNA (miRNA) protein coding host gene, meaning that it may contain miRNA genes in its introns and/or exons.
Protein FAM89A is a protein which in humans is encoded by the FAM89A gene. It is also known as chromosome 1 open reading frame 153 (C1orf153). Highest FAM89A gene expression is observed in the placenta and adipose tissue. Though its function is largely unknown, FAM89A is found to be differentially expressed in response to interleukin exposure, and it is implicated in immune response pathways and various pathologies such as atherosclerosis and glioma cell expression.
FAM214B, also known as protein family with sequence similarity 214, B, is a protein that, in humans, is encoded by the FAM214B gene located on human chromosome 9. The protein has 538 amino acids. The gene contains 9 exons. Studies have reported low expression of this gene in patients with major depressive disorder. In most organisms, such as mammals, amphibians, reptiles, and birds, there are high levels of gene expression in the bone marrow and blood. In humans during fetal development, FAM214B is mostly expressed in the brain and bone marrow.
C2orf72 is a gene in humans that encodes a protein currently named after its gene, C2orf72. It is also designated LOC257407 and can be found under GenBank accession code NM_001144994.2. The protein can be found under UniProt accession code A6NCS6.
Family with Sequence Similarity 166, member C (FAM166C), is a protein encoded by the FAM166C gene. The protein FAM166C is localized in the nucleus. It has a calculated molecular weight of 23.29 kDa. It also contains DUF2475, a domain of unknown function spanning amino acids 19 to 85. The FAM166C protein is nominally expressed in the testis, stomach, and thyroid.
C2orf80 is a protein that in humans is encoded by the c2orf80 gene. The gene c2orf80 also goes by the alias GONDA1. In humans, c2orf80 is exclusively expressed in the brain. While relatively little is known about the function of c2orf80, medical studies have shown a strong association between variations in c2orf80 and IDH-mutant gliomas, 46,XY gonadal dysgenesis, and a possible association with blood pressure.
Chromosome 5 open reading frame 22 (c5orf22) is a protein-coding gene of poorly characterized function in Homo sapiens. The primary alias is unknown protein family 0489 (UPF0489).
bMERB domain containing 1 is a gene expressed in humans which has broad expression across the brain. This gene codes for bMERB1 domain-containing protein 1 isoform 1. It is predicted that this gene is involved in actin cytoskeleton regulation, microtubule regulation and glial cell migration.
THAP domain-containing protein 3 (THAP3) is a protein that, in Homo sapiens (humans), is encoded by the THAP3 gene. The THAP3 protein is also known as MGC33488, LOC90326, and THAP domain-containing, apoptosis associated protein 3. This protein contains the Thanatos-associated protein (THAP) domain and a host-cell factor 1C binding motif. These domains allow THAP3 to influence a variety of processes, including transcription and neuronal development. THAP3 is ubiquitously expressed in H. sapiens, though expression is highest in the kidneys.
Secernin-3 (SCRN3) is a protein that is encoded by the human SCRN3 gene. SCRN3 belongs to the peptidase C69 family and the secernin subfamily. As a part of this family, the protein is predicted to enable cysteine-type exopeptidase activity and dipeptidase activity, as well as be involved in proteolysis. It is ubiquitously expressed in the brain, thyroid, and 25 other tissues. Additionally, SCRN3 is conserved in a variety of species, including mammals, birds, fish, amphibians, and invertebrates. SCRN3 is predicted to be an integral component of the cytoplasm.