BEND2 (protein)

BEND2 is a protein that in humans is encoded by the BEND2 gene. [1] It is also found in other vertebrates, including mammals, birds, and reptiles. [1] Expression of BEND2 in Homo sapiens is tissue-specific: it occurs at high levels in the testis and in the bone marrow, and has also been reported in muscle. [2] [3] [4] The presence of BEN domains in the BEND2 protein indicates that it may be involved in chromatin modification and regulation. [5]

Gene

Common aliases

BEND2 stands for BEN domain containing 2 and is also known as CXorf20 (HGNC ID: 28509). [1] [6] [7]

Locus and size

The locus for BEND2 is on the minus strand of the X chromosome at Xp22.13. The gene is approximately 58 kilobases in length. [1]

mRNA

A screenshot of the NCBI Gene page for BEND2 showing alternative splicing of the BEND2 mRNA transcript. Both exons (dark rectangles) and introns are shown.

Alternative splicing

BEND2 contains 14 exons which undergo alternative splicing to create five transcript variants that vary from 4,720 base pairs (bp) to 2,144 bp in the mature mRNA. [1] [7] [3] The longest and most complete transcript of the gene, variant 1, encodes isoform 1 of the BEND2 protein (NP_699177.2). [1]

5' and 3'UTR

The untranslated regions (UTRs) flanking the coding sequence of BEND2 at the 5' and 3' ends of the mature mRNA contain binding sites for RNA-binding proteins, including RBMX, PUM2, and EIF4B, as well as microRNA binding sites. The 5'UTR also contains an upstream in-frame stop codon, and the 3'UTR contains a polyadenylation signal sequence.

Protein (Isoform 1)

Annotated image of the tertiary and secondary structure of BEND2 based on I-TASSER prediction.

Molecular weight and internal composition

The predicted molecular weight is 87.9 kDa. [10] [11]

The predicted isoelectric point (pI) is 5.07. [12]

The internal composition is enriched for serine residues. [10]
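
Predictions of this kind can be reproduced in outline from the primary sequence alone. The following is a minimal sketch, not the method used by the cited SAPS or AAStats tools: it sums standard average residue masses, adds one water molecule for the free termini, and reports the fraction of a chosen residue. The function names and the toy sequence are hypothetical, and the toy sequence is not the BEND2 sequence.

```python
# Average residue masses in daltons (amino acid minus one water molecule).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight_kda(sequence: str) -> float:
    """Return the average molecular weight of an unmodified peptide in kDa."""
    total = sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS
    return total / 1000.0


def residue_fraction(sequence: str, residue: str) -> float:
    """Return the fraction of positions in the sequence occupied by `residue`."""
    return sequence.upper().count(residue.upper()) / len(sequence)


if __name__ == "__main__":
    toy_sequence = "MSSPKRKRSSLESS"  # hypothetical peptide, for illustration only
    print(f"Molecular weight: {molecular_weight_kda(toy_sequence):.3f} kDa")
    print(f"Serine fraction: {residue_fraction(toy_sequence, 'S'):.2f}")
```

Applied to the full isoform 1 sequence (NP_699177.2), a calculation of this form would be expected to give a value close to the reported figure, although different tools use slightly different mass tables.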

Isoforms

Corresponding to the five alternative transcripts of BEND2, the protein encoded by this gene is found in two isoforms (1 and 2) as well as three predicted structures (X1, X2, and X3). These isoforms range from 813 to 645 amino acids in length. [1] Isoform 1 is 799 amino acids in length. [13]

Subcellular location

The presence of nuclear localization signals in the amino acid sequence (primary structure) of the BEND2 protein leads to a predicted subcellular localization in the nucleus. [14] A pat7 signal [P-X(1-3)-(3-4 K/R)] and a bipartite nuclear localization signal are both found near the N-terminus of the protein. [14] [15]
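
A pattern of this form can be searched for with a regular expression. The sketch below is a simplified reading of the notation above, not the PSORT implementation: it assumes the motif is a proline, then one to three residues of any kind, then a contiguous run of three or four lysine or arginine residues. The function name and the example peptide are hypothetical.

```python
import re

# Simplified pat7-like pattern (an assumption about the notation):
# P, then 1-3 arbitrary residues, then a run of 3-4 K/R residues.
PAT7_LIKE = re.compile(r"P.{1,3}[KR]{3,4}")


def find_pat7_like(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in PAT7_LIKE.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_sequence = "MAPAAKRKRSS"  # hypothetical peptide, not taken from BEND2
    for start, motif in find_pat7_like(toy_sequence):
        print(f"pat7-like motif {motif} at position {start + 1}")
```

A hit from such a scan is only a candidate signal; dedicated predictors weigh several signal types together before assigning a localization.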

Post-translational modifications of BEND2. Grey diamonds indicate glycation sites and red diamonds indicate SUMOylation sites. N-terminal acetylation and SUMO interaction sites are marked at the front of the protein. Two nuclear localization signals lie at the same site, near a domain of unknown function (Unk).

Structure

The secondary structure for BEND2 is unclear, in particular at the N-terminus, which is poorly conserved between orthologs. The C-terminus contains two BEN domains, which are predicted to form a series of alpha helices. [1] [5]

Post-translational modifications

Based on its primary structure, BEND2 is predicted to undergo N-terminal acetylation, glycation of several lysine residues, SUMOylation, a SUMO interaction at the N-terminus, S-palmitoylation, and extensive phosphorylation. [16]

Interacting Proteins

BEND2 has been found to interact with the following proteins in yeast two-hybrid screens or pull-down assays.

| Experiment type | Protein | Protein function | Associated diseases |
|---|---|---|---|
| Two-hybrid screen | Ataxin 1 (ATXN1) [17] | Chromatin-binding factor; RNA metabolism | Spinocerebellar ataxia 1/spinocerebellar degeneration |
| Two-hybrid screen | Splicing factor 3A subunit 2 (SF3A2) [18] | Activation of U2 snRNP; microtubule-binding protein | |
| Two-hybrid screen | LIM Homeobox 2 (LHX2) [18] | Transcriptional regulator for cell differentiation; sequence-specific DNA binding | Schizencephaly |
| Two-hybrid screen | Proline Rich 20D (PRR20D) [18] | Unknown function | |
| Pull-down assay | Amyloid precursor protein (APP) [19] | Cell surface receptor in neurons; cleaved to form transcriptional activators | Cerebral amyloid angiopathy; Alzheimer's disease |

BEN domains of two D. melanogaster Insensitive proteins with their DNA target site.

BEN Domains (protein feature)

BEND2 has two BEN domains at its C-terminus. [1] BEN domains are found in a diverse array of proteins and are predicted to be important for chromatin remodeling as well as for the recruitment of chromatin-modifying factors utilized during the process of transcriptional regulation of gene expression. [5] BEN domains are predicted to form four alpha helices that allow this domain to interact with its DNA target. [5] [20]

Dai et al. (2013) showed that the protein encoded by the Drosophila melanogaster Insensitive (Insv) gene contains a single BEN domain but no domains of known chemical function. They demonstrated the activity of the Insv protein in the transcriptional regulation of genes and obtained a crystal structure of two Insv BEN domains bound to their DNA target site. [20]

Expression

NCBI GEO Profile GDS3113 / 227904 showing tissue expression data for BEND2.

Tissue expression pattern

Expression of the BEND2 gene is tissue-specific rather than ubiquitous in the human body. High expression occurs in the testis and in the bone marrow. [21] The NCBI EST profile for this gene shows expression only in the testis and in muscle. [22]

Transcriptional regulation of expression

The promoter regulating expression of BEND2 (GXP_2567556) is 1,255 base pairs in length and is located directly upstream of the BEND2 gene. It regulates transcription of all five transcript variants of BEND2. [23] Genomatix's MatInspector program predicted 418 transcription factor binding sites within the BEND2 promoter, including sites for SRY, neurogenin, interferon regulatory factor-3 (IRF-3), Ikaros2, and TCF/LEF-1.

Homology

Paralogs

The BEND2 protein has no known paralogs within the human genome. [24]

BEN-domain containing gene family

The BEND2 gene belongs to a family of human genes known as "BEN domain containing". The family also includes BANP (BEND1), BEND3, BEND4, BEND5, BEND6, BEND7, NACC1 (BEND8), and NACC2 (BEND9). The loci for these genes are spread throughout the human genome. [25] Each of these genes contains between one and four BEN domains. Outside these motifs, the genes of the BEN family do not have similar sequences.

Orthologs

The BEND2 gene is conserved across evolutionary time: it has 114 known orthologs in a wide range of vertebrate species, including mammals, birds, crocodilians, and amphibians. [26] The BEND2 protein has 42 known orthologs. [27] The C-terminus of the protein, where its BEN domains are located, is highly conserved; the N-terminus, however, is not well conserved, even within the order Primates.

| Genus/species | Common name | Order | Date of divergence from H. sapiens (mya) | Accession number | Sequence length | Whole sequence identity | C-terminus identity |
|---|---|---|---|---|---|---|---|
| Homo sapiens | Human | Primates | 0 | NP_699177.2 | 799 | 1.000 | 1.000 |
| Pongo abelii | Orangutan | Primates | 15.76 | -- | 784 | 0.921 | 0.854 |
| Macaca nemestrina | Southern pig-tailed macaque | Primates | 29.44 | XP_011733709.1 | 823 | 0.694 | 0.828 |
| Vicugna pacos | Alpaca | Artiodactyla | 96 | XP_015106214.1 | 740 | 0.433 | 0.512 |
| Ceratotherium simum simum | White rhinoceros | Perissodactyla | 96 | XP_014646569.1 | 864 | 0.412 | 0.527 |
| Loxodonta africana | African bush elephant | Proboscidea | 105 | XP_010594135.1 | 829 | 0.382 | 0.489 |
| Canis lupus familiaris | Dog | Carnivora | 96 | XP_013967473.1 | 900 | 0.362 | 0.445 |
| Ailuropoda melanoleuca | Giant panda | Carnivora | 96 | XP_019665441.1 | 852 | 0.353 | 0.460 |
| Rhinolophus sinicus | Chinese horseshoe bat | Chiroptera | 96 | XP_019610944.1 | 808 | 0.345 | 0.459 |
| Dasypus novemcinctus | Nine-banded armadillo | Cingulata | 105 | XP_012377569.1 | 886 | 0.342 | 0.500 |
| Trichechus manatus latirostris | Manatee | Sirenia | 105 | XP_012412857.1 | 950 | 0.335 | 0.475 |
| Chrysochloris asiatica | Cape golden mole | Afrosoricida | 105 | XP_006835746.1 | 683 | 0.330 | 0.443 |
| Oryctolagus cuniculus | European rabbit | Lagomorpha | 90 | XP_017205124.1 | 811 | 0.305 | 0.438 |
| Monodelphis domestica | Gray short-tailed opossum | Didelphimorphia | 159 | XP_007500895.1 | 728 | 0.303 | 0.443 |
| Ornithorhynchus anatinus | Platypus | Monotremata | 177 | XP_007668655.1 | 715 | 0.302 | 0.429 |
| Gavialis gangeticus | Fish-eating crocodile | Crocodilia | 312 | XP_019380828.1 | 697 | 0.309 | 0.458 |
| Chelonia mydas | Green sea turtle | Testudines | 312 | XP_007070584.1 | 749 | 0.297 | 0.453 |
| Apteryx australis mantelli | North Island brown kiwi | Apterygiformes | 312 | XP_013807123.1 | 647 | 0.295 | 0.444 |
| Columba livia | Rock dove | Columbiformes | 312 | XP_005509980.1 | 668 | 0.287 | 0.442 |
| Pygoscelis adeliae | Adelie penguin | Sphenisciformes | 312 | XP_009323754.1 | 657 | 0.282 | 0.458 |
| Nanorana parkeri | Tibet frog | Anura | 352 | XP_018417228.1 | 586 | 0.260 | 0.376 |
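
The source does not state how the identity values in this table were calculated. As a rough illustration of what such a figure measures, the sketch below computes fractional identity from two sequences that have already been aligned, assuming that columns containing a gap in either sequence are left out of the denominator. Other conventions divide by the full alignment length or by the length of the shorter sequence, so the results need not match the table. The function name and the example alignment are hypothetical.

```python
def fraction_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return identical columns divided by gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    return matches / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical pre-aligned fragments, not real BEND2 sequences.
    human_fragment = "MKT-LV"
    ortholog_fragment = "MRTALV"
    print(f"{fraction_identity(human_fragment, ortholog_fragment):.3f}")
```

Restricting both aligned sequences to the columns that cover the BEN domains before calling the function would give a C-terminus figure under the same assumptions.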

Function

BEND2 is predicted to be a DNA-binding protein because of the BEN domains at its C-terminus. This hypothesis is supported by its predicted localization to the nucleus, the transcription factor binding sites predicted in its promoter region, and the nature of the proteins it interacts with. Although the precise function of the BEND2 protein is not yet well understood, BEN domains have been found to be important regulators of transcription. [20]

Clinical significance

The diseases that have been linked to BEND2 affect the central nervous system, although the gene is not highly expressed in these tissues.

References

  1. "BEND2 BEN domain containing 2 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-02-03.
  2. Thierry-Mieg, Danielle; Thierry-Mieg, Jean. "AceView: Gene:BEND2, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". AceView. www.ncbi.nlm.nih.gov. Retrieved 2017-02-03.
  3. "Genatlas sheet". genatlas.medecine.univ-paris5.fr. Retrieved 2017-02-03.
  4. "BEN domain containing 2 (BEND2)". www.ncbi.nlm.nih.gov. Retrieved 2017-02-03.
  5. "BEN domain (IPR018379) < InterPro < EMBL-EBI". www.ebi.ac.uk. Retrieved 2017-02-03.
  6. "BEND2 Symbol Report | HUGO Gene Nomenclature Committee". www.genenames.org. Retrieved 2017-02-03.
  7. "BEND2 Gene". www.genecards.org. Retrieved 2017-02-03.
  8. "BEND2 BEN domain containing 2 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-02-21.
  9. Roy A, Kucukural A, Zhang Y (April 2010). "I-TASSER: a unified platform for automated protein structure and function prediction". Nature Protocols. 5 (4): 725–38. doi:10.1038/nprot.2010.5. PMC 2849174. PMID 20360767.
  10. "Statistical analysis of protein sequence (SAPS)". SDSC Biology Workbench - Protein Tools. Retrieved 2017-04-04.
  11. "AAStats". SDSC Biology Workbench - Protein Tools. Retrieved 2017-04-04.
  12. "PI". SDSC Biology Workbench - Protein Tools. Retrieved 2017-04-04.
  13. "BEN domain-containing protein 2 isoform 1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-02-03.
  14. "PSORT: Protein Subcellular Localization Prediction Tool". GenScript. 1999-11-24.
  15. Boisvert M, Bouchard-Lévesque V, Fernandes S, Tijssen P (October 2014). "Classic nuclear localization signals and a novel nuclear localization motif are required for nuclear transport of porcine parvovirus capsid proteins". Journal of Virology. 88 (20): 11748–59. doi:10.1128/JVI.01717-14. PMC 4178750. PMID 25078698.
  16. "NetAcet, NetPhos, GPS, GPS-Lipids, GPS-SUMOylation, and NetGlycate". ExPASy.
  17. Lim J, Hao T, Shaw C, Patel AJ, Szabó G, Rual JF, et al. (May 2006). "A protein-protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration". Cell. 125 (4): 801–14. doi:10.1016/j.cell.2006.03.032. PMID 16713569. S2CID 13709685.
  18. "IntAct - search for BEND2". EMBL-EBI. Retrieved 2017-04-19.
  19. Oláh J, Vincze O, Virók D, Simon D, Bozsó Z, Tõkési N, et al. (September 2011). "Interactions of pathological hallmark proteins: tubulin polymerization promoting protein/p25, beta-amyloid, and alpha-synuclein". The Journal of Biological Chemistry. 286 (39): 34088–100. doi:10.1074/jbc.M111.243907. PMC 3190826. PMID 21832049.
  20. Dai Q, Ren A, Westholm JO, Serganov AA, Patel DJ, Lai EC (March 2013). "The BEN domain is a novel sequence-specific DNA-binding domain conserved in neural transcriptional repressors". Genes & Development. 27 (6): 602–14. doi:10.1101/gad.213314.113. PMC 3613608. PMID 23468431.
  21. "GDS3113 / 227904". www.ncbi.nlm.nih.gov. Retrieved 2017-04-25.
  22. National Center for Biotechnology Information. "EST Profile - Hs.403802". www.ncbi.nlm.nih.gov. Retrieved 2017-04-25.
  23. "ElDorado: Annotation and Analysis". Genomatix. 2017. Archived from the original on 2001-02-24. Retrieved 2017-04-29.
  24. "Human genome, search for BEND2 isoform 1 protein". BLAT.
  25. "BEN domain containing (BEND) Gene Family | HUGO Gene Nomenclature Committee". www.genenames.org. Retrieved 2017-02-21.
  26. "Homo sapiens BEN domain containing 2 (BEND2), transcript variant 1, mR - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-02-21.
  27. "Gene: BEND2 (ENSG00000177324) - Summary - Homo sapiens - Ensembl genome browser 87". useast.ensembl.org. Retrieved 2017-02-21.
  28. PDQ Pediatric Treatment Editorial Board (2002-01-01). "Childhood Cancer Genomics (PDQ®): Health Professional Version". PDQ Cancer Information Summaries. Bethesda (MD): National Cancer Institute (US). PMID 27466641.
  29. Sturm D, Orr BA, Toprak UH, Hovestadt V, Jones DT, Capper D, et al. (February 2016). "New Brain Tumor Entities Emerge from Molecular Classification of CNS-PNETs". Cell. 164 (5): 1060–1072. doi:10.1016/j.cell.2016.01.015. PMC 5139621. PMID 26919435.
  30. Fukumura K, Kawazu M, Kojima S, Ueno T, Sai E, Soda M, et al. (June 2016). "Genomic characterization of primary central nervous system lymphoma". Acta Neuropathologica. 131 (6): 865–75. doi:10.1007/s00401-016-1536-2. PMID 26757737. S2CID 928277.
  31. Bahi-Buisson N, Girard B, Gautier A, Nectoux J, Fichou Y, Saillour Y, et al. (January 2010). "Epileptic encephalopathy in a girl with an interstitial deletion of Xp22 comprising promoter and exon 1 of the CDKL5 gene". American Journal of Medical Genetics. Part B, Neuropsychiatric Genetics. 153B (1): 202–7. doi:10.1002/ajmg.b.30974. PMID 19455595. S2CID 8211913.
  32. Stobbe G, Liu Y, Wu R, Hudgings LH, Thompson O, Hisama FM (January 2014). "Diagnostic yield of array comparative genomic hybridization in adults with autism spectrum disorders". Genetics in Medicine. 16 (1): 70–7. doi:10.1038/gim.2013.78. PMID 23765050.