Sal-like protein 4 (SALL4) is a transcription factor encoded by a member of the Spalt-like (SALL) gene family, SALL4. [5] [6] The SALL genes were identified based on their sequence homology to Spalt, a homeotic gene originally cloned in Drosophila melanogaster that is important for terminal trunk structure formation in embryogenesis and imaginal disc development in the larval stages. [7] [8] There are four human SALL proteins (SALL1, 2, 3, and 4), which share structural homology and play diverse roles in embryonic development, kidney function, and cancer. [9] The SALL4 gene encodes at least three isoforms, termed A, B, and C, through alternative splicing, with the A and B forms being the most studied. SALL4 can alter gene expression through its interaction with many co-factors and epigenetic complexes. [10] It is also known as a key embryonic stem cell (ESC) factor.
SALL4 contains one zinc finger in its amino (N-) terminus and three clusters of zinc fingers that each coordinate zinc with two cysteines and two histidines (Cys2His2-type) and that potentially confer nucleic acid binding activity. SALL4B lacks two of the zinc finger clusters found in the A isoform. It remains unclear which zinc finger cluster is responsible for SALL4’s DNA-binding property.
Different SALL family members can form hetero- or homodimers via their conserved glutamine (Q)-rich region. [11] SALL4 has at least one canonical nuclear localization signal (NLS) with the K-K/R-X-K/R motif in the N-terminal portion of the protein, shared by both the A and B isoforms (residues 64–67). [12] One report has suggested that with a mutated NLS sequence, SALL4 cannot localize to the nucleus. [12] Through a 12-amino acid sequence in its N-terminus (N-12a.a.), SALL4 binds to retinoblastoma binding protein 4 (RBBP4), a subunit of the nucleosome remodeling and histone deacetylation (NuRD) complex, which also contains chromodomain-helicase-DNA binding proteins (CHD3/4 or Mi-2α/β), metastasis-associated proteins (MTA), methyl-CpG-binding domain proteins (MBD2 or MBD3), and histone deacetylases (HDAC1 and HDAC2). [13] [14] [15] [16] This association allows SALL4 to act as a transcriptional repressor. Accordingly, SALL4 has been shown to localize to heterochromatin regions in cells, for which its last zinc finger cluster (shared between SALL4A and B) is necessary. [17] Besides the NuRD complex, SALL4 is reportedly able to bind to other epigenetic modifiers such as histone lysine-specific demethylase 1 (LSD1), which is frequently associated with the NuRD complex and, consequently, with gene repression. [18] In addition, SALL4 can also activate gene expression via the recruitment of the mixed lineage leukemia (MLL) protein, which is a homolog of Drosophila Trithorax and yeast Set1 proteins and has histone H3 lysine 4 (H3K4) trimethylation activity. [19] This interaction is best characterized in the co-regulation of the HOXA9 gene by SALL4 and MLL in leukemic cells. [19]
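As an illustration of the K-K/R-X-K/R motif notation, the consensus can be written as a regular expression and used to scan a protein sequence. The following is a minimal sketch only: the function name `find_nls_motifs` and the toy peptide are hypothetical, and the peptide is not the SALL4 sequence. A consensus match by itself does not show that a motif functions as an NLS.

```python
import re

# K-K/R-X-K/R consensus: lysine, then lysine or arginine, then any residue,
# then lysine or arginine. The lookahead lets overlapping matches be reported.
NLS_PATTERN = re.compile(r"(?=(K[KR].[KR]))")


def find_nls_motifs(protein_seq):
    """Return (start, end, motif) tuples using 1-based, inclusive residue positions."""
    hits = []
    for match in NLS_PATTERN.finditer(protein_seq.upper()):
        start = match.start(1) + 1  # convert 0-based index to 1-based residue number
        hits.append((start, start + 3, match.group(1)))
    return hits


# Hypothetical toy peptide for demonstration; NOT the real SALL4 sequence.
toy_peptide = "MSAGKRPKLEDA"
print(find_nls_motifs(toy_peptide))
```

Running the same function on a real SALL4 protein sequence retrieved from a curated database would list every candidate position matching the consensus; which of those sites is functional still has to be established experimentally, for example by the NLS mutation approach cited above.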
In mouse ESCs, Sall4 was found to bind the essential stem cell factor octamer-binding transcription factor 4 (Oct4) in two separate unbiased mass spectrometry (mass-spec) screens. [20] [21] Sall4 can also bind other important pluripotency proteins such as Nanog and sex determining region Y (SRY)-box 2 protein (Sox2). [22] [23] Together these proteins can affect each other’s expression patterns as well as their own, thus forming a mouse ESC (mESC)-specific transcriptional regulatory circuit. [24] SALL4 has also been reported to bind T-box 5 protein (Tbx5) in cardiac tissues as well as to interact genetically with Tbx5 in mouse limb development. [25] Other binding partners of SALL4 include promyelocytic leukemia zinc finger protein (PLZF) in sperm precursor cells, [26] Rad50 during DNA damage repair, [27] and β-catenin downstream of the Wnt signaling pathway. [28] Since most of these interactions were identified by mass-spec or co-immunoprecipitation, whether they are direct is unknown. Through chromatin immunoprecipitation (ChIP) followed by next-generation sequencing or microarray, some SALL4 targets have been identified. [29] A key verified target gene encodes the enzyme phosphatidylinositol-3,4,5-trisphosphate 3-phosphatase (PTEN). PTEN is a tumor suppressor that keeps uncontrolled cell growth in check by inducing programmed cell death, or apoptosis. SALL4 binds the PTEN promoter and recruits the NuRD complex to mediate its repression, thus leading to cell proliferation. [16]
In mouse embryos, SALL4 expression is detectable as early as the two-cell stage. Its expression persists through the 8- and 16-cell stages to the blastocyst, where it is found in some cells of the trophectoderm and inner cell mass (ICM), from which mouse ESCs are derived. [30] SALL4 is an important factor for maintaining the “stemness” of ESCs of both mouse and human origin, since loss of Sall4 leads to differentiation of these pluripotent cells down the trophectoderm lineage. [17] [30] [31] This is possibly due to down-regulation of Pou5f1 (encoding Oct4) expression and up-regulation of caudal-type homeobox 2 (Cdx2) gene expression. [31] Sall4 is part of the transcriptional regulatory network that includes other pluripotency factors such as Oct4, Nanog, and Sox2. [32] [33] Because of its important role in early development, genetically mutated mice without functioning SALL4 die early, at the peri-implantation stage, while heterozygous mice have neural, kidney, and heart defects as well as limb abnormalities. [17] [25] [34]
The various SALL4-null mouse models mimic human mutations in the SALL4 gene, which have been shown to cause developmental problems in patients with Okihiro/Duane-Radial-ray syndrome. [35] [36] These individuals frequently have a family history of hand malformations and eye movement disorders.
SALL4 expression is low to undetectable in most adult tissues, with the exception of germ cells and human blood progenitor cells. [35] [37] However, SALL4 is re-activated and mis-regulated in various cancers [38] [39] such as acute myeloid leukemia (AML), [28] B-cell acute lymphocytic leukemia (B-ALL), [40] germ cell tumors, [41] gastric cancer, [42] breast cancer, [43] hepatocellular carcinoma (HCC), [44] [45] lung cancer, [46] and glioma. [47] In many of these cancers, SALL4 expression in tumor cells was compared with that in the normal tissue counterpart; for example, SALL4 is expressed in nearly half of primary human endometrial cancer samples, but not in normal or hyperplastic endometrial tissue samples. [48] Often, SALL4 expression is correlated with worse survival and poor prognosis, such as in HCC, [44] or with metastasis, such as in endometrial cancer, [48] colorectal carcinoma, [49] and esophageal squamous cell carcinoma. [50] It is unclear how SALL4 expression is de-regulated in malignant cells, but DNA hypomethylation in its intron 1 region has been observed in B-ALL. [40]
In breast cancer, signal transducer and activator of transcription 3 (STAT3) has been reported to directly activate SALL4 expression. [51] Furthermore, canonical Wnt signaling has been proposed to activate SALL4 gene expression both in development [52] [53] and in cancer. [28] In leukemia, the mechanism of SALL4 function is better characterized; mice over-expressing human SALL4 develop myelodysplastic syndrome (MDS)-like symptoms and eventually AML. [28] This is consistent with high levels of SALL4 expression correlating with high-risk MDS in patients. [54] [55] Further elucidating its function in tumorigenesis, knocking down SALL4 expression with short hairpin RNA in leukemic cells, or treating these cells with a peptide that mimics the N-12a.a. of SALL4 to inhibit its interaction with the NuRD complex, results in cell death in both cases. [13] [44] These findings suggest that the primary cancer-maintaining property of SALL4 is mediated through its transcriptional repressing function. These observations have led to growing interest in SALL4 as both a diagnostic tool and a target in cancer therapy. For example, in solid tumors such as germ cell tumors, SALL4 protein expression has become a standard diagnostic biomarker. [56]
The 2016 version of this article was updated by an external expert under a dual publication model. The corresponding academic peer-reviewed article was published in Gene and can be cited as: Hiro Tatetsu; Nikki R Kong; Gao Chong; Giovanni Amabile; Daniel G Tenen; Li Chai (15 June 2016). "SALL4, the missing link between stem cells, development and cancer". Gene. Gene Wiki Review Series. 584 (2): 111–9. doi:10.1016/J.GENE.2016.02.019. ISSN 0378-1119. PMC 4823161. PMID 26892498. Wikidata Q28273464.
Homeobox protein NANOG (hNanog) is a transcription factor that helps embryonic stem cells (ESCs) maintain pluripotency by suppressing cell determination factors. hNanog is encoded in humans by the NANOG gene. Several types of cancer are associated with NANOG.
T-box transcription factor T, also known as Brachyury protein, is encoded in humans by the TBXT gene. Brachyury functions as a transcription factor within the T-box family of genes. Brachyury homologs have been found in all bilaterian animals that have been screened, as well as the freshwater cnidarian Hydra.
GATA-binding factor 1 or GATA-1 is the founding member of the GATA family of transcription factors. This protein is widely expressed throughout vertebrate species. In humans and mice, it is encoded by the GATA1 and Gata1 genes, respectively. These genes are located on the X chromosome in both species.
Duane-radial ray syndrome, also known as Okihiro syndrome, is a rare autosomal dominant disorder that primarily affects the eyes and causes abnormalities of bones in the arms and hands. This disorder is considered to be a SALL4-related disorder due to the SALL4 gene mutations leading to these abnormalities. It is diagnosed by clinical findings on a physical exam as well as genetic testing and imaging. After being diagnosed, there are other evaluations that one may go through in order to determine the extent of the disease. There are various treatments for the symptoms of this disorder.
Twist-related protein 1 (TWIST1), also known as class A basic helix–loop–helix protein 38 (bHLHa38), is a basic helix–loop–helix transcription factor that in humans is encoded by the TWIST1 gene.
Tumor protein p63, typically referred to as p63 and also known as transformation-related protein 63, is a protein that in humans is encoded by the TP63 gene.
Telomerase reverse transcriptase is a catalytic subunit of the enzyme telomerase, which, together with the telomerase RNA component (TERC), comprises the most important unit of the telomerase complex.
SRY-box 2, also known as SOX2, is a transcription factor that is essential for maintaining self-renewal, or pluripotency, of undifferentiated embryonic stem cells. Sox2 has a critical role in maintenance of embryonic and neural stem cells.
Transcription factor SOX-9 is a protein that in humans is encoded by the SOX9 gene.
Transcriptional regulator ATRX also known as ATP-dependent helicase ATRX, X-linked helicase II, or X-linked nuclear protein (XNP) is a protein that in humans is encoded by the ATRX gene.
GATA3 is a transcription factor that in humans is encoded by the GATA3 gene. Studies in animal models and humans indicate that it controls the expression of a wide range of biologically and clinically important genes.
Paired-like homeodomain transcription factor 2 also known as pituitary homeobox 2 is a protein that in humans is encoded by the PITX2 gene.
Forkhead box C1, also known as FOXC1, is a protein which in humans is encoded by the FOXC1 gene.
Sal-like 1 (Drosophila), also known as SALL1, is a protein which in humans is encoded by the SALL1 gene. As the full name suggests, it is one of the human versions of the spalt (sal) gene known in Drosophila.
T-box transcription factor TBX3 is a protein that in humans is encoded by the TBX3 gene.
Cohesin subunit SA-2 (SA2) is a protein that in humans is encoded by the STAG2 gene. SA2 is a subunit of the Cohesin complex which mediates sister chromatid cohesion, homologous recombination and DNA looping. In somatic cells cohesin is formed of SMC3, SMC1, RAD21 and either SA1 or SA2 whereas in meiosis, cohesin is formed of SMC3, SMC1B, REC8 and SA3.
Homeobox protein CDX-1 is a protein in humans that is encoded by the CDX1 gene. CDX1 is expressed in the developing endoderm and its expression persists in the intestine throughout adulthood. CDX1 protein expression varies along the intestine, with high expression in intestinal crypts and diminishing expression along intestinal villi.
PHD finger protein 6 is a protein that in humans is encoded by the PHF6 gene.
Histone-lysine N-methyltransferase 2D (KMT2D), also known as MLL4 and sometimes MLL2 in humans and Mll4 in mice, is a major mammalian histone H3 lysine 4 (H3K4) mono-methyltransferase. It is part of a family of six Set1-like H3K4 methyltransferases that also contains KMT2A, KMT2B, KMT2C, KMT2F, and KMT2G.
Angiogenic factor with G patch and FHA domains 1 is a protein that in humans is encoded by the AGGF1 gene.