| SON | |
|---|---|
| Identifiers | |
| Aliases | SON, BASS1, C21orf50, DBP-5, NREBP, SON3, SON DNA binding protein, TOKIMS, SON DNA and RNA binding protein |
| External IDs | OMIM: 182465; MGI: 98353; HomoloGene: 10551; GeneCards: SON; OMA: SON - orthologs |

SON protein is a protein that in humans is encoded by the SON gene. [5] [6]
SON is the name given to a large Ser/Arg (SR)-related protein, a splicing co-factor that contributes to efficient splicing during cell cycle progression. [7] It is also known as BASS1 (Bax antagonist selected in Saccharomyces 1) or NRE-binding protein (negative regulatory element-binding protein). The most common gene name for this splicing protein is SON, but C21orf50, DBP5, KIAA1019 and NREBP are also used as synonyms. [8]
The protein encoded by the SON gene binds to a specific DNA sequence upstream of the upstream regulatory sequence of the core promoter and second enhancer of human hepatitis B virus (HBV). Through this binding, it represses HBV core promoter activity, transcription of HBV genes, and production of HBV virions. The protein shows sequence similarities with other DNA-binding structural proteins such as gallin, oncoproteins of the MYC family, and the oncoprotein MOS. It may also be involved in protecting cells from apoptosis and in pre-mRNA splicing. [6] Mutations in the SON gene are associated with ZTTK syndrome. [9]
The SON protein is 2,426 amino acids long, and its sequence is complete. Its molecular weight is 263,830 daltons (Da), and it contains 8 types of repeats distributed across 3 regions. The protein is located mostly in nuclear speckles. Its expression is highest in leukocytes and heart cells. [8] [10]
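As a rough plausibility check on the two figures above, the reported mass can be divided by the reported length to give the mean mass per residue. The sketch below assumes only the numbers quoted in the text. The function name and the 110 Da comparison value are illustrative: 110 Da is a commonly used rule of thumb for the average residue mass, not a figure taken from the cited sources.

```python
def mean_residue_mass(molecular_weight_da: float, length_aa: int) -> float:
    """Return the average mass per amino acid residue, in daltons."""
    if length_aa <= 0:
        raise ValueError("length_aa must be positive")
    return molecular_weight_da / length_aa


# Values quoted in the text for the SON protein.
SON_MASS_DA = 263_830
SON_LENGTH_AA = 2_426

# Assumption: rule-of-thumb average residue mass, used only for comparison.
RULE_OF_THUMB_DA = 110.0

son_mean = mean_residue_mass(SON_MASS_DA, SON_LENGTH_AA)
relative_difference = abs(son_mean - RULE_OF_THUMB_DA) / RULE_OF_THUMB_DA

print(f"Mean residue mass: {son_mean:.1f} Da")
print(f"Difference from rule of thumb: {relative_difference:.1%}")
```

The result is close to 109 Da per residue, so the quoted length and mass are consistent with each other.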
The SON protein is essential for maintaining the subnuclear organization of the factors that are processed in the nucleus, which highlights its direct role in pre-mRNA splicing. [11] [ page needed ]
Splicing is the process through which pre-mRNA is transformed into mRNA. Newly transcribed pre-mRNA contains sequences called introns and exons. Introns are non-coding nucleotide sequences that must be removed so that the exons (the sequences retained in the mature transcript) can be joined together to form mRNA. The controlled process of splicing takes place in the spliceosome, a complex that brings together pre-mRNA and a variety of binding proteins. These proteins, together with the splicing factors (which are not found in the spliceosome), are in charge of recognizing the 5' ("donor") splice site, the 3' ("acceptor") splice site, and the branch point sequence within the intron. The SON protein is known to be one of these binding proteins. [11] [ page needed ]
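The removal of introns and joining of exons described above can be sketched as a simple string operation. This is a toy model only: the transcript and the intron coordinates are hypothetical, the function name is illustrative, and the cell does not work from known coordinates but from recognition of the splice sites and branch point by the spliceosome. The example introns begin with GU and end with AG, the canonical intron boundaries.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns, given as half-open [start, end) intervals, and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or end > len(pre_mrna) or start >= end:
            raise ValueError("introns must be non-empty, non-overlapping and in range")
        exons.append(pre_mrna[position:start])  # exon preceding this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Hypothetical transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguacuaacag" + "UUUGGA" + "guccuuaacag" + "UAA"

# Intron coordinates for the toy transcript above (0-based, end exclusive).
introns = [(6, 20), (26, 37)]

mature_mrna = splice(pre_mrna, introns)
print(mature_mrna)  # AUGGCCUUUGGAUAA
```

Marking exons in upper case and introns in lower case makes it easy to see that only the exon sequences survive in the output.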
Although its exact role in controlling splicing during cell cycle progression remains largely unexplored, this splicing-associated protein is known to be necessary for the maintenance of embryonic stem cells, because it influences the splicing of pluripotency regulators. [7] [12]
SON plays an important role in mRNA processing. This process is still not fully understood, however, so future work will need to establish exactly how the protein interacts with the spliceosomal complex and what its precise molecular function is in the context of splicing. SON not only takes part in splicing but also enables complex mechanisms, such as post-transcriptional RNA processing, to cooperate with splicing-dependent mRNA processing. [13]
Human embryonic stem cells are able to differentiate into specific cell types. Transcription factors and epigenetic modifiers play an important role in maintaining the pluripotency of embryonic stem cells, although little is known about how pluripotency is regulated through splicing. SON has been identified as essential for the maintenance of this pluripotency. It has been confirmed that SON regulates the splicing of transcripts (mRNA) encoding the genes that regulate the pluripotency of human embryonic stem cells. [14]
On the one hand, the SON protein is required to maintain genome stability in order to ensure efficient RNA processing of the affected genes. It also facilitates the interaction of SR proteins with RNA polymerase II and is required for the processing of weak constitutive splice sites, which has strong implications for cancer and other human diseases. [7] [10]
On the other hand, a deficiency or knockdown of the SON protein causes various severe defects in mitotic division arrangement, chromosome alignment, and microtubule dynamics when spindle pole separation takes place. [7]
According to the article “SON protein regulates GATA-2 through transcriptional control of the microRNA 23a-27-24-a cluster”, the SON protein has further functions in the organism. The protein may regulate the differentiation of hematopoietic cells. Its specific role in hematopoiesis is based on activating other proteins called GATA. Once these are activated, cell differentiation proceeds normally. [15]
A recent study suggested that SON may be a novel molecular target for pancreatic cancer therapy, because its results show that the protein is very important for the proliferation, survival and tumorigenicity of cancer cells. Specifically, the results indicated that targeting this serine-arginine-rich protein, which is involved in RNA splicing, could suppress the tumorigenicity of pancreatic cancer cells. [13]
The therapeutic implications of the SON gene within virus-host interactions, particularly in the context of viral infections, remain insufficiently defined. Although the SON gene is recognized for its involvement in diverse cellular processes such as mRNA splicing, DNA repair, and cell cycle regulation, its precise role in the host's response to viral infections and its therapeutic applications remain unclear. [16] Ongoing research is dedicated to unraveling the interactions between host genes, including SON, and HIV-1. This work aims to improve understanding of the dynamics between viruses and hosts, with the potential to reveal novel targets for therapeutic interventions.
SON plays a crucial role in mRNA splicing, a vital process for gene expression. Certain viruses depend on the host's cellular machinery, including the splicing apparatus, for their replication. Disrupting host factors involved in RNA processing could potentially impede viral replication. Aberrations in splicing processes may lead to abnormal protein production and contribute to disease. [17] Given SON's involvement in RNA processing, it emerges as a promising target for comprehending and treating conditions associated with splicing abnormalities.
In the event that SON is identified as a contributor to either promoting or inhibiting viral replication, there is potential for exploring genetic or epigenetic strategies. This could involve manipulating SON expression or activity to influence the viral life cycle. [18]
Host factors also play a role in shaping the immune response to viral infections. If SON is implicated in pathways related to the immune system, modulating its activity could have implications for enhancing the host's capacity to control viral infections. [19]
The cell nucleus is a membrane-bound organelle found in eukaryotic cells. Eukaryotic cells usually have a single nucleus, but a few cell types, such as mammalian red blood cells, have no nuclei, and a few others including osteoclasts have many. The main structures making up the nucleus are the nuclear envelope, a double membrane that encloses the entire organelle and isolates its contents from the cellular cytoplasm; and the nuclear matrix, a network within the nucleus that adds mechanical support.
Ribonucleic acid (RNA) is a polymeric molecule that is essential for most biological functions, either by performing the function itself or by forming a template for the production of proteins. RNA and deoxyribonucleic acid (DNA) are nucleic acids. The nucleic acids constitute one of the four major macromolecules essential for all known forms of life. RNA is assembled as a chain of nucleotides. Cellular organisms use messenger RNA (mRNA) to convey genetic information that directs synthesis of specific proteins. Many viruses encode their genetic information using an RNA genome.
A retrovirus is a type of virus that inserts a DNA copy of its RNA genome into the DNA of a host cell that it invades, thus changing the genome of that cell. After invading a host cell's cytoplasm, the virus uses its own reverse transcriptase enzyme to produce DNA from its RNA genome, the reverse of the usual pattern, thus retro (backward). The new DNA is then incorporated into the host cell genome by an integrase enzyme, at which point the retroviral DNA is referred to as a provirus. The host cell then treats the viral DNA as part of its own genome, transcribing and translating the viral genes along with the cell's own genes, producing the proteins required to assemble new copies of the virus. Many retroviruses cause serious diseases in humans, other mammals, and birds.
Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product that enables it to produce end products, proteins or non-coding RNA, and ultimately affect a phenotype. These products are often proteins, but in non-protein-coding genes such as transfer RNA (tRNA) and small nuclear RNA (snRNA), the product is a functional non-coding RNA. The process of gene expression is used by all known life—eukaryotes, prokaryotes, and utilized by viruses—to generate the macromolecular machinery for life.
Transcription is the process of copying a segment of DNA into RNA. The segments of DNA transcribed into RNA molecules that can encode proteins produce messenger RNA (mRNA). Other segments of DNA are transcribed into RNA molecules called non-coding RNAs (ncRNAs).
Regulation of gene expression, or gene regulation, includes a wide range of mechanisms that are used by cells to increase or decrease the production of specific gene products. Sophisticated programs of gene expression are widely observed in biology, for example to trigger developmental pathways, respond to environmental stimuli, or adapt to new food sources. Virtually any step of gene expression can be modulated, from transcriptional initiation, to RNA processing, and to the post-translational modification of a protein. Often, one gene regulator controls another, and so on, in a gene regulatory network.
A primary transcript is the single-stranded ribonucleic acid (RNA) product synthesized by transcription of DNA, and processed to yield various mature RNA products such as mRNAs, tRNAs, and rRNAs. The primary transcripts designated to be mRNAs are modified in preparation for translation. For example, a precursor mRNA (pre-mRNA) is a type of primary transcript that becomes a messenger RNA (mRNA) after processing.
RNA-binding proteins are proteins that bind to the double or single stranded RNA in cells and participate in forming ribonucleoprotein complexes. RBPs contain various structural motifs, such as RNA recognition motif (RRM), dsRNA binding domain, zinc finger and others. They are cytoplasmic and nuclear proteins. However, since most mature RNA is exported from the nucleus relatively quickly, most RBPs in the nucleus exist as complexes of protein and pre-mRNA called heterogeneous ribonucleoprotein particles (hnRNPs). RBPs have crucial roles in various cellular processes such as cellular function, transport and localization. They especially play a major role in post-transcriptional control of RNAs, such as splicing, polyadenylation, mRNA stabilization, mRNA localization and translation. Eukaryotic cells express diverse RBPs with unique RNA-binding activity and protein–protein interaction. According to the Eukaryotic RBP Database (EuRBPDB), there are 2961 genes encoding RBPs in humans. During evolution, the diversity of RBPs greatly increased with the increase in the number of introns. Diversity enabled eukaryotic cells to utilize RNA exons in various arrangements, giving rise to a unique RNP (ribonucleoprotein) for each RNA. Although RBPs have a crucial role in post-transcriptional regulation in gene expression, relatively few RBPs have been studied systematically. It has now become clear that RNA–RBP interactions play important roles in many biological processes among organisms.
In biology, the word gene has two meanings. The Mendelian gene is a basic unit of heredity. The molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and non-coding genes.
Heterogeneous nuclear ribonucleoprotein A1 is a protein that in humans is encoded by the HNRNPA1 gene. Mutations in hnRNP A1 are causative of amyotrophic lateral sclerosis and the syndrome multisystem proteinopathy.
Y box binding protein 1 also known as Y-box transcription factor or nuclease-sensitive element-binding protein 1 is a protein that in humans is encoded by the YBX1 gene. YBX1 is an RNA binding protein that stabilises messenger RNAs modified with N6-methyladenosine.
DNA damage-binding protein 1 is a protein that in humans is encoded by the DDB1 gene.
HIV Tat-specific factor 1 is a protein that in humans is encoded by the HTATSF1 gene.
T-box transcription factor TBX3 is a protein that in humans is encoded by the TBX3 gene.
Nuclear factor 1 X-type is a protein that in humans is encoded by the NFIX gene. NFI-X3, a splice variant of NFIX, regulates Glial fibrillary acidic protein and YKL-40 in astrocytes.
Cleavage and polyadenylation specificity factor subunit 4 is a protein that in humans is encoded by the CPSF4 gene.
Rev is a transactivating protein that is essential to the regulation of HIV-1 protein expression. A nuclear localization signal is encoded in the rev gene, which allows the Rev protein to be localized to the nucleus, where it is involved in the export of unspliced and incompletely spliced mRNAs. In the absence of Rev, mRNAs of the HIV-1 late (structural) genes are retained in the nucleus, preventing their translation.
HBx is a hepatitis B viral protein. It is 154 amino acids long and interferes with transcription, signal transduction, cell cycle progress, protein degradation, apoptosis and chromosomal stability in the host. It forms a heterodimeric complex with its cellular target protein, and this interaction dysregulates centrosome dynamics and mitotic spindle formation. It interacts with DDB1 redirecting the ubiquitin ligase activity of the CUL4-DDB1 E3 complexes, which are intimately involved in the intracellular regulation of DNA replication and repair, transcription and signal transduction.
Hepatitis B virus (HBV) is a partially double-stranded DNA virus, a species of the genus Orthohepadnavirus and a member of the Hepadnaviridae family of viruses. This virus causes the disease hepatitis B.
Protein ZGRF1 is a protein encoded in the human by the ZGRF1 gene also known as C4orf21, that has a weight of 236.6 kDa. The ZGRF1 gene product localizes to the cell nucleus and promotes DNA repair by stimulating homologous recombination. This gene shows relatively low expression in most human tissues, with increased expression in situations of chemical dependence. ZGRF1 is orthologous to nearly all eukaryotes. Functional domains of this protein link it to a series of helicases, most notably the AAA_12 and AAA_11 domains.