Ubiquitin family identifier | Value
---|---
Symbol | Ubiquitin
Pfam | PF00240
InterPro | IPR029071
SMART | SM00213

Ubiquitin-like proteins (UBLs) are a family of small proteins involved in post-translational modification of other proteins in a cell, usually with a regulatory function. The UBL protein family derives its name from the first member of the class to be discovered, ubiquitin (Ub), best known for its role in regulating protein degradation through covalent modification of other proteins. Following the discovery of ubiquitin, many additional evolutionarily related members of the group were described, involving parallel regulatory processes and similar chemistry. UBLs are involved in a widely varying array of cellular functions including autophagy, protein trafficking, inflammation and immune responses, transcription, DNA repair, RNA splicing, and cellular differentiation. [1] [2] [3]
Ubiquitin itself was first discovered in the 1970s and originally named "ubiquitous immunopoietic polypeptide". [4] Subsequently, other proteins with sequence similarity to ubiquitin were occasionally reported in the literature, but the first shown to share the key feature of covalent protein modification was ISG15, discovered in 1987. [5] A succession of reports in the mid-1990s is recognized as a turning point in the field, [6] with the discovery of SUMO (small ubiquitin-like modifier, also known as Sentrin) reported around the same time by a variety of investigators in 1996, [7] NEDD8 in 1997, [8] and Apg12 in 1998. [9] A systematic survey has since identified over 10,000 distinct genes for ubiquitin or ubiquitin-like proteins represented in eukaryotic genomes. [10]
Members of the UBL family are small, non-enzymatic proteins that share a common structure exemplified by ubiquitin, which has 76 amino acid residues arranged into a "beta-grasp" protein fold consisting of a five-strand antiparallel beta sheet surrounding an alpha helix. [1] [11] [12] The beta-grasp fold is widely distributed in other proteins of both eukaryotic and prokaryotic origin. [13] Collectively, ubiquitin and ubiquitin-like proteins are sometimes referred to as "ubiquitons". [3]
UBLs can be divided into two categories depending on their ability to be covalently conjugated to other molecules. UBLs that are capable of conjugation (sometimes known as Type I) have a characteristic sequence motif consisting of one to two glycine residues at the C-terminus, through which covalent conjugation occurs. Typically, UBLs are expressed as inactive precursors and must be activated by proteolysis of the C-terminus to expose the active glycine. [1] [12] Almost all such UBLs are ultimately linked to another protein, but there is at least one exception; ATG8 is linked to phosphatidylethanolamine. [1] UBLs that do not exhibit covalent conjugation (Type II) often occur as protein domains genetically fused to other domains in a single larger polypeptide chain, and may be proteolytically processed to release the UBL domain [1] or may function as protein-protein interaction domains. [11] UBL domains of larger proteins are sometimes known as UBX domains. [14]
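The C-terminal glycine motif and the precursor processing step described above can be illustrated with a short sketch. It is a toy model, not a bioinformatics tool: the function names are hypothetical, the four-residue precursor tail is invented for illustration, and the rule "cleave after the last Gly-Gly" is a simplifying assumption. The sequence shown is that of mature human ubiquitin.

```python
from typing import Optional

# Mature human ubiquitin (76 residues), one-letter amino acid code.
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ"
    "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)


def trailing_glycines(seq: str) -> int:
    """Count consecutive glycine (G) residues at the C-terminus."""
    return len(seq) - len(seq.rstrip("G"))


def has_conjugation_motif(seq: str) -> bool:
    """True if the C-terminus ends in at least one exposed glycine."""
    return trailing_glycines(seq) >= 1


def mature_form(precursor: str) -> Optional[str]:
    """Toy processing step: trim everything after the last Gly-Gly.

    Assumption for illustration only; real processing is carried out by
    specific proteases. Returns None if no Gly-Gly is present.
    """
    cut = precursor.rfind("GG")
    if cut == -1:
        return None
    return precursor[: cut + 2]


# Hypothetical inactive precursor: ubiquitin plus an invented C-terminal tail.
precursor = UBIQUITIN + "HSTV"
print(has_conjugation_motif(precursor))      # glycine is masked by the tail
print(mature_form(precursor) == UBIQUITIN)   # processing exposes Gly-Gly
```

In this model the precursor fails the motif check until the tail is removed, mirroring the requirement that the active glycine be exposed before conjugation.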
Ubiquitin is, as its name suggests, ubiquitous in eukaryotes; it is traditionally considered to be absent in bacteria and archaea, [11] though a few examples have been described in archaea. [15] UBLs are also widely distributed in eukaryotes, but their distribution varies among lineages; for example, ISG15, involved in the regulation of the immune system, is not present in lower eukaryotes. [1] Other families exhibit diversification in some lineages; a single member of the SUMO family is found in the yeast genome, but there are at least four in vertebrate genomes, which show some functional redundancy, [1] [2] and there are at least eight in the genome of the model plant Arabidopsis thaliana. [16]
The human genome encodes at least eight families of UBLs, not including ubiquitin itself, that are considered Type I UBLs and are known to covalently modify other proteins: SUMO, NEDD8, ATG8, ATG12, URM1, UFM1, FAT10, and ISG15. [1] One additional protein, known as FUBI, is encoded as a fusion protein in the FAU gene, and is proteolytically processed to generate a free glycine C-terminus, but has not been experimentally demonstrated to form covalent protein modifications. [1]
Plant genomes are known to encode at least seven families of UBLs in addition to ubiquitin: SUMO, RUB (the plant homolog of NEDD8), ATG8, ATG12, MUB, UFM1, and HUB1, as well as a number of Type II UBLs. [17] Some UBL families and their associated regulatory proteins in plants have undergone dramatic expansion, likely due to both whole genome duplication and other forms of gene duplication; the ubiquitin, SUMO, ATG8, and MUB families have been estimated to account for almost 90% of plants' UBL genes. [18] Proteins associated with ubiquitin and SUMO signaling are highly enriched in the genomes of embryophytes. [15]
In comparison to eukaryotes, prokaryotic proteins with relationships to UBLs are phylogenetically restricted. [19] [20] Prokaryotic ubiquitin-like protein (Pup) occurs in some actinobacteria and has functions closely analogous to ubiquitin in labeling proteins for proteasomal degradation; however, it is intrinsically disordered and its evolutionary relationship to UBLs is unclear. [19] A related protein, UBact, has been described in some Gram-negative lineages. [21] By contrast, the protein TtuB in bacteria of the genus Thermus does share the beta-grasp fold with eukaryotic UBLs; it is reported to have dual functions as both a sulfur carrier protein and a covalently conjugated protein modification. [19] In archaea, the small archaeal modifier proteins (SAMPs) share the beta-grasp fold and have been shown to play a ubiquitin-like role in protein degradation. [19] [20] In 2011, a seemingly complete set of genes corresponding to a eukaryote-like ubiquitin pathway was identified in an uncultured archaeon, [22] [23] [24] and at least three lineages of archaea ("Euryarchaeota", Thermoproteota (formerly Crenarchaeota), and "Aigarchaeota") are believed to possess such systems. [15] [25] [26] In addition, some pathogenic bacteria have evolved proteins that mimic those in eukaryotic UBL pathways and interact with UBLs in the host cell, interfering with their signaling function. [27] [28]
Regulation of UBLs that are capable of covalent conjugation in eukaryotes is elaborate but typically parallel for each member of the family, and is best characterized for ubiquitin itself. The process of ubiquitination is a tightly regulated three-step sequence: activation, performed by ubiquitin-activating enzymes (E1); conjugation, performed by ubiquitin-conjugating enzymes (E2); and ligation, performed by ubiquitin ligases (E3). The result of this process is the formation of a covalent bond between the C-terminus of ubiquitin and a residue (typically a lysine) on the target protein. Many UBL families have a similar three-step process catalyzed by a distinct set of enzymes specific to that family. [1] [29] [30] Deubiquitination or deconjugation (that is, removal of ubiquitin from a protein substrate) is performed by deubiquitinating enzymes (DUBs); UBLs can also be degraded through the action of ubiquitin-like-specific proteases (ULPs). [31] The range of UBLs on which these enzymes can act is variable and can be difficult to predict. Some UBLs, such as SUMO and NEDD8, have family-specific DUBs and ULPs. [32]
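The ordering of the three-step sequence, and its reversal by deconjugation, can be sketched as a small state machine. This is an illustrative abstraction only: the state names and the `apply_enzyme` and `run_cascade` helpers are hypothetical, and the model ignores enzyme specificity, ATP dependence, and the chemistry of each step.

```python
from enum import Enum


class State(Enum):
    FREE = "free modifier"
    ACTIVATED = "after activation (E1)"
    CONJUGATED = "after conjugation (E2)"
    LIGATED = "attached to substrate (E3)"


# Allowed transitions: each enzyme class acts only on the preceding state.
TRANSITIONS = {
    ("E1", State.FREE): State.ACTIVATED,
    ("E2", State.ACTIVATED): State.CONJUGATED,
    ("E3", State.CONJUGATED): State.LIGATED,
    ("DUB", State.LIGATED): State.FREE,  # deconjugation releases the modifier
}


def apply_enzyme(enzyme: str, state: State) -> State:
    """Return the next state, or raise if the step is out of order."""
    try:
        return TRANSITIONS[(enzyme, state)]
    except KeyError:
        raise ValueError(f"{enzyme} cannot act on a modifier in state {state.name}")


def run_cascade(enzymes, state: State = State.FREE) -> State:
    """Apply a sequence of enzyme classes in order and return the final state."""
    for enzyme in enzymes:
        state = apply_enzyme(enzyme, state)
    return state


print(run_cascade(["E1", "E2", "E3"]).name)
print(run_cascade(["E1", "E2", "E3", "DUB"]).name)
```

Attempting a step out of order, for example `run_cascade(["E2"])`, raises a `ValueError`, which reflects the requirement that activation precede conjugation and ligation.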
Ubiquitin is capable of forming polymeric chains, with additional ubiquitin molecules covalently attached to the first, which in turn is attached to its protein substrate. These chains may be linear or branched, and different regulatory signals may be sent by differences in the length and branching of the ubiquitin chain. [31] Although not all UBL families are known to form chains, SUMO, NEDD8, and URM1 chains have all been experimentally detected. [1] Additionally, ubiquitin can itself be modified by UBLs, known to occur with SUMO and NEDD8. [31] [33] The best-characterized intersections between distinct UBL families involve ubiquitin and SUMO. [34] [35]
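The distinction between linear and branched chains can be made concrete by treating a chain as a tree rooted at the ubiquitin attached to the substrate. The sketch below is a minimal data-structure illustration; the `Ub` class and both helper functions are hypothetical names, and the model does not record which residue carries each linkage.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Ub:
    """One ubiquitin unit; `children` are the units attached to it."""
    children: List["Ub"] = field(default_factory=list)


def chain_size(unit: Ub) -> int:
    """Total number of ubiquitin units in the chain rooted at `unit`."""
    return 1 + sum(chain_size(child) for child in unit.children)


def is_branched(unit: Ub) -> bool:
    """True if any unit in the chain carries more than one attached unit."""
    if len(unit.children) > 1:
        return True
    return any(is_branched(child) for child in unit.children)


# Three units in a row: substrate-proximal unit -> second unit -> third unit.
linear = Ub([Ub([Ub()])])
# Three units with two attached to the same substrate-proximal unit.
branched = Ub([Ub(), Ub()])

print(chain_size(linear), is_branched(linear))
print(chain_size(branched), is_branched(branched))
```

Both example chains contain three units, so length alone does not distinguish them; the branching test does.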
UBLs as a class are involved in a very large variety of cellular processes. Furthermore, individual UBL families vary in the scope of their activities and the diversity of the proteins to which they are conjugated. [1] The best known function of ubiquitin is identifying proteins to be degraded by the proteasome, but ubiquitination can play a role in other processes such as endocytosis and other forms of protein trafficking, transcription and transcription factor regulation, cell signaling, histone modification, and DNA repair. [11] [12] [36] Most other UBLs have similar roles in regulating cellular processes, usually with a more restricted known range than that of ubiquitin itself. SUMO proteins have the widest variety of cellular protein targets after ubiquitin [1] and are involved in processes including transcription, DNA repair, and the cellular stress response. [33] NEDD8 is best known for its role in regulating cullin proteins, which in turn regulate ubiquitin-mediated protein degradation, [2] though it likely also has other functions. [37] Two UBLs, ATG8 and ATG12, are involved in the process of autophagy; [38] both are unusual in that ATG12 has only two known protein substrates and ATG8 is conjugated not to a protein but to a phospholipid, phosphatidylethanolamine. [1]
The evolution of UBLs and their associated suites of regulatory proteins has been of interest since shortly after they were recognized as a family. [39] Phylogenetic studies of the beta-grasp protein fold superfamily suggest that eukaryotic UBLs are monophyletic, indicating a shared evolutionary origin. [13] UBL regulatory systems (including UBLs themselves and the cascade of enzymes that interact with them) are believed to share a common evolutionary origin with prokaryotic biosynthesis pathways for the cofactors thiamine and molybdopterin; the bacterial sulfur transfer proteins ThiS and MoaD from these pathways share the beta-grasp fold with UBLs, while sequence similarity and a common catalytic mechanism link pathway members ThiF and MoeB to ubiquitin-activating enzymes. [13] [17] [11] The eukaryotic protein URM1 functions as both a UBL and a sulfur-carrier protein, and has been described as a molecular fossil establishing this evolutionary link. [11] [40]
Comparative genomics surveys of UBL families and related proteins suggest that UBL signaling was already well-developed in the last eukaryotic common ancestor and ultimately originates from ancestral archaea, [15] a theory supported by the observation that some archaeal genomes possess the necessary genes for a fully functioning ubiquitination pathway. [25] [18] Two different diversification events within the UBL family have been identified in eukaryotic lineages, corresponding to the origin of multicellularity in both animal and plant lineages. [15]
Ubiquitin is a small regulatory protein found in most tissues of eukaryotic organisms, i.e., it is found ubiquitously. It was discovered in 1975 by Gideon Goldstein and further characterized throughout the late 1970s and 1980s. Four genes in the human genome code for ubiquitin: UBB, UBC, UBA52 and RPS27A.
Ubiquitin-like modifier activating enzyme 1 (UBA1) is an enzyme which in humans is encoded by the UBA1 gene. UBA1 participates in ubiquitination and the NEDD8 pathway for protein folding and degradation, among many other biological processes. This protein has been linked to X-linked spinal muscular atrophy type 2, neurodegenerative diseases, and cancers.
Deubiquitinating enzymes (DUBs), also known as deubiquitinating peptidases, deubiquitinating isopeptidases, deubiquitinases, ubiquitin proteases, ubiquitin hydrolases, or ubiquitin isopeptidases, are a large group of proteases that cleave ubiquitin from proteins. Ubiquitin is attached to proteins in order to regulate the degradation of proteins via the proteasome and lysosome; coordinate the cellular localisation of proteins; activate and inactivate proteins; and modulate protein-protein interactions. DUBs can reverse these effects by cleaving the peptide or isopeptide bond between ubiquitin and its substrate protein. In humans there are nearly 100 DUB genes, which can be classified into two main classes: cysteine proteases and metalloproteases. The cysteine proteases comprise ubiquitin-specific proteases (USPs), ubiquitin C-terminal hydrolases (UCHs), Machado-Josephin domain proteases (MJDs) and ovarian tumour proteases (OTU). The metalloprotease group contains only the Jab1/Mov34/Mpr1 Pad1 N-terminal+ (MPN+) (JAMM) domain proteases.
In molecular biology, SUMO proteins are a family of small proteins that are covalently attached to and detached from other proteins in cells to modify their function. This process is called SUMOylation. SUMOylation is a post-translational modification involved in various cellular processes, such as nuclear-cytosolic transport, transcriptional regulation, apoptosis, protein stability, response to stress, and progression through the cell cycle.
NEDD8 is a protein that in humans is encoded by the NEDD8 gene. This ubiquitin-like (UBL) protein becomes covalently conjugated to a limited number of cellular proteins in a process called NEDDylation, which is similar to ubiquitination. Human NEDD8 shares 60% amino acid sequence identity with ubiquitin. The primary known substrates of NEDD8 modification are the cullin subunits of cullin-based E3 ubiquitin ligases, which are active only when NEDDylated. Their NEDDylation is critical for the recruitment of E2 to the ligase complex, thus facilitating ubiquitin conjugation. NEDD8 modification has therefore been implicated in cell cycle progression and cytoskeletal regulation.
CDC34 is a gene that in humans encodes the protein Ubiquitin-conjugating enzyme E2 R1. This protein is a member of the ubiquitin-conjugating enzyme family, which catalyzes the covalent attachment of ubiquitin to other proteins.
NEDD8-activating enzyme E1 regulatory subunit is a protein that in humans is encoded by the NAE1 gene.
NEDD8-activating enzyme E1 catalytic subunit is a protein that in humans is encoded by the UBA3 gene.
Autophagy related 12 is a protein that in humans is encoded by the ATG12 gene.
NEDD8-conjugating enzyme Ubc12 is a protein that in humans is encoded by the UBE2M gene.
Autophagy related 7 is a protein that in humans is encoded by the ATG7 gene. Related names include GSA7, APG7L, and APG7-LIKE.
Archaea is a domain of single-celled organisms. These microorganisms lack cell nuclei and are therefore prokaryotes. Archaea were initially classified as bacteria, receiving the name archaebacteria, but this term has fallen out of use.
Cullins are a family of hydrophobic scaffold proteins which provide support for ubiquitin ligases (E3). All eukaryotes appear to have cullins. They combine with RING proteins to form Cullin-RING ubiquitin ligases (CRLs) that are highly diverse and play a role in myriad cellular processes, most notably protein degradation by ubiquitination.
Autophagy-related protein 8 (Atg8) is a ubiquitin-like protein required for the formation of autophagosomal membranes. The transient conjugation of Atg8 to the autophagosomal membrane through a ubiquitin-like conjugation system is essential for autophagy in eukaryotes. Even though there are homologues in animals, this article mainly focuses on its role in lower eukaryotes such as Saccharomyces cerevisiae.
Ubiquitin-like 1-activating enzyme E1B (UBLE1B) also known as SUMO-activating enzyme subunit 2 (SAE2) is an enzyme that in humans is encoded by the UBA2 gene.
Ubiquitin-related modifier-1 (URM1) is a ubiquitin-like protein that modifies proteins in the yeast ubiquitin-like urmylation pathway. Structural comparisons and phylogenetic analysis of the ubiquitin superfamily have indicated that Urm1 has the most conserved structural and sequence features of the common ancestor of the entire superfamily.
Lokiarchaeota is a proposed phylum of the Archaea. The phylum includes all members of the group previously named Deep Sea Archaeal Group, also known as Marine Benthic Group B. Lokiarchaeota is part of the superphylum Asgard, which contains the phyla Lokiarchaeota, Thorarchaeota, Odinarchaeota, Heimdallarchaeota, and Helarchaeota. A phylogenetic analysis disclosed a monophyletic grouping of the Lokiarchaeota with the eukaryotes. The analysis revealed several genes with cell membrane-related functions. The presence of such genes supports the hypothesis of an archaeal host for the emergence of the eukaryotes, as in eocyte-like scenarios.
Asgard or Asgardarchaeota is a proposed superphylum consisting of a group of archaea that contain eukaryotic signature proteins. It appears that the eukaryotes, the domain that contains the animals, plants, and fungi, emerged within the Asgard, in a branch containing the Heimdallarchaeota. This supports the two-domain system of classification over the three-domain system.
The two-domain system is a biological classification by which all organisms in the tree of life are classified into two big domains, Bacteria and Archaea. It emerged from growing knowledge of archaeal diversity and from challenges to the widely accepted three-domain system, which divides life into Bacteria, Archaea, and Eukarya. It was preceded by the eocyte hypothesis of James A. Lake in the 1980s, which was largely superseded by the three-domain system, due to evidence at the time. Better understanding of archaea, especially of their roles in the origin of eukaryotes through symbiogenesis with bacteria, led to the revival of the eocyte hypothesis in the 2000s. The two-domain system became more widely accepted after the discovery of a large group (superphylum) of archaea called Asgard in 2017, which evidence suggests to be the evolutionary root of eukaryotes, implying that eukaryotes are members of the domain Archaea.
Arabidopsis SUMO-conjugation enzyme (AtSCE1) is an enzyme that is a member of the small ubiquitin-like modifier (SUMO) post-translational modification pathway. This process, and the SCE1 enzyme with it, is highly conserved across eukaryotes yet absent in prokaryotes. In short, this pathway results in the attachment of a small polypeptide through an isopeptide bond between modifying enzyme and the ε-amino group of a lysine residue in the substrate. In plants, the 160 amino acid SCE1 enzyme was first characterized in 2003. One functional gene copy, SCE1a, was found on chromosome 3.