PAS fold | |
---|---|
Identifiers | |
Symbol | PAS |
Pfam | PF00989 |
Pfam clan | CL0183 |
ECOD | 223.1.1 |
InterPro | IPR013767 |
SMART | PAS |
PROSITE | PDOC50112 |
SCOP2 | 2phy / SCOPe / SUPFAM |
CDD | cd00130 |
A Per-Arnt-Sim (PAS) domain is a protein domain found in all kingdoms of life. [2] Generally, the PAS domain acts as a molecular sensor, whereby small molecules and other proteins associate via binding to the PAS domain. [3] [4] [5] Due to this sensing capability, the PAS domain has been shown to be a key structural motif involved in protein-protein interactions of the circadian clock, and it is also a common motif found in signaling proteins, where it functions as a signaling sensor. [6] [7]
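PAS domains are found in a large number of organisms from bacteria to mammals. The PAS domain was named after the three proteins in which it was first discovered: the period circadian protein (Per), the aryl hydrocarbon receptor nuclear translocator protein (Arnt), and the single-minded protein (Sim). [8]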
Since the initial discovery of the PAS domain, a large number of PAS domain binding sites have been discovered in bacteria and eukaryotes. A subset, called PAS LOV proteins, is responsive to oxygen, light and voltage. [9]
Although the PAS domain exhibits a degree of sequence variability, the three-dimensional structure of the PAS domain core is broadly conserved. [10] This core consists of a five-stranded antiparallel β-sheet and several α-helices. Structural changes that result from signaling predominantly originate within the β-sheet. These signals propagate via the α-helices of the core to the covalently attached effector domain. [11] In 1998, the PAS domain core architecture was first characterized in the structure of photoactive yellow protein (PYP) from Halorhodospira halophila. [10] In many proteins, a dimer of PAS domains is required, whereby one binds a ligand and the other mediates interactions with other proteins. [5]
The PAS domains that are known share less than 20% average pairwise sequence identity, meaning they are surprisingly dissimilar. [10] PAS domains are frequently found on proteins with other environmental sensing mechanisms. Also, many PAS domains are attached to photoreceptive cells. [12]
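The "average pairwise sequence identity" figure quoted above can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to equal length with `-` as the gap character, and it counts identity only over columns where neither sequence has a gap; other conventions for the denominator exist. The function names and the toy fragments are hypothetical and are not real PAS sequences.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap (an assumed convention)
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def average_pairwise_identity(aligned: list[str]) -> float:
    """Mean percent identity over all unordered pairs of aligned sequences."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy aligned fragments, invented for illustration only.
toy_alignment = ["MKT-AL", "MRTQAV", "LKSQ-L"]

if __name__ == "__main__":
    print(f"{average_pairwise_identity(toy_alignment):.1f}% average identity")
```

With real data, the alignment would come from a dedicated alignment tool; this sketch only shows how the summary statistic is computed once an alignment exists.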
Often in the bacterial kingdom, PAS domains are positioned at the amino terminus of signaling proteins such as sensor histidine kinases, cyclic-di-GMP synthases and hydrolases, and methyl-accepting chemotaxis proteins. [10]
In the presence of light, White Collar-1 (WC-1) and White Collar-2 (WC-2) dimerize via mediation by the PAS domains, which activates translation of FRQ. [13]
In the presence of light, CLK and CYC attach via a PAS domain, activating the translation of PER, which then associates with Tim via the PER PAS domain. The following genes contain PAS binding domains: PER, Tim, CLK, CYC.
A PAS domain is found in the ZTL and NPH1 genes. These domains are very similar to the PAS domain found in the Neurospora circadian-associated protein WC-1. [14]
The circadian clock that is currently understood for mammals begins when light activates BMAL1 and CLK to bind via their PAS domains. That activator complex regulates Per1, Per2, and Per3 which all have PAS domains that are used to bind to cryptochromes 1 and 2 (CRY 1,2 family). The following mammalian genes contain PAS binding domains: Per1, Per2, Per3, Cry1, Cry2, Bmal, Clk, Pasd1.
Within mammals, both PAS domains play important roles. PAS A is responsible for the protein-protein interactions with other PAS domain proteins, while PAS B has a more versatile role. It mediates interactions with chaperonins and other small molecules like dioxin, but PAS B domains in NPAS2, a homolog of the Drosophila clk gene, and the hypoxia inducible factor (HIF) also help to mediate ligand binding. [12] Furthermore, the PAS domain-containing NPAS2 protein has been shown to substitute for the Clock gene in mutant mice that lack the Clock gene completely. [15]
The PAS domain also directly interacts with the bHLH domain. It is typically located on the C-terminal side of the bHLH domain. bHLH proteins that contain PAS domains are known as bHLH-PAS proteins; examples include HIF and the product of the Clock gene, which require both the PAS domain and the bHLH domain. [16] [17] [18]
GAF domain | |
---|---|
Identifiers | |
Symbol | GAF |
Pfam clan | CL0161 |
ECOD | 223.1.1 |
These cGMP-binding domains are found in diverse phototransducing proteins across eukaryotes and eubacteria. They are present in plant and cyanobacterial phytochromes, vertebrate and invertebrate cGMP-stimulated phosphodiesterases (PDEs) and some non-photosynthetic eubacteria. [19] [20] [21]
Cache domain | |
---|---|
Identifiers | |
Symbol | Cache |
Pfam clan | CL0165 |
ECOD | 223.1.1 |
These extracellular signaling domains are homologous to PAS domains but distinct from them. [22] They are common to animal calcium (Ca2+) channel subunits and certain prokaryotic chemotaxis receptors and play a role in small-molecule recognition across various species, suggesting a conserved mechanism of ligand binding. [23] In contrast to the intracellular PAS and GAF domains, they have a long additional N-terminal alpha helix. [22]
Hpt domain | |
---|---|
Identifiers | |
Symbol | Hpt |
Pfam | PF01627 |
ECOD | 601.3.1 |
InterPro | IPR036641 |
Also known as histidine phosphotransfer domains and histidine phosphotransferases, these domains are protein domains involved in the "phosphorelay" form of two-component regulatory systems. [20]
HAMP | |
---|---|
Identifiers | |
Symbol | HAMP |
Pfam | PF00672 |
Pfam clan | CL0681 |
ECOD | 4168.1.1 |
InterPro | IPR003660 |
The HAMP domain (present in Histidine kinases, Adenylate cyclases, Methyl accepting proteins and Phosphatases) [24] is an approximately 50-amino acid alpha-helical region that forms a dimeric, four-helical coiled coil. [25]
A protein kinase is a kinase which selectively modifies other proteins by covalently adding phosphates to them (phosphorylation) as opposed to kinases which modify lipids, carbohydrates, or other molecules. Phosphorylation usually results in a functional change of the target protein (substrate) by changing enzyme activity, cellular location, or association with other proteins. The human genome contains about 500 protein kinase genes and they constitute about 2% of all human genes. There are two main types of protein kinase. The great majority are serine/threonine kinases, which phosphorylate the hydroxyl groups of serines and threonines in their targets. Most of the others are tyrosine kinases, although additional types exist. Protein kinases are also found in bacteria and plants. Up to 30% of all human proteins may be modified by kinase activity, and kinases are known to regulate the majority of cellular pathways, especially those involved in signal transduction.
A basic helix–loop–helix (bHLH) is a protein structural motif that characterizes one of the largest families of dimerizing transcription factors. The word "basic" does not refer to complexity but to the chemistry of the motif because transcription factors in general contain basic amino acid residues in order to facilitate DNA binding.
The aryl hydrocarbon receptor is a protein that in humans is encoded by the AHR gene. The aryl hydrocarbon receptor is a transcription factor that regulates gene expression. It was originally thought to function primarily as a sensor of xenobiotic chemicals and also as the regulator of enzymes such as cytochrome P450s that metabolize these chemicals. The most notable of these xenobiotic chemicals are aromatic (aryl) hydrocarbons from which the receptor derives its name.
The ARNT gene encodes the aryl hydrocarbon receptor nuclear translocator protein that forms a complex with ligand-bound aryl hydrocarbon receptor (AhR), and is required for receptor function. The encoded protein has also been identified as the beta subunit of a heterodimeric transcription factor, hypoxia-inducible factor 1 (HIF1). A t(1;12)(q21;p13) translocation, which results in a TEL–ARNT fusion protein, is associated with acute myeloblastic leukemia. Three alternatively spliced variants encoding different isoforms have been described for this gene.
CLOCK is a gene encoding a basic helix-loop-helix-PAS transcription factor that is known to affect both the persistence and period of circadian rhythms.
Soluble guanylyl cyclase (sGC) is one of the gasoreceptors for nitric oxide, NO. It is soluble, i.e. completely intracellular. Most notably, this enzyme is involved in vasodilation. In humans, it is encoded by the genes GUCY1A2, GUCY1A3, GUCY1B2 and GUCY1B3.
An E-box is a DNA response element found in some eukaryotes that acts as a protein-binding site and has been found to regulate gene expression in neurons, muscles, and other tissues. Its specific DNA sequence, CANNTG, with a palindromic canonical sequence of CACGTG, is recognized and bound by transcription factors to initiate gene transcription. Once the transcription factors bind to the promoters through the E-box, other enzymes can bind to the promoter and facilitate transcription from DNA to mRNA.
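The CANNTG consensus described above maps naturally onto a regular expression. The following Python sketch is an illustration only: the function names and the toy sequence are hypothetical, it scans one strand, and it treats N as any of A, C, G or T. A lookahead is used so that overlapping candidate sites would also be reported.

```python
import re

# CANNTG, where N is any nucleotide. The lookahead (?=...) lets the scanner
# report overlapping candidate sites instead of consuming each match.
EBOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")
CANONICAL_EBOX = "CACGTG"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_eboxes(dna: str) -> list[tuple[int, str, bool]]:
    """Return (0-based start, matched site, is_canonical) for each CANNTG site."""
    seq = dna.upper()
    return [
        (m.start(), m.group(1), m.group(1) == CANONICAL_EBOX)
        for m in EBOX_PATTERN.finditer(seq)
    ]


# Toy sequence invented for illustration; not taken from a real promoter.
toy_promoter = "ttCACGTGaaCAGCTGccCATATGgg"

if __name__ == "__main__":
    for start, site, canonical in find_eboxes(toy_promoter):
        label = "canonical" if canonical else "non-canonical"
        print(start, site, label)
```

The canonical site CACGTG equals its own reverse complement, which is what "palindromic" means for double-stranded DNA; `reverse_complement(CANONICAL_EBOX)` can be used to check this.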
Period (per) is a gene located on the X chromosome of Drosophila melanogaster. Oscillations in levels of both per transcript and its corresponding protein PER have a period of approximately 24 hours and together play a central role in the molecular mechanism of the Drosophila biological clock driving circadian rhythms in eclosion and locomotor activity. Mutations in the per gene can shorten (perS), lengthen (perL), and even abolish (per0) the period of the circadian rhythm.
Neuronal PAS domain protein 2 (NPAS2) also known as member of PAS protein 4 (MOP4) is a transcription factor protein that in humans is encoded by the NPAS2 gene. NPAS2 is paralogous to CLOCK, and both are key proteins involved in the maintenance of circadian rhythms in mammals. In the brain, NPAS2 functions as a generator and maintainer of mammalian circadian rhythms. More specifically, NPAS2 is an activator of transcription and translation of core clock and clock-controlled genes through its role in a negative feedback loop in the suprachiasmatic nucleus (SCN), the brain region responsible for the control of circadian rhythms.
Aryl hydrocarbon receptor nuclear translocator-like 2, also known as Arntl2, Mop9, Bmal2, or Clif, is a gene.
Rev-Erb beta (Rev-Erbβ), also known as nuclear receptor subfamily 1 group D member 2 (NR1D2), is a member of the Rev-Erb protein family. Rev-Erbβ, like Rev-Erbα, belongs to the nuclear receptor superfamily of transcription factors and can modulate gene expression through binding to gene promoters. Together with Rev-Erbα, Rev-Erbβ functions as a major regulator of the circadian clock. These two proteins are partially redundant. Current research suggests that Rev-Erbβ is less important in maintaining the circadian clock than Rev-Erbα; knock-out studies of Rev-Erbα result in significant circadian disruption but the same has not been found with Rev-Erbβ. Rev-Erbβ compensation for Rev-Erbα varies across tissues, and further research is needed to elucidate the separate role of Rev-Erbβ.
Basic helix-loop-helix ARNT-like protein 1 or aryl hydrocarbon receptor nuclear translocator-like protein 1 (ARNTL), or brain and muscle ARNT-like 1 is a protein that in humans is encoded by the BMAL1 gene on chromosome 11, region p15.3. It is also known as MOP3, and, less commonly, bHLHe5, BMAL, BMAL1C, JAP3, PASD3, and TIC.
The CHASE domain is an extracellular protein domain found in transmembrane receptors from bacteria, lower eukaryotes and plants. It has been named CHASE because of its presence in diverse receptor-like proteins with histidine kinase and nucleotide cyclase domains. The CHASE domain is 200-230 amino acids long and always occurs N-terminally in extracellular or periplasmic locations, followed by an intracellular tail housing diverse enzymatic signalling domains such as histidine kinase, adenyl cyclase, GGDEF-type nucleotide cyclase and EAL-type phosphodiesterase domains, as well as non-enzymatic domains such as PAS, GAF, phosphohistidine and response regulatory domains. The CHASE domain is predicted to bind diverse low molecular weight ligands, such as the cytokinin-like adenine derivatives or peptides, and mediate signal transduction through the respective receptors.
In molecular biology, a response regulator is a protein that mediates a cell's response to changes in its environment as part of a two-component regulatory system. Response regulators are coupled to specific histidine kinases which serve as sensors of environmental changes. Response regulators and histidine kinases are two of the most common gene families in bacteria, where two-component signaling systems are very common; they also appear much more rarely in the genomes of some archaea, yeasts, filamentous fungi, and plants. Two-component systems are not found in metazoans.
In molecular biology, the HAMP domain is an approximately 50-amino acid alpha-helical region that forms a dimeric, four-helical coiled coil. It is found in bacterial sensor and chemotaxis proteins and in eukaryotic histidine kinases. The bacterial proteins are usually integral membrane proteins and part of a two-component signal transduction pathway. One or several copies of the HAMP domain can be found in association with other domains, such as the histidine kinase domain, the bacterial chemotaxis sensory transducer domain, the PAS repeat, the EAL domain, the GGDEF domain, the protein phosphatase 2C-like domain, the guanylate cyclase domain, or the response regulatory domain. In its most common setting, the HAMP domain transmits conformational changes in periplasmic ligand-binding domains to cytoplasmic signalling kinase and methyl-acceptor domains and thus regulates the phosphorylation or methylation activity of homodimeric receptors.
Cycle (cyc) is a gene in Drosophila melanogaster that encodes the CYCLE protein (CYC). The Cycle gene (cyc) is expressed in a variety of cell types in a circadian manner. It is involved in controlling both the sleep-wake cycle and circadian regulation of gene expression by promoting transcription in a negative feedback mechanism. The cyc gene is located on the left arm of chromosome 3 and codes for a transcription factor containing a basic helix–loop–helix (bHLH) domain and a PAS domain. The 2.17 kb cyc gene is divided into 5 coding exons totaling 1,625 base pairs, which code for 413 amino acid residues. Currently 19 alleles are known for cyc. Orthologs performing the same function in other species include basic helix-loop-helix ARNT-like protein 1 (ARNTL) and aryl hydrocarbon receptor nuclear translocator-like 2 (ARNTL2).
Paul Hardin is an American scientist in the field of chronobiology and a pioneering researcher in the understanding of circadian clocks in flies and mammals. Hardin currently serves as a distinguished professor in the biology department at Texas A&M University. He is best known for his discovery of circadian oscillations in the mRNA of the clock gene Period (per), the importance of the E-Box in per activation, the interlocked feedback loops that control rhythms in activator gene transcription, and the circadian regulation of olfaction in Drosophila melanogaster. Born in a suburb of Chicago, Matteson, Illinois, Hardin currently resides in College Station, Texas, with his wife and three children.
Transcription-translation feedback loop (TTFL) is a cellular model for explaining circadian rhythms in behavior and physiology. Widely conserved across species, the TTFL is auto-regulatory, in which transcription of clock genes is regulated by their own protein products.
dClock (clk) is a gene located on the 3L chromosome of Drosophila melanogaster. Mapping and cloning of the gene indicates that it is the Drosophila homolog of the mouse gene CLOCK (mClock). The Jrk mutation disrupts the transcription cycling of per and tim and manifests dominant effects.
Regulator of CO Metabolism (RcoM) is a heme-containing transcription factor found in bacteria that senses carbon monoxide (CO). In the presence of carbon monoxide, this protein upregulates expression of genes involved in carbon monoxide oxidation or carbon monoxide stress response. RcoM is functionally related to another heme-containing transcription factor, CooA, but RcoM shares no structural relationship with CooA. RcoM is composed of an N-terminal Per-Arnt-Sim (PAS) domain and a C-terminal LytTR domain. The PAS domain binds a single molecule of heme and the LytTR domain binds to DNA upstream of carbon monoxide oxidation genes. The RcoM homolog from Paraburkholderia xenovorans is known to be dimeric and binds heme using a histidine and a methionine ligand in the Fe(II) oxidation state. Carbon monoxide replaces the methionine ligand and binds directly to the heme to activate RcoM for DNA binding. Relative to other heme-containing proteins, RcoM has an extraordinarily high CO affinity, with a Kd < 100 pM, allowing this protein to sense very low levels of carbon monoxide.
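To give a sense of what a dissociation constant below 100 pM implies, the following Python sketch evaluates fractional occupancy under a simple one-site equilibrium binding model, θ = [L] / ([L] + Kd). This model is an assumption made here for illustration, not a description of how RcoM was characterized: it treats the free CO concentration as equal to the stated concentration and ignores cooperativity. Because the source gives only an upper bound on Kd, the occupancies computed with 100 pM are lower bounds under this model.

```python
def fraction_bound(ligand_molar: float, kd_molar: float) -> float:
    """Fractional occupancy for one-site equilibrium binding: [L] / ([L] + Kd)."""
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_molar / (ligand_molar + kd_molar)


# Upper bound on Kd quoted in the text (100 pM), expressed in mol/L.
KD_UPPER_BOUND = 100e-12

if __name__ == "__main__":
    # Hypothetical free CO concentrations chosen only to span the Kd value.
    for co in (10e-12, 100e-12, 1e-9, 10e-9):
        theta = fraction_bound(co, KD_UPPER_BOUND)
        print(f"[CO] = {co:.0e} M -> at least {theta:.2f} of sites occupied")
```

Under these assumptions, half of the binding sites are occupied when the free CO concentration equals Kd, so a Kd below 100 pM means half-saturation is reached at sub-nanomolar CO.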