RegulonDB

Content
Description: Transcriptional regulation of Escherichia coli K-12
Organisms: Escherichia coli K-12
Contact
Research center: Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México
Authors: Gama-Castro et al.
Primary citation: Gama-Castro et al. (2015) [1]
Release date: 2017
Access
Website: RegulonDB
Miscellaneous
Version: 9.4

RegulonDB is a database of the regulatory network of gene expression in Escherichia coli K-12. [1] [2] RegulonDB also models the organization of genes into transcription units, operons and regulons. It also includes 120 sRNAs with 231 interactions that together regulate 192 genes. RegulonDB was founded in 1998 and contributes data to the EcoCyc database.

Transcription factors and sensory-response units

In bacteria such as E. coli, genes are regulated by sequence elements in promoters and related binding sites. RegulonDB provides a database of such regulatory elements, their binding sites and the transcription factors that bind to these sites in E. coli. RegulonDB 9.0 includes 184 experimentally determined transcription factors (TFs) and 120 computationally predicted TFs, for a total of 304.

The complete repertoire of 189 genetic sensory-response units (GENSOR units) is reported, integrating their signal, regulatory interactions, and metabolic pathways. A total of 78 GENSOR units have their four components highlighted, 119 include the genetic switch and the response, and 2 contain only the genetic switch.

A total of 103 TFs have a known effector in RegulonDB, including 25 two-component systems. There were enough sites to build a motif for 93 TFs, and these motifs were used to infer 16,207 predicted TF binding sites. This set of predicted binding sites corresponds to 12,574 TF → gene regulatory interactions, which represents a recovery of 52% of the 1,592 annotated regulatory interactions in the database for the 93 TFs for which RegulonDB has a position-weight matrix (PWM). If only TFs with a good-quality PWM are taken into account, the total number of predicted TF → gene interactions is 8,714, recovering 672 (57%) of the annotated interactions for this TF subset. Semi-automatic curation produced a total of 3,195 regulatory interactions for 199 TFs.
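The general idea of building a PWM from known sites and scanning a sequence for predicted sites can be sketched as follows. This is a minimal illustration, not RegulonDB's actual pipeline: the log-odds scoring, the pseudocount, the uniform background, the threshold and the toy sites are all assumptions made for the example.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position-weight matrix from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [s[pos] for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, window):
    """Sum the per-position log-odds scores of one sequence window."""
    return sum(col[base] for col, base in zip(pwm, window))

def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        s = score(pwm, sequence[start:start + width])
        if s >= threshold:
            hits.append((start, s))
    return hits

# Toy aligned sites, invented for illustration (not RegulonDB data).
toy_sites = ["TGTGA", "TGTGA", "TGTGC", "TGAGA"]
pwm = build_pwm(toy_sites)
hits = scan(pwm, "AAATGTGACCC", threshold=4.0)
print(hits)
```

In practice the threshold controls the trade-off between recovering annotated interactions and producing extra predictions, which is why the counts above differ when only good-quality PWMs are used.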

Definitions

Check the glossary for all definitions.

Transcription unit (TU)

A transcription unit is a set of one or more genes transcribed from a single promoter. A TU may also include regulatory protein binding sites affecting this promoter and a terminator. A complex operon with several promoters therefore contains several transcription units. At least one transcription unit must include all the genes in an operon.
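One way to picture the relationship between operons and transcription units is as a small data model. The sketch below is an assumption-laden illustration: the class names, field names, and the toy operon with promoters `p1` and `p2` are hypothetical and do not reflect RegulonDB's schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TranscriptionUnit:
    promoter: str                       # a TU has exactly one promoter
    genes: List[str]                    # genes transcribed from that promoter
    terminator: Optional[str] = None
    binding_sites: List[str] = field(default_factory=list)

@dataclass
class Operon:
    name: str
    genes: List[str]
    transcription_units: List[TranscriptionUnit]

    def promoters(self):
        """All promoters of the operon, one per transcription unit."""
        return [tu.promoter for tu in self.transcription_units]

    def has_full_length_tu(self):
        """True if some TU covers every gene of the operon."""
        return any(set(tu.genes) == set(self.genes)
                   for tu in self.transcription_units)

# Hypothetical complex operon: two promoters, hence two transcription units.
toy_operon = Operon(
    name="toy",
    genes=["geneA", "geneB", "geneC"],
    transcription_units=[
        TranscriptionUnit("p1", ["geneA", "geneB", "geneC"], terminator="t1"),
        TranscriptionUnit("p2", ["geneB", "geneC"], terminator="t1"),
    ],
)
print(toy_operon.promoters(), toy_operon.has_full_length_tu())
```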

Promoters and terminators

A promoter is defined in RegulonDB as the nucleotide sequence from 60 bases upstream to 20 bases downstream of the precise transcription initiation site, or +1. Terminators are regions where transcription ends and RNA polymerase dissociates from the DNA.
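The promoter window defined above can be extracted from a genome sequence given the position of +1 and the strand. The sketch below rests on several assumptions that the definition does not state: coordinates are 1-based, +1 is counted as the first of the 20 downstream positions (giving an 80-nucleotide window), and the sequence is treated as circular, as the E. coli chromosome is. The function name and toy genome are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def promoter_window(genome, tss, strand, upstream=60, downstream=20):
    """Return the -60..+20 window around a 1-based transcription start site.

    The window is returned in the direction of transcription, so for the
    reverse strand it is the reverse complement of the genomic sequence.
    """
    n = len(genome)
    i = tss - 1  # 0-based index of +1
    if strand == "+":
        positions = range(i - upstream, i + downstream)
    elif strand == "-":
        positions = range(i - downstream + 1, i + upstream + 1)
    else:
        raise ValueError("strand must be '+' or '-'")
    # Modulo wraps coordinates around the origin of a circular chromosome.
    seq = "".join(genome[p % n] for p in positions)
    if strand == "-":
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq

# Toy genome, invented for illustration.
toy_genome = "ACGT" * 50
fwd = promoter_window(toy_genome, tss=101, strand="+")
rev = promoter_window(toy_genome, tss=101, strand="-")
print(len(fwd), len(rev))
```

With this convention, index 60 of the returned string always corresponds to +1, regardless of strand.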

Binding site

TF binding sites are physical DNA sites recognized by transcription factors within a genome, including enhancer, upstream activator (UAS) and operator sites that may bind repressors or activators.

Graphic display in RegulonDB

The graphic display of an operon contains all the genes of its different transcription units, as well as all the regulatory elements involved in the transcription and regulation of those TUs. An operon is here conceived as a structural unit encompassing all genes and regulatory elements. An operon with several promoters located near each other may also have dual binding sites, indicating that such a site can activate one particular promoter but repress a second one. On the same page, the collection of the different TUs is displayed below the operon. The graphic display of a TU always contains only one promoter (when known) with the binding sites that regulate its activity, followed by the transcribed genes. Note that dual sites are frequently displayed in a TU as either repressors or activators, because the site has a particular effect on the promoter of that TU.
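The difference between the operon view and the TU view of a dual site can be illustrated with a small sketch. The class, its fields, and the names `TF-X`, `p1` and `p2` are hypothetical and only show the idea that a site's effect is defined per promoter.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class RegulatorySite:
    tf: str
    effects: Dict[str, str]  # promoter name -> "activator" or "repressor"

    def is_dual(self) -> bool:
        """Operon view: the site is dual if its effect differs between promoters."""
        return len(set(self.effects.values())) > 1

    def effect_on(self, promoter: str) -> Optional[str]:
        """TU view: a TU has one promoter, so the site shows a single effect."""
        return self.effects.get(promoter)

# Hypothetical site that activates promoter p1 and represses promoter p2.
site = RegulatorySite(tf="TF-X", effects={"p1": "activator", "p2": "repressor"})
print(site.is_dual(), site.effect_on("p1"), site.effect_on("p2"))
```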

Related Research Articles

In genetics, an operon is a functioning unit of DNA containing a cluster of genes under the control of a single promoter. The genes are transcribed together into an mRNA strand and either translated together in the cytoplasm, or undergo splicing to create monocistronic mRNAs that are translated separately, i.e. several strands of mRNA that each encode a single gene product. The result of this is that the genes contained in the operon are either expressed together or not at all. Several genes must be co-transcribed to define an operon.

Lac repressor

The lac repressor (LacI) is a DNA-binding protein that inhibits the expression of genes coding for proteins involved in the metabolism of lactose in bacteria. These genes are repressed when lactose is not available to the cell, ensuring that the bacterium only invests energy in the production of machinery necessary for uptake and utilization of lactose when lactose is present. When lactose becomes available, it is first converted into allolactose by β-galactosidase (LacZ). The DNA-binding ability of the lac repressor is inhibited allosterically when it is bound to allolactose, so that genes coding for proteins involved in lactose uptake and utilization can be expressed.

In molecular biology and genetics, transcriptional regulation is the means by which a cell regulates the conversion of DNA to RNA (transcription), thereby orchestrating gene activity. A single gene can be regulated in a range of ways, from altering the number of copies of RNA that are transcribed, to the temporal control of when the gene is transcribed. This control allows the cell or organism to respond to a variety of intra- and extracellular signals and thus mount a response. Some examples of this include producing the mRNAs that encode enzymes to adapt to a change in a food source, producing the gene products involved in cell cycle specific activities, and producing the gene products responsible for cellular differentiation in multicellular eukaryotes, as studied in evolutionary developmental biology.

A sigma factor is a protein needed for initiation of transcription in bacteria. It is a bacterial transcription initiation factor that enables specific binding of RNA polymerase (RNAP) to gene promoters. It is homologous to archaeal transcription factor B and to eukaryotic factor TFIIB. The specific sigma factor used to initiate transcription of a given gene will vary, depending on the gene and on the environmental signals needed to initiate transcription of that gene. Selection of promoters by RNA polymerase is dependent on the sigma factor that associates with it. Sigma factors are also found in plant chloroplasts as a part of the bacteria-like plastid-encoded polymerase (PEP).

lac operon: Set of genes encoding proteins and enzymes for lactose metabolism

The lactose operon is an operon required for the transport and metabolism of lactose in E. coli and many other enteric bacteria. Although glucose is the preferred carbon source for most bacteria, the lac operon allows for the effective digestion of lactose when glucose is not available through the activity of beta-galactosidase. Gene regulation of the lac operon was the first genetic regulatory mechanism to be understood clearly, so it has become a foremost example of prokaryotic gene regulation. It is often discussed in introductory molecular and cellular biology classes for this reason. This lactose metabolism system was used by François Jacob and Jacques Monod to determine how a biological cell knows which enzyme to synthesize. Their work on the lac operon won them the Nobel Prize in Physiology or Medicine in 1965.

A transcriptional activator is a protein that increases transcription of a gene or set of genes. Activators are considered to have positive control over gene expression, as they function to promote gene transcription and, in some cases, are required for the transcription of genes to occur. Most activators are DNA-binding proteins that bind to enhancers or promoter-proximal elements. The DNA site bound by the activator is referred to as an "activator-binding site". The part of the activator that makes protein–protein interactions with the general transcription machinery is referred to as an "activating region" or "activation domain".

Regulator gene

A regulator gene, regulator, or regulatory gene is a gene involved in controlling the expression of one or more other genes. Regulatory sequences, which encode regulatory genes, are often at the five prime end (5') to the start site of transcription of the gene they regulate. In addition, these sequences can also be found at the three prime end (3') to the transcription start site. In both cases, whether the regulatory sequence occurs before (5') or after (3') the gene it regulates, the sequence is often many kilobases away from the transcription start site. A regulator gene may encode a protein, or it may work at the level of RNA, as in the case of genes encoding microRNAs. An example of a regulator gene is a gene that codes for a repressor protein that inhibits the activity of an operator.

In molecular biology, an inducer is a molecule that regulates gene expression. An inducer can function in two ways: by disabling a repressor, or by binding to an activator.

Cis-regulatory elements (CREs) or Cis-regulatory modules (CRMs) are regions of non-coding DNA which regulate the transcription of neighboring genes. CREs are vital components of genetic regulatory networks, which in turn control morphogenesis, the development of anatomy, and other aspects of embryonic development, studied in evolutionary developmental biology.

Gene structure is the organisation of specialised sequence elements within a gene. Genes contain most of the information necessary for living cells to survive and reproduce. In most organisms, genes are made of DNA, where the particular DNA sequence determines the function of the gene. A gene is transcribed (copied) from DNA into RNA, which can either be non-coding (ncRNA) with a direct function, or an intermediate messenger (mRNA) that is then translated into protein. Each of these steps is controlled by specific sequence elements, or regions, within the gene. Every gene, therefore, requires multiple sequence elements to be functional. This includes the sequence that actually encodes the functional protein or ncRNA, as well as multiple regulatory sequence regions. These regions may be as short as a few base pairs, up to many thousands of base pairs long.

Amino acid synthesis: The set of biochemical processes by which amino acids are produced

Amino acid synthesis is the set of biochemical processes by which the amino acids are produced. The substrates for these processes are various compounds in the organism's diet or growth media. Not all organisms are able to synthesize all amino acids. For example, humans can synthesize 11 of the 20 standard amino acids. These 11 are called the non-essential amino acids.

The L-arabinose operon, also called the ara or araBAD operon, is an operon required for the breakdown of the five-carbon sugar L-arabinose in Escherichia coli. The L-arabinose operon contains three structural genes, araB, araA and araD, which encode three enzymes required for the metabolism of L-arabinose. AraB (ribulokinase), AraA (an isomerase) and AraD (an epimerase), the enzymes produced by these genes, catalyse the conversion of L-arabinose to an intermediate of the pentose phosphate pathway, D-xylulose-5-phosphate.

In molecular genetics, a regulon is a group of genes that are regulated as a unit, generally controlled by the same regulatory gene that expresses a protein acting as a repressor or activator. This terminology is generally, although not exclusively, used in reference to prokaryotes, whose genomes are often organized into operons; the genes contained within a regulon are usually organized into more than one operon at disparate locations on the chromosome. Applied to eukaryotes, the term refers to any group of non-contiguous genes controlled by the same regulatory gene.

The gal operon is a prokaryotic operon, which encodes enzymes necessary for galactose metabolism. Repression of gene expression for this operon works via binding of repressor molecules to two operators. These repressors dimerize, creating a loop in the DNA. The loop as well as hindrance from the external operator prevent RNA polymerase from binding to the promoter, and thus prevent transcription. Additionally, since the metabolism of galactose in the cell is involved in both anabolic and catabolic pathways, a novel regulatory system using two promoters for differential repression has been identified and characterized within the context of the gal operon.

The fnr gene of Escherichia coli encodes a transcriptional activator (FNR) which is required for the expression of a number of genes involved in anaerobic respiratory pathways. The FNR protein of E. coli is an oxygen-responsive transcriptional regulator required for the switch from aerobic to anaerobic metabolism.

"Type III mutants, originally frdB, were designated fnr because they were defective in fumarate and nitrate reduction and impaired in their ability to produce gas." (Lambden and Guest, 1976, Journal of General Microbiology 97, 145-160)

αr35 is a family of bacterial small non-coding RNAs with representatives in a reduced group of Alphaproteobacteria from the order Hyphomicrobiales. The first member of this family (Smr35B) was found in a Sinorhizobium meliloti 1021 locus located in the symbiotic plasmid B (pSymB). Further homology and structure conservation analyses have identified full-length Smr35B homologs in other legume symbionts, as well as in the human and plant pathogens Brucella anthropi and Agrobacterium tumefaciens, respectively. αr35 RNA species are 139-142 nt long and share a common secondary structure consisting of two stem loops and a well-conserved rho-independent terminator. Most of the αr35 transcripts can be catalogued as trans-acting sRNAs expressed from well-defined promoter regions of independent transcription units within intergenic regions of the Alphaproteobacterial genomes.

gab operon

The gab operon is responsible for the conversion of γ-aminobutyrate (GABA) to succinate. The gab operon comprises three structural genes – gabD, gabT and gabP – that encode for a succinate semialdehyde dehydrogenase, GABA transaminase and a GABA permease respectively. There is a regulatory gene csiR, downstream of the operon, that codes for a putative transcriptional repressor and is activated when nitrogen is limiting.

In bacterial genetics, the mal regulon is a regulon (a group of genes under common regulation) associated with the catabolism of maltose and maltodextrins. The system is especially well characterized in the model organism Escherichia coli, where it is classically described as a group of ten genes in multiple operons whose expression is regulated by a single regulatory protein, MalT. MalT binds to maltose or maltodextrin and undergoes a conformational change that allows it to bind DNA at sequences near the promoters of genes required for uptake and catabolism of these sugars. The maltose regulation system in E. coli is a classic example of positive regulation. The malT gene is regulated by catabolite repression via the catabolite activator protein. Genes under the control of MalT include ATP-binding cassette transporter components, maltoporin, maltose binding protein, and several enzymes. Other Gram-negative bacteria such as Klebsiella pneumoniae have additional genes under the control of MalT.

Julio Collado-Vides is a Guatemalan scientist and Professor of Computational Genomics at the National Autonomous University of Mexico. His research focuses on genomics and bioinformatics.

Pho regulon

The Phosphate (Pho) regulon is a regulatory mechanism used for the conservation and management of inorganic phosphate within the cell. It was first discovered in Escherichia coli as an operating system for the bacterial strain, and was later identified in other species. The Pho system is composed of various components including extracellular enzymes and transporters that are capable of phosphate assimilation in addition to extracting inorganic phosphate from organic sources. This is an essential process since phosphate plays an important role in cellular membranes, genetic expression, and metabolism within the cell. Under low nutrient availability, the Pho regulon helps the cell survive and thrive despite a depletion of phosphate within the environment. When this occurs, phosphate starvation-inducible (psi) genes activate other proteins that aid in the transport of inorganic phosphate.

References

  1. Gama-Castro S, Salgado H, Peralta-Gil M, Santos-Zavaleta A, Muñiz-Rascado L, Solano-Lira H, Jimenez-Jacinto V, Weiss V, García-Sotelo JS, López-Fuentes A, Porrón-Sotelo L, Alquicira-Hernández S, Medina-Rivera A, Martínez-Flores I, Alquicira-Hernández K, Martínez-Adame R, Bonavides-Martínez C, Miranda-Ríos J, Huerta AM, Mendoza-Vargas A, Collado-Torres L, Taboada B, Vega-Alvarado L, Olvera M, Olvera L, Grande R, Morett E, Collado-Vides J (Jan 2011). "RegulonDB version 7.0: transcriptional regulation of Escherichia coli K-12 integrated within genetic sensory response units (Gensor Units)". Nucleic Acids Res. 39 (Database issue): D98-105. doi:10.1093/nar/gkq1110. PMC 3013702. PMID 21051347.
  2. Gama-Castro S, Salgado H, Santos-Zavaleta A, Ledezma-Tejeida D, Muñiz-Rascado L, García-Sotelo JS, Alquicira-Hernández K, Martínez-Flores I, Pannier L (Jan 2016). "RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond". Nucleic Acids Res. 44 (D1): D133–143. doi:10.1093/nar/gkv1156. ISSN 1362-4962. PMC 4702833. PMID 26527724.