Structural gene

A structural gene is a gene that codes for any RNA or protein product other than a regulatory factor (i.e. regulatory protein). A term derived from the lac operon, structural genes are typically viewed as those containing sequences of DNA corresponding to the amino acids of a protein that will be produced, as long as said protein does not function to regulate gene expression. Structural gene products include enzymes and structural proteins. Also encoded by structural genes are non-coding RNAs, such as rRNAs and tRNAs (but excluding any regulatory miRNAs and siRNAs).

Placement in the genome

In prokaryotes, structural genes of related function are typically adjacent to one another on a single strand of DNA, forming an operon. This permits simpler regulation of gene expression, as a single regulatory factor can affect transcription of all associated genes. This is best illustrated by the well-studied lac operon, in which three structural genes (lacZ, lacY, and lacA) are all regulated by a single promoter and a single operator. Prokaryotic structural genes are transcribed into a polycistronic mRNA and subsequently translated. [1]
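The shared control described above can be sketched as a toy data model. This is a minimal illustration only: the `Operon` class, its field names, and the labels `lacP` and `lacO` are assumptions made for this example, and the model deliberately ignores other inputs to lac regulation such as glucose levels.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Operon:
    """Toy model: one promoter and one operator control several structural genes."""
    promoter: str
    operator: str
    structural_genes: Tuple[str, ...]

    def transcribe(self, repressor_bound: bool) -> Tuple[str, ...]:
        """Return the genes carried on the single polycistronic transcript.

        If the repressor occupies the operator, none of the genes is
        transcribed; otherwise all of them are transcribed together.
        """
        return () if repressor_bound else self.structural_genes


# Hypothetical labels for the promoter and operator; gene names are from the text.
lac = Operon(promoter="lacP", operator="lacO",
             structural_genes=("lacZ", "lacY", "lacA"))

repressed = lac.transcribe(repressor_bound=True)
induced = lac.transcribe(repressor_bound=False)
```

The point of the sketch is that a single regulatory state switches all three structural genes on or off as a unit.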

In eukaryotes, structural genes are not sequentially placed. Each gene is instead composed of coding exons and interspersed non-coding introns. Regulatory sequences are typically found in non-coding regions upstream and downstream from the gene. Structural gene transcripts must be spliced prior to translation to remove intronic sequences. This in turn lends itself to the eukaryotic phenomenon of alternative splicing, in which a single pre-mRNA from a single structural gene can produce several different proteins depending on which exons are included. Despite the complexity of this process, it is estimated that up to 94% of human genes are alternatively spliced. [2] Furthermore, different splicing patterns occur in different tissue types. [3]
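How exon choice multiplies the products of one gene can be illustrated with a short sketch. The exon sequences below are invented toy data, and the function `spliced_isoforms` is a hypothetical helper written for this example. It enumerates every combination of optional exons, whereas real splicing is regulated and only some combinations are actually produced.

```python
from itertools import combinations


def spliced_isoforms(exons, constitutive):
    """List transcripts that keep all constitutive exons plus any subset
    of the remaining (optional) exons, always in genomic order."""
    optional = [i for i in range(len(exons)) if i not in constitutive]
    isoforms = []
    for k in range(len(optional) + 1):
        for chosen in combinations(optional, k):
            kept = sorted(set(constitutive) | set(chosen))
            isoforms.append("".join(exons[i] for i in kept))
    return isoforms


# Invented exon sequences; introns are assumed to be already removed.
exons = ["ATGGCC", "GAAACC", "TTTGGA", "CCCTAA"]
isoforms = spliced_isoforms(exons, constitutive={0, 3})
```

With two optional exons, this toy gene yields four distinct transcripts, ranging from the shortest (first and last exon only) to the full-length form.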

Genes for histone proteins, which lack introns entirely, are an exception to this layout in eukaryotes. [4] Also distinct are the rDNA clusters of structural genes, in which the 28S, 5.8S, and 18S sequences are adjacent and separated by short internal transcribed spacers. Likewise, the 45S rDNA occurs at five distinct places in the genome but is clustered into adjacent repeats. In eubacteria these genes are organized into operons. However, in archaebacteria these genes are non-adjacent and exhibit no linkage. [5]

Role in human disease

The identification of the genetic basis for the causative agent of a disease can be an important component of understanding its effects and spread. The location and content of structural genes can elucidate the evolution of virulence, [6] as well as provide necessary information for treatment. Likewise, understanding the specific changes in structural gene sequences underlying a gain or loss of virulence aids in understanding the mechanism by which diseases affect their hosts. [7]

For example, Yersinia pestis (the causative agent of bubonic plague) was found to carry several virulence and inflammation-related structural genes on plasmids. [8] Likewise, the structural gene for tetanus neurotoxin was determined to be carried on a plasmid as well. [9] Diphtheria is caused by a bacterium, but only after that bacterium has been infected by a bacteriophage carrying the structural gene for the toxin. [10]

In Herpes simplex virus, the structural gene sequence responsible for virulence was found in two locations in the genome despite only one location actually producing the viral gene product. This was hypothesized to serve as a potential mechanism for strains to regain virulence if lost through mutation. [11]

Understanding the specific changes in structural genes underlying a gain or loss of virulence is a necessary step in the formation of specific treatments, as well as in the study of possible medicinal uses of toxins. [10]

Phylogenetics

As far back as 1974, DNA sequence similarity was recognized as a valuable tool for determining relationships among taxa. [12] Structural genes in general are more highly conserved due to functional constraint, and so can prove useful in examinations of more disparate taxa. Original analyses enriched samples for structural genes via hybridization to mRNA. [13]

More recent phylogenetic approaches have focused on structural genes of known function, conserved to varying degrees. rRNA sequences are frequent targets, as they are conserved in all species. [14] Microbiology has specifically targeted the 16S rRNA gene to determine species-level differences. [15] In higher-order taxa, COI is now considered the “barcode of life,” and is applied for most biological identification. [16]
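The idea behind marker-based identification can be sketched as a comparison of one conserved gene across taxa. The example below is a simplified illustration, not the method of any particular tool: the sequences and taxon names are invented, the sequences are assumed to be already aligned and of equal length, and `percent_identity` and `closest_reference` are hypothetical helper names.

```python
def percent_identity(a, b):
    """Percentage of matching positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def closest_reference(query, references):
    """Name of the reference sequence most similar to the query."""
    return max(references, key=lambda name: percent_identity(query, references[name]))


# Invented marker-gene fragments standing in for reference sequences.
references = {
    "taxon_A": "ACGTACGTAA",
    "taxon_B": "ACGAACCTAA",
}
query = "ACGTACGTAC"

best_match = closest_reference(query, references)
identity_to_a = percent_identity(query, references["taxon_A"])
```

In practice, sequences must first be aligned and are compared against large curated reference sets; the sketch only shows why a conserved structural gene makes such a comparison possible.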

Debate

Despite the widespread classification of genes as either structural or regulatory, these categories are not an absolute division. Recent genetic discoveries call into question the distinction between regulatory and structural genes. [17]

The distinction between regulatory and structural genes can be attributed to the original 1959 work on lac operon protein expression. [18] In this instance, a single regulatory protein was detected that affected the transcription of the other proteins now known to compose the lac operon. From this point forward, the two types of coding sequences were separated. [18]

However, increasing discoveries of gene regulation suggest greater complexity. Structural gene expression is regulated by numerous factors including epigenetics (e.g. methylation), RNAi, and more. Regulatory and structural genes can be epigenetically regulated identically, so not all regulation is coded for by “regulatory genes”. [17]

There are also examples of proteins that do not decidedly fit either category, such as chaperone proteins. These proteins aid in the folding of other proteins, a seemingly regulatory role. [19] [20] Yet these same proteins also aid in the movement of their chaperoned proteins across membranes, [21] and have now been implicated in immune responses (see Hsp60) [22] and in the apoptotic pathway (see Hsp70). [23]

More recently, microRNAs were found to be produced from the internal transcribed spacers of rRNA genes. [24] Thus an internal component of a structural gene is, in fact, regulatory. Binding sites for microRNAs were also detected within the coding sequences of genes. Typically, interfering RNAs target the 3′ UTR, but the inclusion of binding sites within the protein-coding sequence itself allows the transcripts of these proteins to effectively regulate the microRNAs within the cell. This interaction was demonstrated to have an effect on expression, and thus again a structural gene contains a regulatory component. [25]
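A rough sense of how a binding site can sit inside a coding sequence comes from a simple seed-match search. The sketch below assumes the common simplification that a site is a perfect match to the reverse complement of microRNA positions 2 to 8 (the seed); both RNA sequences are invented for illustration, and `seed_site` and `find_sites` are hypothetical helper names. Real target prediction considers many additional features.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """mRNA sequence (5'->3') that pairs with microRNA positions 2-8."""
    seed = mirna[1:8]
    return "".join(COMPLEMENT[n] for n in reversed(seed))


def find_sites(mrna, mirna):
    """0-based start positions of perfect seed matches in the mRNA."""
    site = seed_site(mirna)
    return [i for i in range(len(mrna) - len(site) + 1)
            if mrna[i:i + len(site)] == site]


# Invented sequences: a toy microRNA and a toy coding sequence
# (start codon AUG ... stop codon UAA) that contains one seed match.
mirna = "UACGGAUCCAUGCAAGUCUGAA"
coding_sequence = "AUGGCCGAUCCGUGCAAAUAA"

site = seed_site(mirna)
positions = find_sites(coding_sequence, mirna)
```

Here the match lies between the start and stop codons, that is, within the protein-coding region rather than in an untranslated region.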

Related Research Articles

Promoter (genetics): Region of DNA encouraging transcription

In genetics, a promoter is a sequence of DNA to which proteins bind to initiate transcription of a single RNA transcript from the DNA downstream of the promoter. The RNA transcript may encode a protein (mRNA), or can have a function in and of itself, such as tRNA or rRNA. Promoters are located near the transcription start sites of genes, upstream on the DNA. Promoters can be about 100–1000 base pairs long, the sequence of which is highly dependent on the gene and product of transcription, type or class of RNA polymerase recruited to the site, and species of organism.

Gene expression: Conversion of a gene's sequence into a mature gene product or products

Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product that enables it to produce end products, proteins or non-coding RNA, and ultimately affect a phenotype. These products are often proteins, but in non-protein-coding genes such as transfer RNA (tRNA) and small nuclear RNA (snRNA), the product is a functional non-coding RNA. Gene expression is summarized in the central dogma of molecular biology first formulated by Francis Crick in 1958, further developed in his 1970 article, and expanded by the subsequent discoveries of reverse transcription and RNA replication.

In genetics, an operon is a functioning unit of DNA containing a cluster of genes under the control of a single promoter. The genes are transcribed together into an mRNA strand and either translated together in the cytoplasm, or undergo splicing to create monocistronic mRNAs that are translated separately, i.e. several strands of mRNA that each encode a single gene product. The result of this is that the genes contained in the operon are either expressed together or not at all. Several genes must be co-transcribed to define an operon.

Non-coding RNA: Class of ribonucleic acid that is not translated into proteins

A non-coding RNA (ncRNA) is a functional RNA molecule that is not translated into a protein. The DNA sequence from which a functional non-coding RNA is transcribed is often called an RNA gene. Abundant and functionally important types of non-coding RNAs include transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs), as well as small RNAs such as microRNAs, siRNAs, piRNAs, snoRNAs, snRNAs, exRNAs, scaRNAs and the long ncRNAs such as Xist and HOTAIR.

A regulatory sequence is a segment of a nucleic acid molecule which is capable of increasing or decreasing the expression of specific genes within an organism. Regulation of gene expression is an essential feature of all living organisms and viruses.

Lac repressor: DNA-binding protein

The lac repressor (LacI) is a DNA-binding protein that inhibits the expression of genes coding for proteins involved in the metabolism of lactose in bacteria. These genes are repressed when lactose is not available to the cell, ensuring that the bacterium only invests energy in the production of machinery necessary for uptake and utilization of lactose when lactose is present. When lactose becomes available, it is firstly converted into allolactose by β-Galactosidase (lacZ) in bacteria. The DNA binding ability of lac repressor bound with allolactose is inhibited due to allosteric regulation, thereby genes coding for proteins involved in lactose uptake and utilization can be expressed.

lac operon: Set of genes encoding proteins and enzymes for lactose metabolism

The lactose operon is an operon required for the transport and metabolism of lactose in E. coli and many other enteric bacteria. Although glucose is the preferred carbon source for most bacteria, the lac operon allows for the effective digestion of lactose when glucose is not available through the activity of beta-galactosidase. Gene regulation of the lac operon was the first genetic regulatory mechanism to be understood clearly, so it has become a foremost example of prokaryotic gene regulation. It is often discussed in introductory molecular and cellular biology classes for this reason. This lactose metabolism system was used by François Jacob and Jacques Monod to determine how a biological cell knows which enzyme to synthesize. Their work on the lac operon won them the Nobel Prize in Physiology or Medicine in 1965.

Regulation of gene expression: Modifying mechanisms used by cells to increase or decrease the production of specific gene products

Regulation of gene expression, or gene regulation, includes a wide range of mechanisms that are used by cells to increase or decrease the production of specific gene products. Sophisticated programs of gene expression are widely observed in biology, for example to trigger developmental pathways, respond to environmental stimuli, or adapt to new food sources. Virtually any step of gene expression can be modulated, from transcriptional initiation, to RNA processing, and to the post-translational modification of a protein. Often, one gene regulator controls another, and so on, in a gene regulatory network.

Repressor: Sort of RNA-binding protein in molecular genetics

In molecular genetics, a repressor is a DNA- or RNA-binding protein that inhibits the expression of one or more genes by binding to the operator or associated silencers. A DNA-binding repressor blocks the attachment of RNA polymerase to the promoter, thus preventing transcription of the genes into messenger RNA. An RNA-binding repressor binds to the mRNA and prevents translation of the mRNA into protein. This blocking or reducing of expression is called repression.

Silencer (genetics): Type of DNA sequence

In genetics, a silencer is a DNA sequence capable of binding transcription regulation factors, called repressors. DNA contains genes and provides the template to produce messenger RNA (mRNA). That mRNA is then translated into proteins. When a repressor protein binds to the silencer region of DNA, RNA polymerase is prevented from transcribing the DNA sequence into RNA. With transcription blocked, the translation of RNA into proteins is impossible. Thus, silencers prevent genes from being expressed as proteins.

Regulator gene

A regulator gene, regulator, or regulatory gene is a gene involved in controlling the expression of one or more other genes. Regulatory sequences, which encode regulatory genes, are often at the five prime end (5') to the start site of transcription of the gene they regulate. In addition, these sequences can also be found at the three prime end (3') to the transcription start site. In both cases, whether the regulatory sequence occurs before (5') or after (3') the gene it regulates, the sequence is often many kilobases away from the transcription start site. A regulator gene may encode a protein, or it may work at the level of RNA, as in the case of genes encoding microRNAs. An example of a regulator gene is a gene that codes for a repressor protein that inhibits the activity of an operator.

In molecular biology, an inducer is a molecule that regulates gene expression.

Cis-regulatory elements (CREs) or Cis-regulatory modules (CRMs) are regions of non-coding DNA which regulate the transcription of neighboring genes. CREs are vital components of genetic regulatory networks, which in turn control morphogenesis, the development of anatomy, and other aspects of embryonic development, studied in evolutionary developmental biology.

Gene: Sequence of DNA or RNA that codes for an RNA or protein product

In biology, the word gene can have several different meanings. The Mendelian gene is a basic unit of heredity and the molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and noncoding genes.

Gene structure is the organisation of specialised sequence elements within a gene. Genes contain most of the information necessary for living cells to survive and reproduce. In most organisms, genes are made of DNA, where the particular DNA sequence determines the function of the gene. A gene is transcribed (copied) from DNA into RNA, which can either be non-coding (ncRNA) with a direct function, or an intermediate messenger (mRNA) that is then translated into protein. Each of these steps is controlled by specific sequence elements, or regions, within the gene. Every gene, therefore, requires multiple sequence elements to be functional. This includes the sequence that actually encodes the functional protein or ncRNA, as well as multiple regulatory sequence regions. These regions may be as short as a few base pairs, up to many thousands of base pairs long.

fis: E. coli gene

fis is an E. coli gene encoding the Fis protein. The regulation of this gene is more complex than most other genes in the E. coli genome, as Fis is an important protein which regulates expression of other genes. It is supposed that fis is regulated by H-NS, IHF and CRP. It also regulates its own expression (autoregulation). Fis is one of the most abundant DNA binding proteins in Escherichia coli under nutrient-rich growth conditions.

The gal operon is a prokaryotic operon, which encodes enzymes necessary for galactose metabolism. Repression of gene expression for this operon works via binding of repressor molecules to two operators. These repressors dimerize, creating a loop in the DNA. The loop as well as hindrance from the external operator prevent RNA polymerase from binding to the promoter, and thus prevent transcription. Additionally, since the metabolism of galactose in the cell is involved in both anabolic and catabolic pathways, a novel regulatory system using two promoters for differential repression has been identified and characterized within the context of the gal operon.

Bacterial small RNAs (sRNAs) are an important class of regulatory molecules in bacteria such as Brucella. They are often bound to the chaperone protein Hfq, which allows them to interact with mRNA(s). In Brucella suis 1330, RNA sequencing identified a novel list of 33 sRNAs and 62 Hfq-associated mRNAs. In Brucella melitensis, eight novel sRNA genes were identified using bioinformatic and experimental approaches. One of them, BSR0602, was found to modulate the intracellular survival of B. melitensis. In another large-scale deep sequencing study, 1321 sRNAs were identified in B. melitensis. The sRNA BSR0441 was further investigated in this study and shown to play a role in intracellular survival. The sRNA BM-sr0117 from Brucella melitensis was identified and shown to be bound to and cleaved by Bm-RNase III. AbcR and AbcR2 were studied in B. abortus. Seven novel sRNAs were validated and their interaction with a putative target sequence was verified in B. abortus.

Transcription-translation coupling is a mechanism of gene expression regulation in which synthesis of an mRNA (transcription) is affected by its concurrent decoding (translation). In prokaryotes, mRNAs are translated while they are transcribed. This allows communication between RNA polymerase, the multisubunit enzyme that catalyzes transcription, and the ribosome, which catalyzes translation. Coupling involves both direct physical interactions between RNA polymerase and the ribosome, as well as ribosome-induced changes to the structure and accessibility of the intervening mRNA that affect transcription.

References

  1. Müller-Hill, Benno (1996-01-01). The Lac Operon: A Short History of a Genetic Paradigm. Walter de Gruyter. ISBN 9783110148305.
  2. Wang, Eric T.; Sandberg, Rickard; Luo, Shujun; Khrebtukova, Irina; Zhang, Lu; Mayr, Christine; Kingsmore, Stephen F.; Schroth, Gary P.; Burge, Christopher B. (2008). "Alternative isoform regulation in human tissue transcriptomes". Nature. 456 (7221): 470–476. Bibcode:2008Natur.456..470W. doi:10.1038/nature07509. PMC 2593745. PMID 18978772.
  3. Yeo, Gene; Holste, Dirk; Kreiman, Gabriel; Burge, Christopher B. (2004-01-01). "Variation in alternative splicing across human tissues". Genome Biology. 5 (10): R74. doi:10.1186/gb-2004-5-10-r74. ISSN 1474-760X. PMC 545594. PMID 15461793.
  4. Makałowski, W. (2001-01-01). "The human genome structure and organization". Acta Biochimica Polonica. 48 (3): 587–598. doi:10.18388/abp.2001_3893. ISSN 0001-527X. PMID 11833767.
  5. Tu, J; Zillig, W (1982-11-25). "Organization of rRNA structural genes in the archaebacterium Thermoplasma acidophilum". Nucleic Acids Research. 10 (22): 7231–7245. doi:10.1093/nar/10.22.7231. ISSN 0305-1048. PMC 327000. PMID 7155894.
  6. Sreevatsan, Srinand; Pan, Xi; Stockbauer, Kathryn E.; Connell, Nancy D.; Kreiswirth, Barry N.; Whittam, Thomas S.; Musser, James M. (1997-09-02). "Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination". Proceedings of the National Academy of Sciences. 94 (18): 9869–9874. Bibcode:1997PNAS...94.9869S. doi:10.1073/pnas.94.18.9869. ISSN 0027-8424. PMC 23284. PMID 9275218.
  7. Maharaj, Payal D.; Anishchenko, Michael; Langevin, Stanley A.; Fang, Ying; Reisen, William K.; Brault, Aaron C. (2012-01-01). "Structural gene (prME) chimeras of St Louis encephalitis virus and West Nile virus exhibit altered in vitro cytopathic and growth phenotypes". Journal of General Virology. 93 (1): 39–49. doi:10.1099/vir.0.033159-0. PMC 3352334. PMID 21940408.
  8. Brubaker, Robert R. (2007-08-01). "How the structural gene products of Yersinia pestis relate to virulence". Future Microbiology. 2 (4): 377–385. doi:10.2217/17460913.2.4.377. ISSN 1746-0921. PMID 17683274.
  9. Finn, C. W.; Silver, R. P.; Habig, W. H.; Hardegree, M. C.; Zon, G.; Garon, C. F. (1984-05-25). "The structural gene for tetanus neurotoxin is on a plasmid". Science. 224 (4651): 881–884. Bibcode:1984Sci...224..881F. doi:10.1126/science.6326263. ISSN 0036-8075. PMID 6326263.
  10. Greenfield, L.; Bjorn, M. J.; Horn, G.; Fong, D.; Buck, G. A.; Collier, R. J.; Kaplan, D. A. (1983-11-01). "Nucleotide sequence of the structural gene for diphtheria toxin carried by corynebacteriophage beta". Proceedings of the National Academy of Sciences of the United States of America. 80 (22): 6853–6857. Bibcode:1983PNAS...80.6853G. doi:10.1073/pnas.80.22.6853. ISSN 0027-8424. PMC 390084. PMID 6316330.
  11. Knipe, David; Ruyechan, William; Honess, Robert; Roizman, Bernard (1979). "Molecular genetics of Herpes Simplex Virus: The terminal sequences of the L and S components are obligatorily identical and constitute a part of structural gene mapping predominantly in the S component" (PDF). Proceedings of the National Academy of Sciences of the United States of America. 76 (9): 4534–4538. Bibcode:1979PNAS...76.4534K. doi:10.1073/pnas.76.9.4534. PMC 411612. PMID 228300.
  12. Moore, R. L. (1974-01-01). "Nucleic Acid Reassociation as a Guide to Genetic Relatedness among Bacteria". Modern Aspects of Electrochemistry. pp. 105–128. doi:10.1007/978-3-642-65848-8_4. ISBN 978-3-642-65850-1. ISSN 0070-217X. PMID 4602647.
  13. Angerer, R. C.; Davidson, E. H.; Britten, R. J. (1976-07-08). "Single copy DNA and structural gene sequence relationships among four sea urchin species". Chromosoma. 56 (3): 213–226. doi:10.1007/bf00293186. ISSN 0009-5915. PMID 964102. S2CID 26007034.
  14. Pruesse, E.; Quast, C.; Knittel, K.; Fuchs, B. M.; Ludwig, W.; Peplies, J.; Glockner, F. O. (2007-12-01). "SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB". Nucleic Acids Research. 35 (21): 7188–7196. doi:10.1093/nar/gkm864. ISSN 0305-1048. PMC 2175337. PMID 17947321.
  15. Chun, Jongsik; Lee, Jae-Hak; Jung, Yoonyoung; Kim, Myungjin; Kim, Seil; Kim, Byung Kwon; Lim, Young-Woon (2007-01-01). "EzTaxon: a web-based tool for the identification of prokaryotes based on 16S ribosomal RNA gene sequences". International Journal of Systematic and Evolutionary Microbiology. 57 (10): 2259–2261. doi:10.1099/ijs.0.64915-0. PMID 17911292.
  16. Hebert, Paul D. N.; Cywinska, Alina; Ball, Shelley L.; deWaard, Jeremy R. (2003-02-07). "Biological identifications through DNA barcodes". Proceedings of the Royal Society of London B: Biological Sciences. 270 (1512): 313–321. doi:10.1098/rspb.2002.2218. ISSN 0962-8452. PMC 1691236. PMID 12614582.
  17. Piro, Rosario Michael (2011-03-29). "Are all genes regulatory genes?". Biology & Philosophy. 26 (4): 595–602. doi:10.1007/s10539-011-9251-9. ISSN 0169-3867. S2CID 16289510.
  18. Pardee, Arthur B.; Jacob, François; Monod, Jacques (1959-06-01). "The genetic control and cytoplasmic expression of "Inducibility" in the synthesis of β-galactosidase by E. coli". Journal of Molecular Biology. 1 (2): 165–178. doi:10.1016/S0022-2836(59)80045-0.
  19. Hendrick, J. P.; Hartl, F. U. (1995-12-01). "The role of molecular chaperones in protein folding". FASEB Journal. 9 (15): 1559–1569. doi:10.1096/fasebj.9.15.8529835. ISSN 0892-6638. PMID 8529835. S2CID 33498269.
  20. Saibil, Helen (2013-10-01). "Chaperone machines for protein folding, unfolding and disaggregation". Nature Reviews Molecular Cell Biology. 14 (10): 630–642. doi:10.1038/nrm3658. ISSN 1471-0072. PMC 4340576. PMID 24026055.
  21. Koll, H.; Guiard, B.; Rassow, J.; Ostermann, J.; Horwich, A. L.; Neupert, W.; Hartl, F. U. (1992-03-20). "Antifolding activity of hsp60 couples protein import into the mitochondrial matrix with export to the intermembrane space" (PDF). Cell. 68 (6): 1163–1175. doi:10.1016/0092-8674(92)90086-r. ISSN 0092-8674. PMID 1347713. S2CID 7430067.
  22. Hansen, Jens J.; Bross, Peter; Westergaard, Majken; Nielsen, Marit Nyholm; Eiberg, Hans; Børglum, Anders D.; Mogensen, Jens; Kristiansen, Karsten; Bolund, Lars (2003-01-01). "Genomic structure of the human mitochondrial chaperonin genes: HSP60 and HSP10 are localised head to head on chromosome 2 separated by a bidirectional promoter". Human Genetics. 112 (1): 71–77. doi:10.1007/s00439-002-0837-9. ISSN 0340-6717. PMID 12483302. S2CID 25856774.
  23. Cappello, Francesco; Di Stefano, Antonino; David, Sabrina; Rappa, Francesco; Anzalone, Rita; La Rocca, Giampiero; D'Anna, Silvestro E.; Magno, Francesca; Donner, Claudio F. (2006-11-15). "Hsp60 and Hsp10 down-regulation predicts bronchial epithelial carcinogenesis in smokers with chronic obstructive pulmonary disease". Cancer. 107 (10): 2417–2424. doi:10.1002/cncr.22265. ISSN 0008-543X. PMID 17048249.
  24. Son, Dong Ju; Kumar, Sandeep; Takabe, Wakako; Kim, Chan Woo; Ni, Chih-Wen; Alberts-Grill, Noah; Jang, In-Hwan; Kim, Sangok; Kim, Wankyu (2013-12-18). "The atypical mechanosensitive microRNA-712 derived from pre-ribosomal RNA induces endothelial inflammation and atherosclerosis". Nature Communications. 4: 3000. Bibcode:2013NatCo...4.3000S. doi:10.1038/ncomms4000. ISSN 2041-1723. PMC 3923891. PMID 24346612.
  25. Forman, Joshua J.; Coller, Hilary A. (2010-04-15). "The code within the code: microRNAs target coding regions". Cell Cycle. 9 (8): 1533–1541. doi:10.4161/cc.9.8.11202. ISSN 1538-4101. PMC 2936675. PMID 20372064.