Kozak consensus sequence

The Kozak consensus sequence (Kozak consensus or Kozak sequence) is a nucleic acid motif that functions as the protein translation initiation site in most eukaryotic mRNA transcripts. [1] Regarded as the optimum sequence for initiating translation in eukaryotes, it is integral to protein regulation and overall cellular health and has implications in human disease. [1] [2] It ensures that a protein is correctly translated from the genetic message, mediating ribosome assembly and translation initiation. A wrong start site can result in non-functional proteins. [3] As the sequence has been studied further, extensions of the motif, additional bases of importance, and notable exceptions have been described. [1] [4] [5] The sequence was named after the scientist who discovered it, Marilyn Kozak. Kozak discovered the sequence through a detailed analysis of DNA genomic sequences. [6]

The Kozak sequence is not to be confused with the ribosomal binding site (RBS), which is either the 5′ cap of a messenger RNA or an internal ribosome entry site (IRES).

Sequence

The Kozak sequence was determined by sequencing of 699 vertebrate mRNAs and verified by site-directed mutagenesis. [7] While initially limited to a subset of vertebrates (i.e. human, cow, cat, dog, chicken, guinea pig, hamster, mouse, pig, rabbit, sheep, and Xenopus), subsequent studies confirmed its conservation in higher eukaryotes generally. [1] The sequence was defined as 5′-(gcc)gccRccAUGG-3′ (in IUPAC nucleobase notation) where: [7]

  1. AUG is the translation start codon, coding for methionine.
  2. Upper-case letters indicate highly conserved bases, i.e. the 'AUGG' sequence is constant or rarely, if ever, changes. [8]
  3. 'R' indicates that a purine (adenine or guanine) is always observed at this position (with adenine being more frequent according to Kozak).
  4. A lower-case letter denotes the most common base at a position where the base can nevertheless vary.
  5. The sequence in parentheses (gcc) is of uncertain significance.
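
The notation above can be turned into a simple pattern search. The following is a minimal sketch in Python, assuming that upper-case letters are required, that 'R' means A or G, and that lower-case letters place no constraint on the base. The function names `consensus_to_regex` and `find_kozak_sites` are illustrative and do not come from any published tool.

```python
import re

# IUPAC codes needed for the consensus above (a subset of the full table).
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "[AG]", "N": "[ACGU]"}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus such as 'gccRccAUGG' into a regex string.

    Assumption: lower-case letters mark the most common base at a variable
    position, so they are allowed to match any base.
    """
    parts = []
    for ch in consensus:
        if ch.islower():
            parts.append("[ACGU]")   # variable position: any base
        else:
            parts.append(IUPAC[ch])  # conserved base or IUPAC code
    return "".join(parts)


def find_kozak_sites(mrna: str, consensus: str = "gccRccAUGG") -> list:
    """Return 0-based indices of the A of every AUG found in a matching context."""
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    offset = consensus.index("AUG")       # where AUG sits inside the motif
    # A lookahead is used so that overlapping matches are also reported.
    pattern = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    return [m.start() + offset for m in pattern.finditer(seq)]


print(find_kozak_sites("GGGCCACCAUGGCG"))  # [8]
print(find_kozak_sites("GGGCCUCCAUGGCG"))  # []  (U at -3 is not a purine)
```

This treats the consensus as an all-or-nothing pattern, which is stricter than biology: many functional start sites match only part of it.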

The AUG is the initiation codon, encoding a methionine at the N-terminus of the protein. (Rarely, GUG is used as an initiation codon, but methionine is still the first amino acid because it is the Met-tRNA in the initiation complex that binds to the mRNA.) Variation within the Kozak sequence alters its "strength", that is, the favorability of initiation, which affects how much protein is synthesized from a given mRNA. [4] [9]

The A nucleotide of the AUG is designated +1 in mRNA sequences and the preceding base is labeled −1; there is no 0 position. For a 'strong' consensus, the nucleotides at positions +4 (G in the consensus) and −3 (either A or G in the consensus) relative to the +1 nucleotide must both match the consensus. An 'adequate' consensus matches at only one of these sites, while a 'weak' consensus matches at neither. The cc at −1 and −2 are not as conserved, but contribute to the overall strength. [10] There is also evidence that a G at the −6 position is important in the initiation of translation. [4]

While the +4 and −3 positions have the greatest relative importance in establishing a favorable initiation context, a CC or AA motif at −2 and −1 was found to be important in the initiation of translation in tobacco and maize plants. [11] Protein synthesis in yeast was found to be strongly affected by the composition of the Kozak sequence, with adenine enrichment resulting in higher levels of gene expression. [12] A suboptimal Kozak sequence can allow the pre-initiation complex (PIC) to scan past the first AUG and initiate at a downstream AUG codon. [13] [2]
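
The strong/adequate/weak rule and the position numbering described above can be written as a short function. This is a minimal sketch that considers only the −3 and +4 positions. The function name `kozak_strength` is illustrative, and the contributions of −1, −2 and −6 mentioned in the text are deliberately ignored.

```python
def kozak_strength(mrna: str, aug_index: int) -> str:
    """Classify the context of the AUG whose A is at 0-based `aug_index`.

    Returns 'strong', 'adequate' or 'weak' using only positions -3 and +4.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("aug_index does not point at an AUG codon")

    # The A of AUG is +1 and there is no position 0, so position -3 lies
    # three bases upstream of the A and +4 lies three bases downstream.
    minus3 = seq[aug_index - 3] if aug_index >= 3 else ""
    plus4 = seq[aug_index + 3] if aug_index + 3 < len(seq) else ""

    matches = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return {2: "strong", 1: "adequate", 0: "weak"}[matches]


print(kozak_strength("GCCACCAUGG", 6))  # strong   (A at -3, G at +4)
print(kozak_strength("GCCUCCAUGG", 6))  # adequate (only +4 matches)
print(kozak_strength("GCCUCCAUGU", 6))  # weak     (neither matches)
```

A position that falls outside the sequence is counted as a mismatch here, which is an arbitrary choice made for this sketch.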

A sequence logo showing the most conserved bases around the initiation codon from over 10 000 human mRNAs. Larger letters indicate a higher frequency of incorporation. Note the larger size of A and G at the 8 position (−3, Kozak position) and at the G at position 14 which corresponds to (+4) position in the Kozak sequence.

Ribosome assembly

The ribosome assembles on the start codon (AUG), located within the Kozak sequence. Prior to translation initiation, scanning is done by the pre-initiation complex (PIC). The PIC consists of the 40S (small) ribosomal subunit bound to the ternary complex (TC), eIF2-GTP-initiator Met-tRNA, to form the 43S complex. Assisted by several other initiation factors (eIF1 and eIF1A, eIF5, eIF3, polyA binding protein), it is recruited to the 5′ end of the mRNA. Eukaryotic mRNA is capped with a 7-methylguanosine (m7G) nucleotide, which can help recruit the PIC to the mRNA and initiate scanning. The role of the m7G 5′ cap in recruitment is supported by the inability of eukaryotic ribosomes to translate circular mRNA, which has no 5′ end. [14] Once the PIC binds to the mRNA, it scans until it reaches the first AUG codon in a Kozak sequence. [15] [16] This scanning is referred to as the scanning mechanism of initiation.

An overview of eukaryotic initiation showing the formation of the PIC and the scanning method of initiation.

The scanning mechanism of initiation starts when the PIC binds the 5′ end of the mRNA. Scanning is stimulated by Dhx29, Ddx3/Ded1, and eIF4 proteins. [1] Dhx29 and Ddx3/Ded1 are DEAD-box helicases that help to unwind any mRNA secondary structure that could hinder scanning. [17] Scanning continues until the first AUG codon on the mRNA is reached; this is known as the "first-AUG rule". [1] While exceptions to the first-AUG rule exist, most take place at a second AUG codon that is located 3 to 5 nucleotides downstream from the first AUG, or within 10 nucleotides of the 5′ end of the mRNA. [18] At the AUG codon, the anticodon of the initiator methionine tRNA base-pairs with the mRNA codon. [19] Upon base pairing with the start codon, the eIF5 in the PIC helps to hydrolyze the guanosine triphosphate (GTP) bound to eIF2. [20] [21] This leads to a structural rearrangement that commits the PIC to binding the large ribosomal subunit (60S) and forming the ribosomal complex (80S). Once the 80S ribosome is formed, the elongation phase of translation starts.

The start codon closest to the 5′ end of the strand is not always recognized if it is not contained in a Kozak-like sequence. Lmx1b is an example of a gene with a weak Kozak consensus sequence. [22] Bypassing such a codon is called leaky scanning and could be a potential way to control translation through initiation. [23] For initiation of translation from a site in a weak context, other features are required in the mRNA sequence in order for the ribosome to recognize the initiation codon.
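
The first-AUG rule and leaky scanning can be illustrated with a toy model. The sketch below walks along the mRNA from the 5′ end and stops at the first AUG whose −3 or +4 position matches the consensus. It is a deliberately simplified, deterministic assumption made for illustration: real pre-initiation complexes bypass a weak site only some of the time, and the function name `scan_for_start` and the test sequence are made up.

```python
def scan_for_start(mrna: str, skip_weak: bool = True):
    """Scan 5'->3' and return (start_index, bypassed_indices).

    start_index is the 0-based index of the A of the chosen AUG, or None.
    With skip_weak=False the strict first-AUG rule is applied instead.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    bypassed = []
    i = seq.find("AUG")
    while i != -1:
        minus3_ok = i >= 3 and seq[i - 3] in "AG"        # purine at -3
        plus4_ok = i + 3 < len(seq) and seq[i + 3] == "G"  # G at +4
        if minus3_ok or plus4_ok or not skip_weak:
            return i, bypassed
        bypassed.append(i)               # weak context: keep scanning
        i = seq.find("AUG", i + 1)
    return None, bypassed


mrna = "CCUCCAUGUCCGCCACCAUGGCU"         # made-up sequence for illustration
print(scan_for_start(mrna))              # (17, [5])
print(scan_for_start(mrna, skip_weak=False))  # (5, [])
```

In this example the first AUG (index 5) has U at both −3 and +4, so the toy scanner passes it and initiates at the second AUG, which sits in a strong context.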

It is believed that the PIC is stalled at the Kozak sequence by interactions between eIF2 and the nucleotides at the −3 and +4 positions. [24] This stalling gives the start codon and the corresponding anticodon time to form the correct hydrogen bonds. The Kozak consensus sequence is so common that the similarity of the sequence around an AUG codon to the Kozak sequence is used as a criterion for finding start codons in eukaryotes. [25]
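
Using similarity to the consensus as a criterion for finding start codons can be sketched as a scoring function. The example below is a rough illustration only: it counts matches to the 1987 vertebrate consensus at positions −6 to −1 and +4, with equal weight per position. That equal weighting is an assumption made for simplicity and does not reflect the greater importance of −3 and +4. The names `context_score` and `rank_start_candidates` are illustrative.

```python
UPSTREAM_CONSENSUS = "GCCRCC"  # positions -6..-1 of gccRccAUGG


def context_score(seq: str, aug_index: int) -> int:
    """Count consensus matches (0-7) around the AUG at `aug_index`."""
    score = 0
    for offset, expected in zip(range(-6, 0), UPSTREAM_CONSENSUS):
        j = aug_index + offset
        if j < 0:
            continue                      # position lies before the 5' end
        base = seq[j]
        if base == expected or (expected == "R" and base in "AG"):
            score += 1
    if aug_index + 3 < len(seq) and seq[aug_index + 3] == "G":
        score += 1                        # G at +4
    return score


def rank_start_candidates(mrna: str) -> list:
    """Return (score, index) for every AUG, best-scoring first."""
    seq = mrna.upper().replace("T", "U")
    candidates = [(context_score(seq, i), i)
                  for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"]
    # Higher score first; ties are broken in favour of the 5'-most AUG.
    return sorted(candidates, key=lambda t: (-t[0], t[1]))


print(rank_start_candidates("CCUCCAUGUCCGCCACCAUGGCU"))  # [(7, 17), (4, 5)]
```

Note that the weak upstream AUG still scores 4 out of 7 here, which shows why a practical predictor would weight positions unequally rather than count matches.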

Differences from bacterial initiation

The scanning mechanism of initiation, which utilizes the Kozak sequence, is found only in eukaryotes and differs significantly from the way bacteria initiate translation. The biggest difference is the existence of the Shine–Dalgarno (SD) sequence in bacterial mRNA. The SD sequence is located near the start codon, in contrast to the Kozak sequence, which actually contains the start codon. The Shine–Dalgarno sequence allows the 16S rRNA of the small ribosomal subunit to bind near the AUG (or alternative) start codon immediately. In contrast, scanning along the mRNA results in a more rigorous selection process for the AUG codon than in bacteria. [26] An example of bacterial start codon promiscuity can be seen in the use of the alternate start codons UUG and GUG for some genes. [27]

Archaeal transcripts use a mix of SD sequence, Kozak sequence, and leaderless initiation. Haloarchaea are known to have a variant of the Kozak consensus sequence in their Hsp70 genes. [28]

Mutations and disease

Marilyn Kozak demonstrated, through systematic study of point mutations, that any mutations of a strong consensus sequence in the −3 position or to the +4 position resulted in highly impaired translation initiation both in vitro and in vivo. [29] [30]

Campomelic dysplasia, a disorder that results in skeletal, reproductive and/or airway issues. Campomelic dysplasia can be the result of a Kozak-related mutation in the SOX9 gene.

Research has shown that a G→C mutation at the −6 position of the human β-globin gene (β+45) altered the haematological and biosynthetic phenotype. This was the first mutation found in the Kozak sequence, and it caused a 30% decrease in translational efficiency. It was found in a family from southeast Italy whose members suffered from thalassaemia intermedia. [4]

Similar observations were made regarding mutations in the position −5 from the start codon, AUG. Cytosine in this position, as opposed to thymine, showed more efficient translation and increased expression of the platelet adhesion receptor, glycoprotein Ibα in humans. [33]

Mutations to the Kozak sequence can also have drastic effects upon human health; in particular, certain forms of congenital heart disease are caused by Kozak sequence mutations in the 5′ untranslated region of the GATA4 gene. The GATA4 gene is responsible for gene expression in a wide variety of tissues including the heart. [34] When the guanosine at the −6 position in the Kozak sequence of GATA4 is mutated to a cytosine, GATA4 protein levels are reduced. This decreases the expression of genes regulated by the GATA4 transcription factor and is linked to the development of atrial septal defect. [35]

The ability of the Kozak sequence to optimize translation can result in novel initiation codons in the typically untranslated region at the 5′ end (5′ UTR) of the mRNA transcript. A G to A mutation was described by Bohlen et al. (2017) in a Kozak-like region in the SOX9 gene that created a new translation initiation codon in an out-of-frame open reading frame. The correct initiation codon was located in a region that did not match the Kozak consensus sequence as closely as the sequence surrounding the new, upstream initiation site did, which resulted in reduced translation efficiency of functional SOX9 protein. The patient in whom this mutation was detected had developed acampomelic campomelic dysplasia, a developmental disorder that causes skeletal, reproductive and airway issues due to insufficient SOX9 expression. [32]
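
The effect described above, a point mutation that creates a new upstream AUG, can be checked computationally. The following is a minimal sketch assuming a single-base substitution at a known 0-based position. The function name `new_upstream_augs` is illustrative, and the example sequence is made up; it is not the SOX9 5′ UTR.

```python
def new_upstream_augs(mrna: str, main_start: int, position: int, new_base: str):
    """List upstream AUGs created by substituting `new_base` at `position`.

    Returns (index, in_frame) pairs, where in_frame tells whether the new
    AUG shares the reading frame of the main start codon at `main_start`.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    base = new_base.upper().replace("T", "U")
    mutated = seq[:position] + base + seq[position + 1:]

    def upstream_augs(s):
        # Only AUGs that begin before the main start codon are considered.
        return {i for i in range(main_start) if s[i:i + 3] == "AUG"}

    created = sorted(upstream_augs(mutated) - upstream_augs(seq))
    return [(i, (main_start - i) % 3 == 0) for i in created]


# Hypothetical transcript: the main AUG starts at index 10.
mrna = "CCGUGGCACCAUGGCU"
print(new_upstream_augs(mrna, 10, 2, "A"))  # [(2, False)]
```

Here the G to A change at index 2 creates an AUG eight bases upstream of the main start codon, so translation from it would proceed in a different reading frame.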

Variations in the consensus sequence

The Kozak consensus has been variously described as: [36]

     65432-+234
(gcc)gccRccAUGG (Kozak 1987)
       AGNNAUGN
        ANNAUGG
        ACCAUGG (Spotts et al., 1997, mentioned in Kozak 2002)
     GACACCAUGG (H. sapiens HBB, HBD, R. norvegicus Hbb, etc.)

Kozak-like sequences in various eukaryotes

Biota | Phylum | Consensus sequence
Vertebrate (Kozak 1987) | | gccRccATGG [7]
Fruit fly (Drosophila spp.) | Arthropoda | atMAAMATGamc [37]
Budding yeast (Saccharomyces cerevisiae) | Ascomycota | aAaAaAATGTCt [38]
Slime mold (Dictyostelium discoideum) | Amoebozoa | aaaAAAATGRna [39]
Ciliate | Ciliophora | nTaAAAATGRct [39]
Malarial protozoa (Plasmodium spp.) | Apicomplexa | taaAAAATGAan [39]
Toxoplasma (Toxoplasma gondii) | Apicomplexa | gncAaaATGg [40]
Trypanosomatidae | Euglenozoa | nnnAnnATGnC [39]
Terrestrial plants | | acAACAATGGC [41]
Microalga (Chlamydomonas reinhardtii) | Chlorophyta | gccAaCATGGcg [42] [43]

References

  1. Kozak, M. (February 1989). "The scanning model for translation: an update". The Journal of Cell Biology. 108 (2): 229–241. doi:10.1083/jcb.108.2.229. ISSN 0021-9525. PMC 2115416. PMID 2645293.
  2. Kozak, Marilyn (2002-10-16). "Pushing the limits of the scanning mechanism for initiation of translation". Gene. 299 (1): 1–34. doi:10.1016/S0378-1119(02)01056-9. ISSN 0378-1119. PMC 7126118. PMID 12459250.
  3. Kozak, Marilyn (1999-07-08). "Initiation of translation in prokaryotes and eukaryotes". Gene. 234 (2): 187–208. doi:10.1016/S0378-1119(99)00210-3. ISSN   0378-1119. PMID   10395892.
  4. De Angioletti M, Lacerra G, Sabato V, Carestia C (2004). "Beta+45 G → C: a novel silent beta-thalassaemia mutation, the first in the Kozak sequence". Br J Haematol. 124 (2): 224–31. doi:10.1046/j.1365-2141.2003.04754.x. PMID 14687034. S2CID 86704907.
  5. Hernández, Greco; Osnaya, Vincent G.; Pérez-Martínez, Xochitl (2019-07-25). "Conservation and Variability of the AUG Initiation Codon Context in Eukaryotes". Trends in Biochemical Sciences. 44 (12): 1009–1021. doi: 10.1016/j.tibs.2019.07.001 . ISSN   0968-0004. PMID   31353284.
  6. Kozak, M (1984-01-25). "Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs". Nucleic Acids Research. 12 (2): 857–872. doi:10.1093/nar/12.2.857. ISSN   0305-1048. PMC   318541 . PMID   6694911.
  7. Kozak M (October 1987). "An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs". Nucleic Acids Res. 15 (20): 8125–8148. doi:10.1093/nar/15.20.8125. PMC 306349. PMID 3313277.
  8. Nomenclature for Incompletely Specified Bases in Nucleic Acid Sequences, NC-IUB, 1984.
  9. Kozak M (1984). "Point mutations close to the AUG initiator codon affect the efficiency of translation of rat preproinsulin in vivo". Nature. 308 (5956): 241–246. Bibcode:1984Natur.308..241K. doi:10.1038/308241a0. PMID   6700727. S2CID   4366379.
  10. Kozak M (1986). "Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes". Cell. 44 (2): 283–92. doi:10.1016/0092-8674(86)90762-2. PMID   3943125. S2CID   15613863.
  11. Lukaszewicz, Marcin; Feuermann, Marc; Jérouville, Bénédicte; Stas, Arnaud; Boutry, Marc (2000-05-15). "In vivo evaluation of the context sequence of the translation initiation codon in plants". Plant Science. 154 (1): 89–98. doi:10.1016/S0168-9452(00)00195-3. ISSN   0168-9452. PMID   10725562.
  12. Li, Jing; Liang, Qiang; Song, Wenjiang; Marchisio, Mario Andrea (2017). "Nucleotides upstream of the Kozak sequence strongly influence gene expression in the yeast S. cerevisiae". Journal of Biological Engineering. 11: 25. doi: 10.1186/s13036-017-0068-1 . ISSN   1754-1611. PMC   5563945 . PMID   28835771.
  13. Kochetov, Alex V. (2005-04-01). "AUG codons at the beginning of protein coding sequences are frequent in eukaryotic mRNAs with a suboptimal start codon context". Bioinformatics. 21 (7): 837–840. doi: 10.1093/bioinformatics/bti136 . ISSN   1367-4803. PMID   15531618.
  14. Kozak, Marilyn (July 1979). "Inability of circular mRNA to attach to eukaryotic ribosomes". Nature. 280 (5717): 82–85. Bibcode:1979Natur.280...82K. doi:10.1038/280082a0. ISSN   1476-4687. PMID   15305588. S2CID   4319259.
  15. Schmitt, Emmanuelle; Coureux, Pierre-Damien; Monestier, Auriane; Dubiez, Etienne; Mechulam, Yves (2019-02-21). "Start Codon Recognition in Eukaryotic and Archaeal Translation Initiation: A Common Structural Core". International Journal of Molecular Sciences. 20 (4): 939. doi: 10.3390/ijms20040939 . ISSN   1422-0067. PMC   6412873 . PMID   30795538.
  16. Grzegorski, Steven J.; Chiari, Estelle F.; Robbins, Amy; Kish, Phillip E.; Kahana, Alon (2014). "Natural Variability of Kozak Sequences Correlates with Function in a Zebrafish Model". PLOS ONE. 9 (9): e108475. Bibcode:2014PLoSO...9j8475G. doi: 10.1371/journal.pone.0108475 . PMC   4172775 . PMID   25248153.
  17. Hinnebusch, Alan G. (2014). "The Scanning Mechanism of Eukaryotic Translation Initiation". Annual Review of Biochemistry. 83 (1): 779–812. doi:10.1146/annurev-biochem-060713-035802. PMID   24499181.
  18. Kozak, M. (1995-03-28). "Adherence to the first-AUG rule when a second AUG codon follows closely upon the first". Proceedings of the National Academy of Sciences. 92 (7): 2662–2666. Bibcode:1995PNAS...92.2662K. doi: 10.1073/pnas.92.7.2662 . ISSN   0027-8424. PMC   42278 . PMID   7708701.
  19. Cigan, A. M.; Feng, L.; Donahue, T. F. (1988-10-07). "tRNAi(met) functions in directing the scanning ribosome to the start site of translation". Science. 242 (4875): 93–97. Bibcode:1988Sci...242...93C. doi:10.1126/science.3051379. ISSN   0036-8075. PMID   3051379.
  20. Pestova, Tatyana V.; Lomakin, Ivan B.; Lee, Joon H.; Choi, Sang Ki; Dever, Thomas E.; Hellen, Christopher U. T. (January 2000). "The joining of ribosomal subunits in eukaryotes requires eIF5B". Nature. 403 (6767): 332–335. Bibcode:2000Natur.403..332P. doi:10.1038/35002118. ISSN   1476-4687. PMID   10659855. S2CID   3739106.
  21. Algire, Mikkel A.; Maag, David; Lorsch, Jon R. (2005-10-28). "Pi Release from eIF2, Not GTP Hydrolysis, Is the Step Controlled by Start-Site Selection during Eukaryotic Translation Initiation". Molecular Cell. 20 (2): 251–262. doi: 10.1016/j.molcel.2005.09.008 . ISSN   1097-2765. PMID   16246727.
  22. Dunston JA, Hamlington JD, Zaveri J, et al. (September 2004). "The human LMX1B gene: transcription unit, promoter, and pathogenic mutations". Genomics. 84 (3): 565–76. doi:10.1016/j.ygeno.2004.06.002. PMID   15498463.
  23. Alekhina, O. M.; Vassilenko, K. S. (2012). "Translation initiation in eukaryotes: Versatility of the scanning model". Biochemistry (Moscow). 77 (13): 1465–1477. doi:10.1134/s0006297912130056. PMID   23379522. S2CID   14157104.
  24. Hinnebusch, Alan G. (September 2011). "Molecular Mechanism of Scanning and Start Codon Selection in Eukaryotes". Microbiology and Molecular Biology Reviews. 75 (3): 434–467. doi:10.1128/MMBR.00008-11. ISSN   1092-2172. PMC   3165540 . PMID   21885680.
  25. Louis, B. G.; Ganoza, M. C. (1988). "Signals determining translational start-site recognition in eukaryotes and their role in prediction of genetic reading frames". Molecular Biology Reports. 13 (2): 103–115. doi:10.1007/bf00539058. ISSN   0301-4851. PMID   3221841. S2CID   25936805.
  26. Huang, Han-kuei; Yoon, Heejeong; Hannig, Ernest M.; Donahue, Thomas F. (1997-09-15). "GTP hydrolysis controls stringent selection of the AUG start codon during translation initiation in Saccharomyces cerevisiae". Genes & Development. 11 (18): 2396–2413. doi:10.1101/gad.11.18.2396. ISSN   0890-9369. PMC   316512 . PMID   9308967.
  27. Gualerzi, C. O.; Pon, C. L. (1990-06-26). "Initiation of mRNA translation in prokaryotes". Biochemistry. 29 (25): 5881–5889. doi:10.1021/bi00477a001. ISSN   0006-2960. PMID   2200518.
  28. Chen, Wenchao; Yang, Guopeng; He, Yue; Zhang, Shaoming; Chen, Haiyan; Shen, Ping; Chen, Xiangdong; Huang, Yu-Ping (17 September 2015). "Nucleotides Flanking the Start Codon in hsp70 mRNAs with Very Short 5'-UTRs Greatly Affect Gene Expression in Haloarchaea". PLOS ONE. 10 (9): e0138473. Bibcode:2015PLoSO..1038473C. doi: 10.1371/journal.pone.0138473 . PMC   4574771 . PMID   26379277.
  29. Kozak, Marilyn (1986-01-31). "Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes". Cell. 44 (2): 283–292. doi:10.1016/0092-8674(86)90762-2. ISSN   0092-8674. PMID   3943125. S2CID   15613863.
  30. Kozak, Marilyn (March 1984). "Point mutations close to the AUG initiator codon affect the efficiency of translation of rat preproinsulin in vivo". Nature. 308 (5956): 241–246. Bibcode:1984Natur.308..241K. doi:10.1038/308241a0. ISSN   1476-4687. PMID   6700727. S2CID   4366379.
  31. Unger, Shelia; Scherer, Gerd; Superti-Furga, Andrea (1993). "Campomelic Dysplasia". GeneReviews. Seattle: University of Washington. PMID   20301724 via NIH National Library of Medicine, National Center for Biotechnology Information.
  32. Bohlen, Anna E. von; Böhm, Johann; Pop, Ramona; Johnson, Diana S.; Tolmie, John; Stücker, Ralf; Morris‐Rosendahl, Deborah; Scherer, Gerd (2017). "A mutation creating an upstream initiation codon in the SOX9 5′ UTR causes acampomelic campomelic dysplasia". Molecular Genetics & Genomic Medicine. 5 (3): 261–268. doi:10.1002/mgg3.282. ISSN 2324-9269. PMC 5441400. PMID 28546996.
  33. Afshar-Kharghan, Vahid; Li, Chester Q.; Khoshnevis-Asl, Mohammad; LóPez, José A. (1999). "Kozak Sequence Polymorphism of the Glycoprotein (GP) Ib Gene is a Major Determinant of the Plasma Membrane Levels of the Platelet GP Ib-IX-V Complex". Blood. 94: 186–191. doi:10.1182/blood.v94.1.186.413k19_186_191.
  34. Lee, Y.; Shioi, T.; Kasahara, H.; Jobe, S. M.; Wiese, R. J.; Markham, B. E.; Izumo, S. (June 1998). "The cardiac tissue-restricted homeobox protein Csx/Nkx2.5 physically associates with the zinc finger protein GATA4 and cooperatively activates atrial natriuretic factor gene expression". Molecular and Cellular Biology. 18 (6): 3120–3129. doi:10.1128/mcb.18.6.3120. ISSN   0270-7306. PMC   108894 . PMID   9584153.
  35. Mohan, Rajiv A.; Engelen, Klaartje van; Stefanovic, Sonia; Barnett, Phil; Ilgun, Aho; Baars, Marieke J. H.; Bouma, Berto J.; Mulder, Barbara J. M.; Christoffels, Vincent M.; Postma, Alex V. (2014). "A mutation in the Kozak sequence of GATA4 hampers translation in a family with atrial septal defects". American Journal of Medical Genetics Part A. 164 (11): 2732–2738. doi:10.1002/ajmg.a.36703. ISSN   1552-4833. PMID   25099673. S2CID   32674053.
  36. Tang, Sen-Lin; Chang, Bill C.H.; Halgamuge, Saman K. (August 2010). "Gene functionality's influence on the second codon: A large-scale survey of second codon composition in three domains". Genomics. 96 (2): 92–101. doi: 10.1016/j.ygeno.2010.04.001 . PMID   20417269.
  37. Cavener DR (February 1987). "Comparison of the consensus sequence flanking translational start sites in Drosophila and vertebrates". Nucleic Acids Res. 15 (4): 1353–61. doi:10.1093/nar/15.4.1353. PMC   340553 . PMID   3822832.
  38. Hamilton R, Watanabe CK, de Boer HA (April 1987). "Compilation and comparison of the sequence context around the AUG startcodons in Saccharomyces cerevisiae mRNAs". Nucleic Acids Res. 15 (8): 3581–93. doi:10.1093/nar/15.8.3581. PMC   340751 . PMID   3554144.
  39. Yamauchi K (May 1991). "The sequence flanking translational initiation site in protozoa". Nucleic Acids Res. 19 (10): 2715–20. doi:10.1093/nar/19.10.2715. PMC 328191. PMID 2041747.
  40. Seeber, F. (1997). "Consensus sequence of translational initiation sites from Toxoplasma gondii genes". Parasitology Research. 83 (3): 309–311. doi:10.1007/s004360050254. PMID   9089733. S2CID   10433917.
  41. Lütcke HA, Chow KC, Mickel FS, Moss KA, Kern HF, Scheele GA (January 1987). "Selection of AUG initiation codons differs in plants and animals". EMBO J. 6 (1): 43–8. doi:10.1002/j.1460-2075.1987.tb04716.x. PMC   553354 . PMID   3556162.
  42. Cross F (February 6, 2016). "Tying Down Loose Ends in the Chlamydomonas Genome: Functional Significance of Abundant Upstream Open Reading Frames". G3 (2): 435–446. doi:10.1534/g3.115.023119. PMC   4751561 . PMID   26701783.
  43. Gallaher SD, Craig RJ, Ganesan I, Purvine SO, McCorkle SR, Grimwood J, Strenkert D, Davidi L, Roth MS, Jeffers TL, Lipton MS, Niyogi KK, Schmutz J, Theg SM, Blaby-Haas CE, Merchant SS (February 12, 2021). "Widespread polycistronic gene expression in green algae". Proceedings of the National Academy of Sciences. 118 (7). doi: 10.1073/pnas.2017714118 . PMC   7896298 .