ORFeome

In molecular genetics, an ORFeome is the complete set of open reading frames (ORFs) in a genome. The term may also describe a set of cloned ORFs. [1] ORFs correspond to the protein-coding sequences (CDS) of genes. ORFs can be predicted in genome sequences by computer programs such as GENSCAN and then amplified by PCR. While this is relatively straightforward in bacteria, it is considerably harder in eukaryotic genomes, where coding sequences are interrupted by introns and a single gene may have several splice variants.
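As a rough illustration of ORF finding in an intron-free (bacterial-style) sequence, the sketch below scans the three forward reading frames for stretches that begin with ATG and end at the first in-frame stop codon. It is a deliberately naive example, not the method used by GENSCAN or any other gene predictor: the function name `find_orfs` and the `min_codons` parameter are hypothetical, only the standard start codon and the forward strand are considered, and introns and splice variants are ignored.

```python
# Naive forward-strand ORF scan; an illustrative sketch, not a gene predictor.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ATG-to-stop ORFs.

    start is a 0-based index, end is exclusive and includes the stop codon,
    and frame is 0, 1, or 2. min_codons counts the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next ATG
    return orfs


if __name__ == "__main__":
    # ATG AAA TTT TAG sits in frame 2 of this toy sequence.
    print(find_orfs("CCATGAAATTTTAGGG"))
```

A fuller treatment would also scan the reverse complement and, for eukaryotes, would need a gene model that accounts for exon boundaries, which is why dedicated prediction programs are used in practice.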

Use in research

The use of complete ORFeomes reflects a broader trend in biology toward genome-scale approaches, often summarized as "omics". ORFeomes are used for the study of protein-protein interactions, [2] [3] protein microarrays, the study of antigens, [4] and other fields of study.

Cloned ORFeomes

Complete ORF sets have been cloned for a number of organisms, including Brucella melitensis, [5] Chlamydia pneumoniae, [6] Escherichia coli, [7] Neisseria gonorrhoeae, [8] Pseudomonas aeruginosa, [9] Schizosaccharomyces pombe, Staphylococcus aureus, [10] and human herpesviruses. [11] A partial human ORFeome has also been produced. [12] [13]

References

1. Ohara, O. (2009). "ORFeome Cloning". Reverse Chemical Genetics. Methods in Molecular Biology. Vol. 577. pp. 3–9. doi:10.1007/978-1-60761-232-2_1. ISBN 978-1-60761-231-5. PMID 19718504.
2. Titz B, Rajagopala SV, Goll J, Häuser R, McKevitt MT, Palzkill T, Uetz P (2008). Hall N (ed.). "The binary protein interactome of Treponema pallidum--the syphilis spirochete". PLOS ONE. 3 (5): e2292. Bibcode:2008PLoSO...3.2292T. doi:10.1371/journal.pone.0002292. PMC 2386257. PMID 18509523.
3. Uetz P, Rajagopala SV, Dong YA, Haas J (Oct 2004). "From ORFeomes to protein interaction maps in viruses". Genome Research. 14 (10B): 2029–33. doi:10.1101/gr.2583304. PMID 15489322.
4. McKevitt, Matthew; Brinkman, Mary Beth; McLoughlin, Melanie; Perez, Carla; Howell, Jerrilyn K.; Weinstock, George M.; Norris, Steven J.; Palzkill, Timothy (2005-07-01). "Genome scale identification of Treponema pallidum antigens". Infection and Immunity. 73 (7): 4445–4450. doi:10.1128/IAI.73.7.4445-4450.2005. ISSN 0019-9567. PMC 1168556. PMID 15972547.
5. Viadas C, Rodríguez MC, García-Lobo JM, Sangari FJ, López-Goñi I (Oct 2009). "Construction and evaluation of an ORFeome-based Brucella whole-genome DNA microarray". Microbial Pathogenesis. 47 (4): 189–95. doi:10.1016/j.micpath.2009.06.002. PMID 19524659.
6. Maier CJ, Maier RH, Virok DP, Maass M, Hintner H, Bauer JW, Onder K (2012). "Construction of a highly flexible and comprehensive gene collection representing the ORFeome of the human pathogen Chlamydia pneumoniae". BMC Genomics. 13: 632. doi:10.1186/1471-2164-13-632. PMC 3534531. PMID 23157390.
7. Rajagopala SV, Yamamoto N, Zweifel AE, Nakamichi T, Huang HK, Mendez-Rios JD, Franca-Koh J, Boorgula MP, Fujita K, Suzuki K, Hu JC, Wanner BL, Mori H, Uetz P (2010). "The Escherichia coli K-12 ORFeome: a resource for comparative molecular microbiology". BMC Genomics. 11: 470. doi:10.1186/1471-2164-11-470. PMC 3091666. PMID 20701780.
8. Brettin T, Altherr MR, Du Y, Mason RM, Friedrich A, Potter L, Langford C, Keller TJ, Jens J, Howie H, Weyand NJ, Clary S, Prichard K, Wachocki S, Sodergren E, Dillard JP, Weinstock G, So M, Arvidson CG (2005). "Expression capable library for studies of Neisseria gonorrhoeae, version 1.0". BMC Microbiology. 5: 50. doi:10.1186/1471-2180-5-50. PMC 1236931. PMID 16137322.
9. Labaer J, Qiu Q, Anumanthan A, Mar W, Zuo D, Murthy TV, Taycher H, Halleck A, Hainsworth E, Lory S, Brizuela L (Oct 2004). "The Pseudomonas aeruginosa PA01 gene collection". Genome Research. 14 (10B): 2190–200. doi:10.1101/gr.2482804. PMC 528936. PMID 15489342.
10. Brandner CJ, Maier RH, Henderson DS, Hintner H, Bauer JW, Onder K (2008). "The ORFeome of Staphylococcus aureus v 1.1". BMC Genomics. 9: 321. doi:10.1186/1471-2164-9-321. PMC 2474624. PMID 18605992.
11. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, Kraus T, Stellberger T, Rutenberg C, Suthram S, Bandyopadhyay S, Rose D, von Brunn A, Uhlmann M, Zeretzke C, Dong YA, Boulet H, Koegl M, Bailer SM, Koszinowski U, Ideker T, Uetz P, Zimmer R, Haas J (Sep 2009). Sun R (ed.). "Evolutionarily conserved herpesviral protein interaction networks". PLOS Pathogens. 5 (9): e1000570. doi:10.1371/journal.ppat.1000570. PMC 2731838. PMID 19730696.
12. Lamesch P, Li N, Milstein S, Fan C, Hao T, Szabo G, Hu Z, Venkatesan K, Bethel G, Martin P, Rogers J, Lawlor S, McLaren S, Dricot A, Borick H, Cusick ME, Vandenhaute J, Dunham I, Hill DE, Vidal M (Mar 2007). "hORFeome v3.1: a resource of human open reading frames representing over 10,000 human genes". Genomics. 89 (3): 307–15. doi:10.1016/j.ygeno.2006.11.012. PMC 4647941. PMID 17207965.
13. "Human ORFeome 2011 Release". http://horfdb.dfci.harvard.edu/