In molecular genetics, an ORFeome is the complete set of open reading frames (ORFs) in a genome. The term may also describe a set of cloned ORFs. [1] ORFs correspond to the protein-coding sequences (CDS) of genes. ORFs can be predicted in genome sequences by computer programs such as GENSCAN and then amplified by PCR. While this is relatively straightforward in bacteria, it is non-trivial in eukaryotic genomes because of the presence of introns and exons as well as splice variants.
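For an intron-free sequence, such as a typical bacterial gene region, the ORF search described above can be illustrated with a simple six-frame scan. The following is a minimal sketch, not a reproduction of GENSCAN or of any published pipeline: the function names, the `min_codons` threshold, and the choice of ATG as the only start codon are assumptions made for illustration, and the approach ignores introns and splice variants entirely.

```python
# Minimal six-frame ORF scan (illustrative sketch; names and defaults are hypothetical).
# Assumes an intron-free DNA sequence, ATG as the only start codon,
# and the standard stop codons TAA, TAG and TGA.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, frame, start, end) tuples for each ORF found.

    Coordinates are 0-based and end-exclusive, relative to the strand
    that was scanned. The codon count includes the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # resume the search after the stop codon
    return orfs


if __name__ == "__main__":
    # Toy sequence containing ATG AAA TTT TAG in forward frame 2.
    print(find_orfs("CCATGAAATTTTAGGG", min_codons=2))
```

The start and end coordinates returned for each ORF are the kind of information needed to design PCR primers for cloning. In eukaryotic genomes a scan like this would miss or truncate most genes, which is why gene-prediction programs model exon and intron structure explicitly.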
The usage of complete ORFeomes reflects a new trend in biology that can be succinctly summarized as omics. ORFeomes are used for the study of protein-protein interactions, [2] [3] protein microarrays, the study of antigens, [4] and other fields of study.
Complete ORF sets have been cloned for a number of organisms, including Brucella melitensis, [5] Chlamydia pneumoniae, [6] Escherichia coli, [7] Neisseria gonorrhoeae, [8] Pseudomonas aeruginosa, [9] Schizosaccharomyces pombe, and human herpesviruses. [11]
In molecular biology, an interactome is the whole set of molecular interactions in a particular cell. The term specifically refers to physical interactions among molecules but can also describe sets of indirect interactions among genes.
Within the field of molecular biology, a protein-fragment complementation assay, or PCA, is a method for the identification and quantification of protein–protein interactions. In the PCA, the two proteins of interest (the bait and the prey) are each covalently linked to a fragment of a third protein, the reporter. Interaction between the bait and the prey brings the reporter fragments into close proximity, allowing them to form a functional reporter protein whose activity can be measured. This principle can be applied to many different reporter proteins and is also the basis for the yeast two-hybrid system, an archetypical PCA.
Zinc transporter 8 (ZNT8) is a protein that in humans is encoded by the SLC30A8 gene. ZNT8 is a zinc transporter related to insulin secretion in humans. In particular, ZNT8 is critical for the accumulation of zinc into beta cell secretory granules and the maintenance of stored insulin as tightly packaged hexamers. Certain alleles of the SLC30A8 gene may increase the risk for developing type 2 diabetes, but a loss-of-function mutation appears to greatly reduce the risk of diabetes.
Probable ATP-dependent RNA helicase DDX17 (p72) is an enzyme that in humans is encoded by the DDX17 gene.
Interferon alpha-inducible protein 27 is a protein that in humans is encoded by the IFI27 gene.
BTB/POZ domain-containing protein 1 is a protein that in humans is encoded by the BTBD1 gene.
MAP3K7CL is a human protein-coding gene located on chromosome 21.
Histone deacetylase complex subunit SAP130 is an enzyme that in humans is encoded by the SAP130 gene.
Transmembrane protein 47 is a protein that in humans is encoded by the TMEM47 gene.
Transmembrane channel-like protein 2 is a protein that in humans is encoded by the TMC2 gene.
CCDC186 is a protein that in humans is encoded by the CCDC186 gene. The CCDC186 gene is also known as the CTCL-tumor associated antigen, with accession number NM_018017.
LTR retrotransposons are class I transposable elements (TEs) characterized by the presence of long terminal repeats (LTRs) directly flanking an internal coding region. As retrotransposons, they mobilize through reverse transcription of their mRNA and integration of the newly created cDNA into another genomic location. Their mechanism of retrotransposition is shared with retroviruses, with the difference that the rate of horizontal transfer in LTR retrotransposons is much lower than the rate of vertical transfer, in which active TE insertions are passed to the progeny. LTR retrotransposons that form virus-like particles are classified under Ortervirales.
αr9 is a family of bacterial small non-coding RNAs with representatives in a broad group of α-proteobacteria from the order Hyphomicrobiales. The first member of this family (Smr9C) was found in a Sinorhizobium meliloti 1021 locus located in the chromosome (C). Further homology and structure conservation analyses have identified full-length Smr9C homologs in several nitrogen-fixing symbiotic rhizobia, in plant pathogens belonging to Agrobacterium species, and in a broad spectrum of Brucella species. αr9C RNA species are 144-158 nt long and share a well-defined common secondary structure consisting of seven conserved regions. Most of the αr9 transcripts can be catalogued as trans-acting sRNAs expressed from well-defined promoter regions of independent transcription units within intergenic regions (IGRs) of the α-proteobacterial genomes.
αr14 is a family of bacterial small non-coding RNAs with representatives in a broad group of α-proteobacteria. The first member of this family (Smr14C2) was found in a Sinorhizobium meliloti 1021 locus located in the chromosome (C). It was later renamed NfeR1 and shown to be highly expressed in salt stress and during the symbiotic interaction on legume roots. Further homology and structure conservation analysis identified 2 other chromosomal copies and 3 plasmidic ones. Moreover, full-length Smr14C homologs have been identified in several nitrogen-fixing symbiotic rhizobia, in the plant pathogens belonging to Agrobacterium species as well as in a broad spectrum of Brucella species. αr14C RNA species are 115-125 nt long and share a well defined common secondary structure. Most of the αr14 transcripts can be catalogued as trans-acting sRNAs expressed from well-defined promoter regions of independent transcription units within intergenic regions (IGRs) of the α-proteobacterial genomes.
αr15 is a family of bacterial small non-coding RNAs with representatives in a broad group of α-proteobacteria from the order Rhizobiales. The first members of this family were found tandemly arranged in the same intergenic region (IGR) of the Sinorhizobium meliloti 1021 chromosome (C). Further homology and structure conservation analyses have identified full-length Smr15C1 and Smr15C2 homologs in several nitrogen-fixing symbiotic rhizobia, in plant pathogens belonging to Agrobacterium species, and in a broad spectrum of Brucella species. The Smr15C1 and Smr15C2 homologs are also encoded in tandem within the same IGR of Rhizobium and Agrobacterium species, whereas in Brucella species the αr15C loci are spread across the IGRs of Chromosome I. Moreover, this analysis also identified a third αr15 locus in extrachromosomal replicons of the mentioned nitrogen-fixing α-proteobacteria and in Chromosome II of Brucella species. αr15 RNA species are 99-121 nt long and share a well-defined common secondary structure consisting of three stem loops. The transcripts of the αr15 family can be catalogued as trans-acting sRNAs encoded by independent transcription units with recognizable promoter and transcription termination signatures within intergenic regions (IGRs) of the α-proteobacterial genomes.
An overlapping gene is a gene whose expressible nucleotide sequence partially overlaps with the expressible nucleotide sequence of another gene. In this way, a nucleotide sequence may contribute to the function of more than one gene product. Overlapping genes are present in both cellular and viral genomes and are a fundamental feature of them. The current definition of an overlapping gene varies significantly between eukaryotes, prokaryotes, and viruses. In prokaryotes and viruses, the overlap must be between coding sequences rather than mRNA transcripts, and is defined when these coding sequences share a nucleotide on either the same or opposite strands. In eukaryotes, gene overlap is almost always defined as mRNA transcript overlap. Specifically, a gene overlap in eukaryotes is defined when at least one nucleotide is shared between the boundaries of the primary mRNA transcripts of two or more genes, such that a DNA base mutation at any point of the overlapping region would affect the transcripts of all genes involved. This definition includes 5′ and 3′ untranslated regions (UTRs) along with introns.
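Both definitions reduce to the same interval test: two features overlap when they share at least one nucleotide, regardless of strand. What differs is which coordinates are compared, coding-sequence boundaries in prokaryotes and viruses or primary-transcript boundaries in eukaryotes. The sketch below illustrates this under stated assumptions: the `Feature` class, its field names, and the example coordinates are hypothetical and are not taken from any annotation format or real genome.

```python
# Interval-based overlap test (illustrative sketch; all names are hypothetical).
# Coordinates are 1-based and inclusive on the reference sequence,
# independent of which strand the feature lies on.
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    start: int   # first nucleotide of the CDS or primary transcript
    end: int     # last nucleotide of the CDS or primary transcript
    strand: str  # "+" or "-"; not used by the test, kept for reporting


def shared_nucleotides(a, b):
    """Number of reference positions covered by both features."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def overlaps(a, b):
    """True if the two features share at least one nucleotide."""
    return shared_nucleotides(a, b) >= 1


if __name__ == "__main__":
    gene_a = Feature("geneA", 100, 250, "+")
    gene_b = Feature("geneB", 250, 400, "-")  # opposite strand, shares position 250
    gene_c = Feature("geneC", 401, 500, "+")
    print(overlaps(gene_a, gene_b), shared_nucleotides(gene_a, gene_b))
    print(overlaps(gene_b, gene_c))
```

For a prokaryotic or viral genome, the features passed in would be coding sequences. For a eukaryotic genome they would be primary transcripts, so that overlaps confined to UTRs or introns are still counted.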
The Pseudomonas phage F116 holin is an uncharacterized holin homologous to one in Neisseria gonorrhoeae that has been characterized. This protein is the prototype of the Pseudomonas phage F116 holin family, which is a member of the Holin Superfamily II. Bioinformatic analysis of the genome sequence of N. gonorrhoeae revealed the presence of nine probable prophage islands. The genomic sequence of FA1090 identified five genomic regions that are related to dsDNA lysogenic phage. The DNA sequences from NgoPhi1, NgoPhi2 and NgoPhi3 contained regions of identity. A region of NgoPhi2 showed high similarity with the Pseudomonas aeruginosa generalized transducing phage F116. NgoPhi1 and NgoPhi2 encode functionally active phages. The holin gene of NgoPhi1, when expressed in E. coli, could substitute for the phage lambda S gene.
MHC class III is a group of proteins belonging to the major histocompatibility complex (MHC). Unlike other MHC types such as MHC class I and MHC class II, whose structures and functions in the immune response are well defined, MHC class III proteins are poorly defined structurally and functionally. They are not involved in antigen binding. Only a few of them are actually involved in immunity, while many are signalling molecules in other forms of cell communication. They are mainly known from their genes, because their gene cluster lies between those of class I and class II. The gene cluster was discovered when genes were found between the class I and class II genes on the short (p) arm of human chromosome 6. It was later found to contain many genes for different signalling molecules such as tumour necrosis factors (TNFs) and heat shock proteins. More than 60 MHC class III genes have been described, which is about 28% of the total MHC genes (224). The region containing the TNF genes, previously considered part of the MHC class III gene cluster, is now known as MHC class IV or the inflammatory region.
Diversity-generating retroelements (DGRs) are a family of retroelements that were first found in Bordetella phage (BPP-1) and have since been found in bacteria, archaea, archaeal viruses, temperate phages, and lytic phages. DGRs benefit their host by mutating particular regions of specific target proteins, for instance, the phage tail fiber in BPP-1, a lipoprotein in Legionella pneumophila, and TvpA in Treponema denticola. An error-prone reverse transcriptase (RT) is responsible for generating these hypervariable regions in target proteins. In mutagenic retrohoming, a mutagenized cDNA is reverse transcribed from a template region (TR) and replaces a segment similar to the template region, called the variable region (VR). The accessory variability determinant (Avd) protein is another component of DGRs, and its complex formation with the error-prone RT is important for mutagenic retrohoming.