Tuberculosis Structural Genomics Consortium

The TB Structural Genomics Consortium (TBSGC) is a worldwide consortium of scientists, founded in 2000 as part of the Protein Structure Initiative, that is developing a foundation for tuberculosis diagnosis and treatment by determining the three-dimensional structures of proteins from M. tuberculosis. [1] [2] The consortium seeks to solve the structures of proteins that are of great interest to the TB biology community. [3] A major goal of the consortium is to assign a putative function to every open reading frame (ORF) in the M. tuberculosis genome.

Activities

As of June 2006, 82 TB protein structures had been determined, 15 of them since January 1, 2006. The database of linked structural and functional information built from these results can form a lasting basis for understanding M. tuberculosis pathogenesis and for structure-based drug design. [citation needed]

As of June 2006, the TB Structural Genomics Consortium consisted of 430 active members in 148 laboratories from 83 institutions across 15 countries. [3] Consortium laboratories are collectively responsible for 3.3% of all protein structures in the Protein Data Bank and have extensive records of methods development. Consortium members have carried out a pilot project on the structural genomics of a hyperthermophile, which identified bottlenecks in the structure determination process and led to the development of methodologies for high-throughput structure determination and analysis. The consortium has five core facilities (located at Lawrence Livermore National Laboratory, Los Alamos National Laboratory, Lawrence Berkeley National Laboratory, the University of California, Los Angeles, and Texas A&M University) that carry out an increasing fraction of routine tasks such as protein production, crystallization and X-ray data collection. Members of the consortium improve their productivity by sending materials to these facilities, receiving the resulting products or data, and reporting this activity to the consortium database, which helps to minimize redundant pursuit of the same targets. The resulting structural and functional information is publicly available. [citation needed]

The five core facilities available to consortium members provide services for cloning, expression, and purification of proteins, as well as crystallization of proteins and subsequent diffraction and data analysis of the resulting crystals. [4] In addition, a database has been developed to record all activity within the consortium. This database also tracks the movement of materials between members and allows up-to-the-minute status to be recorded and made available to all other members. [citation needed]

References

  1. Grabowski, Marek; Niedzialkowska, Ewa; Zimmerman, Matthew D.; Minor, Wladek (2016-03-01). "The impact of structural genomics: the first quindecennial". Journal of Structural and Functional Genomics. 17 (1): 1–16. doi:10.1007/s10969-016-9201-5. ISSN 1570-0267. PMC 4834271. PMID 26935210.
  2. Terwilliger, T. C.; Park, M. S.; Waldo, G. S.; Berendzen, J.; Hung, L.-W.; Kim, C.-Y.; Smith, C. V.; Sacchettini, J. C.; Bellinzoni, M.; Bossi, R.; De Rossi, E. (2003-08-01). "The TB structural genomics consortium: a resource for Mycobacterium tuberculosis biology". Tuberculosis. 83 (4): 223–249. doi:10.1016/S1472-9792(03)00051-9. ISSN 1472-9792. PMID 12906835.
  3. Chim, Nicholas; Habel, Jeff E.; Johnston, Jodie M.; Krieger, Inna; Miallau, Linda; Sankaranarayanan, Ramasamy; Morse, Robert P.; Bruning, John; Swanson, Stephanie; Kim, Haelee; Kim, Chang-Yub (March 2011). "The TB Structural Genomics Consortium: a decade of progress". Tuberculosis (Edinburgh, Scotland). 91 (2): 155–172. doi:10.1016/j.tube.2010.11.009. ISSN 1873-281X. PMC 3310434. PMID 21247804.
  4. Rupp, B.; Segelke, B. W.; Krupka, H. I.; Lekin, T.; Schäfer, J.; Zemla, A.; Toppani, D.; Snell, G.; Earnest, T. (2002-09-26). "The TB structural genomics consortium crystallization facility: towards automation from protein to electron density". Acta Crystallographica Section D: Biological Crystallography. 58 (10): 1514–1518. Bibcode:2002AcCrD..58.1514R. doi:10.1107/s0907444902014282. PMID 12351851. Retrieved 2022-03-05.