CLC bio

From Wikipedia, the free encyclopedia
Company type Privately held company
Industry Bioinformatics
Founded 2005
Headquarters Aarhus, Denmark
Area served Denmark; Cambridge, Massachusetts; Tokyo; Taipei; Delhi
Products Software and consulting
Website http://www.clcbio.com (dead)

CLC bio was a bioinformatics software company that developed a software suite subsequently purchased by QIAGEN. [1]

History

CLC bio started commercial activities on January 1, 2005, headquartered in Aarhus, Denmark. Product development was partly funded through collaborations with researchers on grant-funded projects. [2] [3] By 2012, the company had additional offices in Cambridge, Massachusetts, Tokyo, Taipei, and Delhi, with staff drawn largely from research backgrounds (30% held a PhD), [4] and had built a user base of around 250,000 in both academic institutions and biotechnology companies. [5] [6]

CLC bio was acquired by QIAGEN in 2013 and merged into its bioinformatics research and development division with several other purchased platforms in 2014. [7] [8] [9]

Software

CLC bio's main activity was developing desktop (Mac OS X, Windows, and Linux), enterprise, and cloud software for the analysis of biological data. The company developed some of its own open source algorithms, as well as its own SIMD-accelerated implementations of several popular existing applications. In 2010, CLC bio was notable as the first commercial bioinformatics analysis platform to offer a graphical user interface for building, managing, and deploying analysis workflows, alongside command-line tools and SOAP and REST APIs; the ability to run containerized tools was added later. [10]

As capabilities were added, the software platform was eventually split into several themed Workbenches and plugins, each collecting features relevant to a different application area (e.g. pathway analysis, genomics, and other omics). Features include read mapping and de novo assembly of high-throughput sequencing data, whole-genome detection of SNPs and structural variations, ChIP-seq, RNA-Seq, small RNA analysis, genome finishing, microbial genomics, structural biology, and functions to analyze, visualize, and compare genomic, transcriptomic, and epigenomic data.

Cloud Computing

In 2017, CLC bio launched the CLC Genomics Cloud Engine [8], a command-line driven platform for cloud-based bioinformatics workflow execution on Amazon Web Services. In 2019, the platform was adapted and approved for use in AWS GovCloud. In 2020, CLC bio released a free plugin that enables workflow execution on AWS directly from the CLC Genomics Workbench desktop software. [citation needed]

Hardware

Early on, the company offered high-performance computing solutions developed in-house, which used FPGA technology to accelerate open source algorithms such as HMMER, Smith-Waterman, and ClustalW. These products are no longer under development.

Related Research Articles

Bioinformatics: Computational analysis of large, complex sets of biological data

Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, chemistry, physics, computer science, data science, computer programming, information engineering, mathematics and statistics to analyze and interpret biological data. The process of analyzing and interpreting data is sometimes referred to as computational biology; however, the distinction between the two terms is often disputed. To some, the term computational biology refers to building and using models of biological systems.

In bioinformatics, sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of a wide range of analytical methods to understand its features, function, structure, or evolution. It can be performed on the entire genome, transcriptome or proteome of an organism, and can also involve only selected segments or regions, like tandem repeats and transposable elements. Methodologies used include sequence alignment, searches against biological databases, and others.

Bioconductor is a free, open source and open development software project for the analysis and comprehension of genomic data generated by wet lab experiments in molecular biology.

Apache Taverna

Apache Taverna was an open source software tool for designing and executing workflows, initially created by the myGrid project under the name Taverna Workbench, then a project under the Apache incubator. Taverna allowed users to integrate many different software components, including WSDL SOAP or REST Web services, such as those provided by the National Center for Biotechnology Information, the European Bioinformatics Institute, the DNA Databank of Japan (DDBJ), SoapLab, BioMOBY and EMBOSS. The set of available services was not finite and users could import new service descriptions into the Taverna Workbench.

Galaxy (computational biology)

Galaxy is a scientific workflow, data integration, and data and analysis persistence and publishing platform that aims to make computational biology accessible to research scientists that do not have computer programming or systems administration experience. Although it was initially developed for genomics research, it is largely domain agnostic and is now used as a general bioinformatics workflow management system.

GeneMark is a generic name for a family of ab initio gene prediction algorithms and software programs developed at the Georgia Institute of Technology in Atlanta. Developed in 1993, the original GeneMark was used in 1995 as a primary gene prediction tool for annotation of the first completely sequenced bacterial genome, that of Haemophilus influenzae, and in 1996 for the first archaeal genome, that of Methanococcus jannaschii. The algorithm introduced inhomogeneous three-periodic Markov chain models of protein-coding DNA sequence, which became standard in gene prediction, as well as a Bayesian approach to gene prediction in two DNA strands simultaneously. Species-specific parameters of the models were estimated from training sets of sequences of known type. The major step of the algorithm computes, for a given DNA fragment, the posterior probabilities of its being "protein-coding" in each of six possible reading frames or being "non-coding". The original GeneMark was an HMM-like algorithm; it can be viewed as an approximation to the posterior decoding algorithm known from HMM theory, applied to an appropriately defined HMM of the DNA sequence.

UGENE: Computer software for bioinformatics

UGENE is computer software for bioinformatics. It works on personal computer operating systems such as Windows, macOS, or Linux. It is released as free and open-source software, under a GNU General Public License (GPL) version 2.

RNA-Seq: Lab technique in cellular biology

RNA-Seq is a technique that uses next-generation sequencing to reveal the presence and quantity of RNA molecules in a biological sample, providing a snapshot of gene expression in the sample, also known as the transcriptome.

QIAGEN Silicon Valley is a company based in Redwood City, California, USA, that develops software to analyze complex biological systems. QIAGEN Silicon Valley's first product, IPA, was introduced in 2003, and is used to help researchers analyze omics data and model biological systems. The software has been cited in thousands of scientific molecular biology publications and is one of several tools for systems biology researchers and bioinformaticians in drug discovery and institutional research.

Integromics was a global bioinformatics company headquartered in Granada, Spain and Madrid. The company had subsidiaries in the United States and United Kingdom, and distributors in 10 countries. Integromics specialised in bioinformatics software for data management and data analysis in genomics and proteomics. The company provided a line of products that serve gene expression, sequencing, and proteomics markets. Customers included genomic research centers, pharmaceutical companies, academic institutions, clinical research organizations, and biotechnology companies.

GenomeSpace is an environment for genomics software tools and applications. It helps users manage their analysis workflows involving multiple diverse tools, including web applications and desktop tools and facilitates the transfer of data between tools via automatic format conversion. Analyses can use data from local or cloud-based stores.

A bioinformatics workflow management system is a specialized form of workflow management system designed specifically to compose and execute a series of computational or data manipulation steps, or a workflow, that relate to bioinformatics.

geWorkbench: Genomic data analysis software

geWorkbench is an open-source software platform for integrated genomic data analysis. It is a desktop application written in the programming language Java. geWorkbench uses a component architecture. As of 2016, there are more than 70 plug-ins available, providing for the visualization and analysis of gene expression, sequence, and structure data.

BioBam Bioinformatics: Bioinformatics software company

BioBam is a bioinformatics software company located in Valencia, Spain selling software for the analysis of biological data. Its products are used by public and private research organizations around the world. The firm was founded in 2011 by Dr. Stefan Götz and Dr. Ana Conesa. The managing director of BioBam is Dr. Stefan Götz.

The BioCompute Object (BCO) project is a community-driven initiative to build a framework for standardizing and sharing computations and analyses generated from high-throughput sequencing (HTS). The project has since been standardized as IEEE 2791-2020, and the project files are maintained in an open source repository. The July 22, 2020 edition of the Federal Register announced that the FDA now supports the use of BioCompute in regulatory submissions, and the inclusion of the standard in the Data Standards Catalog for the submission of HTS data in NDAs, ANDAs, BLAs, and INDs to CBER, CDER, and CFSAN.

The German Network for Bioinformatics Infrastructure (de.NBI) is a national, academic and non-profit infrastructure initiated by the Federal Ministry of Education and Research, with funding for 2015-2021. The network provides bioinformatics services to users in life sciences research and biomedicine in Germany and Europe. The partners organize training events, courses and summer schools on tools, standards and compute services provided by de.NBI to help researchers exploit their data more effectively. From 2022, the network will be integrated into Forschungszentrum Jülich.

Nvidia Parabricks is a suite of free software for genome analysis developed by Nvidia, designed to deliver high throughput by using graphics processing unit (GPU) acceleration.

References

  1. "QIAGEN Acquires CLC Bio". Informatics from Technology Networks. Retrieved 2025-01-17.
  2. "CLC Bio to Participate in $15M EU Effort to Study Stem Cell Differentiation Mechanisms". GenomeWeb. 2012-02-17. Retrieved 2022-11-29.
  3. "EU Funds Development of Gene Regulation Software Suite". GenomeWeb. 2011-01-13. Retrieved 2022-11-29.
  4. "Company profile" (PDF). Archived from the original on 2012-05-04. Retrieved 2012-05-29.
  5. "Number of downloads". clcbio.com. 2012-02-05. Archived from the original on 2012-02-05.
  6. "Customers". clcbio.com. 2012-10-05. Archived from the original on 2012-10-05.
  7. "Bioinformatics Tools and Applications". QIAGEN Digital Insights. Retrieved 2022-11-29.
  8. "Latest Improvements". QIAGEN Digital Insights. Retrieved 2022-11-30.
  9. "About Us". QIAGEN Digital Insights. Retrieved 2022-11-30.
  10. "Latest Improvements". Bioinformatics Software and Services: QIAGEN Digital Insights. Retrieved 2020-05-07.