Type | Private company |
---|---|
Industry | Software |
Founded | 9 July 2001[1] |
Headquarters | Cambridge, England |
Revenue | £9.1 million [1] (2017) |
Number of employees | 94 [1] (2017) |
Parent | IQVIA |
Website | www |
Linguamatics, headquartered in Cambridge, England, with offices in the United States and UK, is a provider of text mining systems through software licensing and services, primarily for pharmaceutical and healthcare applications. Founded in 2001, the company was purchased by IQVIA in January 2019. [2] [3]
The company develops enterprise search tools for the life sciences sector. [4] The core natural language processing engine (I2E) uses a federated architecture [5] to incorporate data from third-party resources. [6] Initially developed to be used interactively through a graphical user interface, the core software also has an application programming interface that can be used to automate searches. [7]
LabKey, [8] Penn Medicine, [9] Atrius Health [10] and Mercy [11] all use Linguamatics software to extract electronic health record data into data warehouses. Linguamatics software is used by 17 of the top 20 global pharmaceutical companies, by the US Food and Drug Administration, and by healthcare providers. [6]
The core software, "I2E", is used by a number of companies to either extend their own software or to publish their data.
Copyright Clearance Center uses I2E to produce searchable indexes of material that would otherwise be unsearchable due to copyright. [12]
Thomson Reuters produces Cortellis Informatics Clinical Text Analytics, which depends on I2E to make clinical data accessible and searchable. [6]
Pipeline Pilot can integrate I2E as part of a workflow. [13]
ChemAxon can be used alongside I2E to allow named entity recognition of chemicals within unstructured data. [14]
Data sources include MEDLINE, ClinicalTrials.gov, FDA Drug Labels, PubMed Central, and Patent Abstracts. [15]
Text mining, text data mining (TDM) or text analytics is the process of deriving high-quality information from text. It involves "the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources." Written resources may include websites, books, emails, reviews, and articles. High-quality information is typically obtained by devising patterns and trends by means such as statistical pattern learning. According to Hotho et al. (2005) we can distinguish between three different perspectives of text mining: information extraction, data mining, and a knowledge discovery in databases (KDD) process. Text mining usually involves the process of structuring the input text, deriving patterns within the structured data, and finally evaluation and interpretation of the output. 'High quality' in text mining usually refers to some combination of relevance, novelty, and interest. Typical text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling.
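The three stages named above (structuring the input text, deriving patterns, and evaluating the output) can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not how any particular product works: the corpus, the stopword list, and the function names `structure`, `derive_patterns`, and `evaluate` are all invented for illustration, and simple term co-occurrence counting stands in for the statistical pattern learning a real system would use.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus; the sentences are invented for illustration.
DOCS = [
    "Aspirin reduces fever and pain.",
    "Ibuprofen reduces pain and inflammation.",
    "Aspirin may cause stomach irritation.",
]

# Small, assumed stopword list; real systems use much larger ones.
STOPWORDS = {"and", "may", "the", "a", "of"}


def structure(text):
    """Stage 1: turn raw text into a structured form (a set of lowercase tokens)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def derive_patterns(token_sets):
    """Stage 2: count how often each pair of terms shares a document."""
    pairs = Counter()
    for tokens in token_sets:
        pairs.update(combinations(sorted(tokens), 2))
    return pairs


def evaluate(pairs, min_count=2):
    """Stage 3: keep only the pairs seen in at least `min_count` documents."""
    return {pair: n for pair, n in pairs.items() if n >= min_count}


if __name__ == "__main__":
    patterns = evaluate(derive_patterns(structure(d) for d in DOCS))
    for pair, count in sorted(patterns.items()):
        print(pair, count)
```

On this corpus the only pair that survives the threshold is "pain" with "reduces", which appears in two of the three documents. The `min_count` threshold is a crude stand-in for the relevance and novelty judgments mentioned above.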
UIMA, short for Unstructured Information Management Architecture, is an OASIS standard for content analytics, originally developed at IBM. It provides a component software architecture for the development, discovery, composition, and deployment of multi-modal analytics for the analysis of unstructured information and integration with search technologies.
Unstructured data is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well. This results in irregularities and ambiguities that make it difficult to understand using traditional programs as compared to data stored in fielded form in databases or annotated in documents.
Biomedical text mining refers to the methods and study of how text mining may be applied to texts and literature of the biomedical domain. As a field of research, biomedical text mining incorporates ideas from natural language processing, bioinformatics, medical informatics and computational linguistics. The strategies in this field have been applied to the biomedical literature available through services such as PubMed.
Health technology is defined by the World Health Organization as the "application of organized knowledge and skills in the form of devices, medicines, vaccines, procedures, and systems developed to solve a health problem and improve quality of lives". This includes pharmaceuticals, devices, procedures, and organizational systems used in the healthcare industry, as well as computer-supported information systems. In the United States, these technologies involve standardized physical objects, as well as traditional and designed social means and methods to treat or care for patients.
A Regional Health Information Organization, also called a Health Information Exchange Organization, is a multistakeholder organization created to facilitate a health information exchange (HIE) – the transfer of healthcare information electronically across organizations – among stakeholders of that region's healthcare system. The ultimate objective is to improve the safety, quality, and efficiency of healthcare as well as access to healthcare through the efficient application of health information technology. RHIOs are also intended to support secondary use of clinical data for research as well as institution/provider quality assessment and improvement. RHIO stakeholders include smaller clinics, hospitals, medical societies, major employers and payers.
IMS Health was an American company that provided information, services and technology for the healthcare industry. IMS stood for Intercontinental Medical Statistics. It was the largest vendor of U.S. physician prescribing data. IMS Health was founded in 1954 by Bill Frohlich and David Dubow with Arthur Sackler having a hidden ownership stake. In 2010, IMS Health was taken private by TPG Capital, CPP Investment Board and Leonard Green & Partners. The company went public on April 4, 2014, and began trading on the NYSE under the symbol IMS. IMS Health was headquartered in Danbury, Connecticut.
IQVIA, formerly Quintiles and IMS Health, Inc., is an American multinational company serving the combined industries of health information technology and clinical research. IQVIA is a provider of biopharmaceutical development, consulting and commercial outsourcing services, focused primarily on Phase I-IV clinical trials and associated laboratory and analytical services, including strategy and management consulting services. It has a network of more than 88,000 employees in more than 100 countries and a market capitalization of US$49 billion as of August 2021. As of 2017, IQVIA was reported to be one of the world's largest contract research organizations.
Ontotext is a software company with offices in Europe and USA. It is the semantic technology branch of Sirma Group. Its main domain of activity is the development of software based on the Semantic Web languages and standards, in particular RDF, OWL and SPARQL. Ontotext is best known for the Ontotext GraphDB semantic graph database engine. Another major business line is the development of enterprise knowledge management and analytics systems that involve big knowledge graphs. Those systems are developed on top of the Ontotext Platform that builds on top of GraphDB capabilities for text mining using big knowledge graphs.
BIOVIA is a software company headquartered in the United States, with representation in Europe and Asia. It provides software for chemical, materials and bioscience research for the pharmaceutical, biotechnology, consumer packaged goods, aerospace, energy and chemical industries.
Chemaxon is a cheminformatics and bioinformatics software development company, headquartered in Budapest with 250 employees. The company also has offices in Cambridge, San Diego, Basel and Prague, and it has distributors in China, India, Japan, South Korea, Singapore, and Australia.
Qlik [pronounced "klik"] provides a business analytics platform. The software company was founded in 1993 in Lund, Sweden and is now based in King of Prussia, Pennsylvania, United States. The company's main products are Qlik Replicate and Qlik Sense, both software for business intelligence and data integration.
KNIME, the Konstanz Information Miner, is a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining "Building Blocks of Analytics" concept. A graphical user interface and the use of JDBC allow the assembly of nodes blending different data sources, including preprocessing, for modeling, data analysis and visualization without, or with only minimal, programming.
Chemicalize is an online platform for chemical calculations, search, and text processing. It is developed and owned by ChemAxon and offers various cheminformatics tools in freemium model: chemical property predictions, structure-based and text-based search, chemical text processing, and checking compounds with respect to national regulations of different countries.
Apache cTAKES: clinical Text Analysis and Knowledge Extraction System is an open-source Natural Language Processing (NLP) system that extracts clinical information from electronic health record unstructured text. It processes clinical notes, identifying types of clinical named entities — drugs, diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named entity has attributes for the text span, the ontology mapping code, context, and negated/not negated.
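The kind of record described above, a named entity carrying a text span, an ontology code, and a negation flag, might be sketched in Python as follows. This is not the cTAKES API or its algorithm. The `Entity` class, the `extract` function, the two-term lexicon, and the placeholder codes are all hypothetical, and a same-sentence keyword check stands in for real negation detection. The context attribute is omitted for brevity.

```python
import re
from dataclasses import dataclass

# Hypothetical dictionary: surface term -> (entity type, placeholder code).
# The codes are made up and do not correspond to any real ontology.
LEXICON = {
    "fever": ("sign/symptom", "X-0001"),
    "aspirin": ("drug", "X-0002"),
}

# Assumed negation cues, checked in a short window before the term.
NEGATION_CUES = ("no", "denies", "without")


@dataclass
class Entity:
    text: str         # the matched surface form
    start: int        # character offset where the span begins
    end: int          # character offset just past the span
    entity_type: str
    code: str
    negated: bool


def extract(note):
    """Find lexicon terms in a note and flag those preceded by a negation cue."""
    entities = []
    lowered = note.lower()
    for term, (entity_type, code) in LEXICON.items():
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            # Restrict the search for cues to the current sentence.
            sentence_start = max(lowered.rfind(c, 0, match.start()) for c in ".!?;") + 1
            window = lowered[sentence_start:match.start()]
            # Look at up to three words before the term.
            preceding = re.findall(r"[a-z]+", window)[-3:]
            negated = any(word in NEGATION_CUES for word in preceding)
            entities.append(Entity(note[match.start():match.end()],
                                   match.start(), match.end(),
                                   entity_type, code, negated))
    return sorted(entities, key=lambda e: e.start)


if __name__ == "__main__":
    for entity in extract("Patient denies fever. Started aspirin today."):
        print(entity)
```

For the sample note, "fever" is flagged as negated because "denies" precedes it in the same sentence, while "aspirin" in the next sentence is not. Keeping character offsets on each entity lets downstream code map a finding back to its exact place in the source note.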
EMIS Health, formerly known as Egton Medical Information Systems, supplies electronic patient record systems and software used in primary care, acute care and community pharmacy in the United Kingdom. The company is based in Leeds. It claims that more than half of GP practices across the UK use EMIS Health software and holds number one or two market positions in its main markets. In June 2022 the company was acquired by Bordeaux UK Holdings II Limited, an affiliate of UnitedHealth’s Optum business for a 49% premium on EMIS’s closing share price.
Search as a service is a branch of software as a service (SaaS), focused on enterprise search or site-specific web search.
Spark NLP is an open-source text processing library for advanced natural language processing for the Python, Java and Scala programming languages. The library is built on top of Apache Spark and its Spark ML library.
deepset is an enterprise software vendor that provides developers with the tools to build production-ready natural language processing (NLP) systems. It was founded in 2018 in Berlin by Milos Rusic, Malte Pietsch, and Timo Möller. deepset authored and maintains the open source software Haystack and its commercial SaaS offering deepset Cloud.