Company type | Subsidiary |
---|---|
Industry | Statistical software |
Predecessor | Revolution Computing |
Founded | 2007 |
Headquarters | Mountain View, CA, United States |
Key people | David Rich, CEO |
Products | Revolution R |
Revenue | $8–11 million (2009) |
Owner | Microsoft [1] |
Parent | Microsoft |
Website | revolutionanalytics |
Revolution Analytics (formerly REvolution Computing) is a statistical software company focused on developing open source and "open-core" [2] versions of the free and open source software R for enterprise, academic and analytics customers. Revolution Analytics was founded in 2007 as REvolution Computing, providing support and services for R in a model similar to Red Hat's approach with Linux in the 1990s, as well as bolt-on additions for parallel processing. In 2009 the company received nine million dollars in venture capital from Intel and a private equity firm, and named Norman H. Nie as its new CEO. In 2010 the company announced the name change as well as a change in focus: its core product, Revolution R, would be offered free to academic users, and its commercial software would focus on big data, large-scale multiprocessor (or "high performance") computing, and multi-core functionality.
Microsoft announced on January 23, 2015, that it had reached an agreement to purchase Revolution Analytics for an undisclosed amount. [3] [4] In 2021, Microsoft announced that it would retire the R distribution it had acquired from Revolution Analytics. [5] In 2023, Microsoft retired the Microsoft R Application Network, a proprietary package hosting service, similar to the Comprehensive R Archive Network, for packages acquired from Revolution Analytics (such as "ScaleR"). [6]
REvolution Computing was founded in New Haven, Connecticut, in 2007 by Richard Schultz, Martin Schultz, Steve Weston and Kirk Mettler. At the time Martin Schultz was also the Watson Professor of Computer Science at Yale University. [7] [8] Adding parallel computing to R allowed the company to achieve large speed gains for many common analytics operations, and early clients like Pfizer used REvolution R to improve the performance of R on computing clusters. [9] While the improvements to core R were released under the GNU General Public License (GPL), REvolution provided support and services to customers of its commercial product and had considerable early success with life sciences and pharmaceutical companies. [10] [11] A year later the company opened an additional office in Seattle. [12]
In 2009 REvolution Computing accepted nine million dollars in venture capital from Intel and North Bridge Venture Partners, a private equity firm. Intel had previously supported REvolution Computing with venture capital in 2008. [13] A number of Intel employees also joined Revolution Analytics as staff or as advisors. [9] Concurrently, the company changed its name to Revolution Analytics and invited Norman Nie, founder of SPSS, to serve as CEO. [14] [15] This change in management corresponded with a movement toward building a more complete set of software for commercial users; prior to 2009 Revolution had been focused on building parallel processing functionality into the then mostly single-threaded R. [16] David Rich replaced Norman Nie as CEO in February 2012. [17]
Unlike analytics products offered by SAS Institute, R does not natively handle datasets larger than main memory. In 2010 Revolution Analytics introduced ScaleR, a package for Revolution R Enterprise designed to handle big data through a high-performance disk-based data store called XDF (not related to IBM's Extensible Data Format) and high-performance computing across large clusters. [18] The release of ScaleR marked a push away from consulting and services alone toward custom code and à la carte package pricing. [19] ScaleR also works with Apache Hadoop and other distributed file systems, and Revolution Analytics has partnered with IBM to further integrate Hadoop into Revolution R. [20] [21] Packages to integrate Hadoop and MapReduce into open source R can also be found on the community package repository, CRAN. [22] [23]
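The general idea behind disk-based processing of data larger than main memory can be illustrated with a short sketch. The Python example below is not ScaleR's API and does not use the XDF format; the file layout and the function names (`iter_chunks`, `chunked_mean`, `write_demo_file`) are hypothetical and chosen only for illustration. It computes a column mean while holding just one fixed-size chunk of rows in memory at a time.

```python
import csv
import itertools
import os
import tempfile


def iter_chunks(path, chunk_rows):
    """Yield lists of at most chunk_rows row dicts, reading the file lazily."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        while True:
            chunk = list(itertools.islice(reader, chunk_rows))
            if not chunk:
                return
            yield chunk


def chunked_mean(path, column, chunk_rows=1000):
    """Mean of one numeric column, keeping only one chunk in memory at a time."""
    count = 0
    total = 0.0
    for chunk in iter_chunks(path, chunk_rows):
        # Each chunk contributes a partial count and a partial sum.
        count += len(chunk)
        total += sum(float(row[column]) for row in chunk)
    if count == 0:
        raise ValueError("no rows found")
    return total / count


def write_demo_file(n_rows):
    """Create a small temporary CSV with a column 'x' holding 1..n_rows."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["x"])
        for value in range(1, n_rows + 1):
            writer.writerow([value])
    return path


if __name__ == "__main__":
    demo_path = write_demo_file(10)
    try:
        print(chunked_mean(demo_path, "x", chunk_rows=3))  # prints 5.5
    finally:
        os.remove(demo_path)
```

Because the per-chunk partial sums are independent of one another, the same pattern could in principle be distributed across cores or cluster nodes and combined at the end.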
In comparison to developers of similar analytics tools, Revolution Analytics is a small company; in 2010 the company had a projected revenue of $8–11 million, but it published no official records of revenue or profit. [24] According to Nie, the increased use of R (a fully fledged programming language, in contrast to other analytics packages) within academia is helping the company to grow quickly. [25] [26] [27] [28] [29] Community vice president David Smith suggested that a general movement away from "black box" analytics toward open source tools favored vendors like Revolution over vendors of solely proprietary tools. [30]
Revolution Analytics' product Revolution R is available in three editions. Revolution R Open is a free and open source distribution of R with additional features for performance and reproducibility. Revolution R Plus provides technical support and open-source assurance (legal indemnification) subscriptions for Revolution R Open and other open-source components that work with R. (These products were first announced on October 15, 2014. [31]) Revolution R Enterprise adds proprietary components to support statistical analysis of big data, and is sold by subscription for workstations, servers, Hadoop and databases. (Single-user licenses are available free for academic users as well as users competing in Kaggle data mining competitions. [32] [33])
In January 2015 Microsoft rebranded and renewed several Revolution Analytics products and offerings for Hadoop, Teradata Database, SUSE Linux, Red Hat, and Microsoft Windows. Microsoft made several of these R-based products free of charge for developers.