Hortonworks

Hortonworks, Inc.
Company type Subsidiary
Industry Computer software
Founded 2011
Headquarters Santa Clara, California, United States
Products Hortonworks Data Platform, Hortonworks DataFlow, Hortonworks DataPlane
Number of employees ~1,110 (2017) [1]
Parent Cloudera
Website Hortonworks.com

Hortonworks, Inc. was a data software company based in Santa Clara, California that developed and supported open-source software (primarily around Apache Hadoop) designed to manage big data and associated processing.


Hortonworks software was used to build enterprise data services and applications such as IoT (connected cars, for example), single view of X (such as customer, risk, patient), and advanced analytics and machine learning (such as next best action and real-time cybersecurity). Hortonworks had three interoperable product lines: Hortonworks Data Platform, Hortonworks DataFlow, and Hortonworks DataPlane.

In January 2019, Hortonworks completed its merger with Cloudera. [3]

History

Hortonworks was formed in June 2011 as an independent company, funded by $23 million in venture capital from Yahoo! and Benchmark Capital. Its first office was in Sunnyvale, California. [4] The company employed contributors to the open-source software project Apache Hadoop. [5] The Hortonworks Data Platform (HDP) product, first released in June 2012, [6] included Apache Hadoop and was used for storing, processing, and analyzing large volumes of data. The platform was designed to handle data from many sources and in many formats. It included Hadoop technology such as the Hadoop Distributed File System, MapReduce, Pig, Hive, HBase, ZooKeeper, and additional components. [7]

Eric Baldeschwieler (from Yahoo) was the initial chief executive, and Rob Bearden, formerly of SpringSource, was chief operating officer. Benchmark partner Peter Fenton was a board member. The company name refers to the character Horton the Elephant, since the elephant is the symbol for Hadoop. [4] [8]

In October 2018, Hortonworks and Cloudera announced an all-stock merger of equals. [9] After the merger, the Apache products of Hortonworks became Cloudera Data Platform.

Related Research Articles

Apache Hadoop is a collection of open-source software utilities for reliable, scalable, distributed computing. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Hadoop was originally designed for computer clusters built from commodity hardware, which is still the common use. It has since also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework.

Doug Cutting: American information theorist

Douglass Read Cutting is a software designer, advocate for, and creator of open-source search technology. He founded two technology projects, Lucene and Nutch, with Mike Cafarella. The Apache Software Foundation now manages both projects. Cutting and Cafarella were also co-founders of Apache Hadoop.

Horton the Elephant: Fictional character created by Dr. Seuss

Horton the Elephant is a fictional character from the 1940 book Horton Hatches the Egg and 1954 book Horton Hears a Who!, both by Dr. Seuss. He is also featured in the short story Horton and the Kwuggerbug, first published for Redbook in 1951 and later rediscovered by Charles D. Cohen and published in the 2014 anthology Horton and the Kwuggerbug and More Lost Stories. In all books and other media, Horton is characterized as a kind, sweet-natured, and naïve elephant who manages to overcome hardships.

Apache Solr: Open-source enterprise-search platform

Solr is an open-source enterprise-search platform, written in Java. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features and rich document handling. Providing distributed search and index replication, Solr is designed for scalability and fault tolerance. Solr is widely used for enterprise search and analytics use cases and has an active development community and regular releases.

Christophe Bisciglia is an American entrepreneur known for his work with big data and cloud computing. He is known for helping to popularize the MapReduce programming model while working at Google, and he co-founded Cloudera and WibiData.

Aster Data Systems was a data management and analysis software company headquartered in San Carlos, California. It was founded in 2005 and acquired by Teradata in 2011.

Cloudera, Inc. is an American data lake software company.

Within database management systems, the record columnar file or RCFile is a data placement structure that determines how to store relational tables on computer clusters. It is designed for systems using the MapReduce framework. The RCFile structure includes a data storage format, a data compression approach, and optimization techniques for data reading. It is able to meet all four requirements of data placement: (1) fast data loading, (2) fast query processing, (3) highly efficient storage space utilization, and (4) strong adaptivity to dynamic data access patterns.

MapR: American business software company

MapR was a business software company headquartered in Santa Clara, California. MapR software provides access to a variety of data sources from a single computer cluster, including big data workloads such as Apache Hadoop and Apache Spark, a distributed file system, a multi-model database management system, and event stream processing, combining analytics in real-time with operational applications. Its technology runs on both commodity hardware and public cloud computing services. In August 2019, following financial difficulties, the technology and intellectual property of the company were sold to Hewlett Packard Enterprise.

The Oracle data appliance consists of hardware and software from Oracle Corporation sold as a computer appliance. It was announced in 2011 and is used for consolidating and loading unstructured data into Oracle Database software.

Sqoop is a command-line interface application for transferring data between relational databases and Hadoop.

Apache Impala is an open source massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop. Impala has been described as the open-source equivalent of Google F1, which inspired its development in 2012.

WibiData was a software company that developed big data applications for enterprises to personalize their customer experiences. It developed applications based on the open-source technologies Apache Hadoop, Apache Cassandra, Apache HBase, Apache Avro and the Kiji Project. WibiData was founded under the name Odiago in 2010 by Christophe Bisciglia, Aaron Kimball, and Garrett Wu. Based in San Francisco, California, WibiData was backed by investors such as Canaan Partners, New Enterprise Associates, SV Angel, and Eric Schmidt.

Platfora, Inc. is a big data analytics company based in San Mateo, California. The firm’s software works with the open-source software framework Apache Hadoop to assist with data analysis, data visualization, and sharing.

PSSC Labs: American supercomputing solution company

PSSC Labs is a California-based company that provides supercomputing solutions in the United States and internationally. Its products include "high-performance" servers, clusters, workstations, and RAID storage systems for scientific research, government and military, entertainment content creators, developers, and private clouds. The company has implemented clustering software from NASA Goddard's Beowulf project in its supercomputers designed for bioinformatics, medical imaging, computational chemistry and other scientific applications.

Apache Phoenix is an open source, massively parallel, relational database engine supporting OLTP for Hadoop using Apache HBase as its backing store. Phoenix provides a JDBC driver that hides the intricacies of the NoSQL store, enabling users to create, delete, and alter SQL tables, views, indexes, and sequences; insert and delete rows singly and in bulk; and query data through SQL. Phoenix compiles queries and other statements into native NoSQL store APIs rather than using MapReduce, which enables the building of low-latency applications on top of NoSQL stores.

Apache Kylin: Open-source distributed analytics engine

Apache Kylin is an open source distributed analytics engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Hadoop and Alluxio supporting extremely large datasets.

Apache Parquet is a free and open-source column-oriented data storage format in the Apache Hadoop ecosystem. It is similar to RCFile and ORC, the other columnar-storage file formats in Hadoop, and is compatible with most of the data processing frameworks around Hadoop. It provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk.

Apache ORC: Column-oriented data storage format

Apache ORC is a free and open-source column-oriented data storage format. It is similar to the other columnar-storage file formats available in the Hadoop ecosystem, such as RCFile and Parquet. It is used by most data processing frameworks, including Apache Spark, Apache Hive, Apache Flink, and Apache Hadoop.

References

  1. "Hortonworks : Quick Facts - Hortonworks". Archived from the original on April 2, 2019. Retrieved July 15, 2017.
  2. "Hortonworks upgrades DataPlane Services". April 17, 2018. Retrieved April 30, 2018.
  3. "Jan 2019 Cloudera and Hortonworks Complete Planned Merger". Archived from the original on August 13, 2024. Retrieved August 13, 2024.
  4. Charles Babcock (June 29, 2011). "Hadoop Big Data Startup Spins Out Of Yahoo". Information Week. Archived from the original on July 4, 2011. Retrieved February 21, 2017.
  5. Sarah McBride; Alistair Barr (April 20, 2012). "Big-data investors look for the next Splunk". Reuters. Archived from the original on February 22, 2017. Retrieved February 21, 2017.
  6. "Hortonworks Announces General Availability of Hortonworks Data Platform". Hortonworks. June 12, 2012. Archived from the original on September 22, 2012.
  7. Joab Jackson (November 1, 2011). "HortonWorks Hones a Hadoop Distribution". PC World. Archived from the original on August 17, 2013. Retrieved October 14, 2013.
  8. Cade Metz (June 28, 2011). "Yahoo! seeds Hadoop startup on open source dream: Hortonworks hears a Big Data revolution". The Register. Retrieved February 21, 2017.
  9. "Cloudera and Hortonworks Announce Merger to Create World's Leading Next Generation Data Platform and Deliver Industry's First Enterprise Data Cloud". BusinessWire. October 3, 2018. Retrieved October 3, 2018.