Apache Helix

Last updated
Apache Helix
Original author(s) LinkedIn
Developer(s) Contributors
Stable release
0.9.x0.9.9 / 11 May 2022;19 months ago (2022-05-11) [1]
1.x1.0.4 / 31 August 2020;3 years ago (2020-08-31) [2]
Repository Helix Repository
Written in Java
Type Cluster management framework
License Apache License 2.0
Website helix.apache.org

Apache Helix is an open-source cluster management framework developed by the Apache Software Foundation.

Contents

History

Helix is one of the several notable open source projects developed by LinkedIn. It is a stable cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of systems. [3]

Helix is one of several notable cluster management framework originally developed by LinkedIn apart from Apache Samza, Apache Kafka and Voldemort. [4] The origins of Helix lie in a distributed NoSQL called Espresso.

Enterprises that use Apache Helix

The following is a list of notable enterprises that have used or are using Helix:

See also

Related Research Articles

<span class="mw-page-title-main">Apache Nutch</span> Open source web crawler

Apache Nutch is a highly extensible and scalable open source web crawler software project.

Apache Jackrabbit is an open source content repository for the Java platform. The Jackrabbit project was started on August 28, 2004, when Day Software licensed an initial implementation of the Java Content Repository API (JCR). Jackrabbit was also used as the reference implementation of JSR-170, specified within the Java Community Process. The project graduated from the Apache Incubator on March 15, 2006, and is now a Top Level Project of the Apache Software Foundation.

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Hadoop was originally designed for computer clusters built from commodity hardware, which is still the common use. It has since also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework.

Apache ServiceMix is an open-source software project to implement a distributed enterprise service bus (ESB).

<span class="mw-page-title-main">Apache Solr</span> Open-source enterprise-search platform

Solr is an open-source enterprise-search platform, written in Java. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features and rich document handling. Providing distributed search and index replication, Solr is designed for scalability and fault tolerance. Solr is widely used for enterprise search and analytics use cases and has an active development community and regular releases.

<span class="mw-page-title-main">Apache CouchDB</span> Document-oriented NoSQL database

Apache CouchDB is an open-source document-oriented NoSQL database, implemented in Erlang.

<span class="mw-page-title-main">Apache Cassandra</span> Free and open-source database management system

Cassandra is a free and open-source, distributed, wide-column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients. Cassandra was designed to implement a combination of Amazon's Dynamo distributed storage and replication techniques combined with Google's Bigtable data and storage engine model.

The Apache Traffic Server (ATS) is a modular, high-performance reverse proxy and forward proxy server, generally comparable to Nginx and Squid. It was created by Inktomi, and distributed as a commercial product called the Inktomi Traffic Server, before Inktomi was acquired by Yahoo!.

<span class="mw-page-title-main">Apache ZooKeeper</span> System for distributed coordination

Apache ZooKeeper is an open-source server for highly reliable distributed coordination of cloud applications. It is a project of the Apache Software Foundation.

<span class="mw-page-title-main">Apache Storm</span> Open-source distributed stream processing

Apache Storm is a distributed stream processing computation framework written predominantly in the Clojure programming language. Originally created by Nathan Marz and team at BackType, the project was open sourced after being acquired by Twitter. It uses custom created "spouts" and "bolts" to define information sources and manipulations to allow batch, distributed processing of streaming data. The initial release was on 17 September 2011.

Apache Kafka is a distributed event store and stream-processing platform. It is an open-source system developed by the Apache Software Foundation written in Java and Scala. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Kafka can connect to external systems via Kafka Connect, and provides the Kafka Streams libraries for stream processing applications. Kafka uses a binary TCP-based protocol that is optimized for efficiency and relies on a "message set" abstraction that naturally groups messages together to reduce the overhead of the network roundtrip. This "leads to larger network packets, larger sequential disk operations, contiguous memory blocks [...] which allows Kafka to turn a bursty stream of random message writes into linear writes."

<span class="mw-page-title-main">Apache Spark</span> Open-source data analytics cluster computing framework

Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters with implicit data parallelism and fault tolerance. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.

<span class="mw-page-title-main">Apache Mesos</span> Software to manage computer clusters

Apache Mesos is an open-source project to manage computer clusters. It was developed at the University of California, Berkeley.

Perforce Software, Inc. is an American developer of software used for developing and running applications, including version control software, web-based repository management, developer collaboration, application lifecycle management, web application servers, debugging tools and agile planning software.

Apache MXNet is an open-source deep learning software framework that trains and deploys deep neural networks. It is scalable, allows fast model training, and supports a flexible programming model and multiple programming languages. The MXNet library is portable and can scale to multiple GPUs and machines. It was co-developed by Carlos Guestrin at the University of Washington.

GeoTrellis is an open source, geographic data processing library designed to work with large geospatial raster data sets. It is written in Scala and has an open-source Apache 2.0 license.

<span class="mw-page-title-main">Apache RocketMQ</span> Open-source stream processing platform

RocketMQ is a distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability. It is the third generation distributed messaging middleware open sourced by Alibaba in 2012. On November 21, 2016, Alibaba donated RocketMQ to the Apache Software Foundation. Next year, on February 20, the Apache Software Foundation announced Apache RocketMQ as a Top-Level Project.

<span class="mw-page-title-main">Apache Pinot</span> Open-source distributed data store

Apache Pinot is a column-oriented, open-source, distributed data store written in Java. Pinot is designed to execute OLAP queries with low latency. It is suited in contexts where fast analytics, such as aggregations, are needed on immutable data, possibly, with real-time data ingestion. The name Pinot comes from the Pinot grape vines that are pressed into liquid that is used to produce a variety of different wines. The founders of the database chose the name as a metaphor for analyzing vast quantities of data from a variety of different file formats or streaming data sources.

References

  1. "Apache Helix – Release Notes for Apache Helix 0.9.9" . Retrieved 27 September 2022.
  2. "Apache Helix – Release Notes for Apache Helix 1.0.4" . Retrieved 27 September 2022.
  3. "Apache Helix - Home". helix.apache.org.
  4. Asay, Matt (20 April 2016). "The secrets to LinkedIn's open source success".