Apache RocketMQ

Developer(s) Apache Software Foundation
Initial release 2012
Stable release 5.0.0 / September 9, 2022 [1]
Repository RocketMQ Repository
Written in Java
Operating system Cross-platform
Type Stream processing, Message broker
License Apache License 2.0
Website rocketmq.apache.org

RocketMQ [2] is a distributed messaging and streaming platform designed for low latency, high performance and reliability, trillion-level capacity and flexible scalability. It is the third generation of a distributed messaging middleware that Alibaba open-sourced in 2012. On November 21, 2016, Alibaba donated RocketMQ to the Apache Software Foundation. The following year, on February 20, the Apache Software Foundation announced Apache RocketMQ as a Top-Level Project.

History

The development of RocketMQ can be divided into three stages. [3]

The first generation used push mode for data transport and a relational database for data storage. It delivered messages with low latency and met the demands of a typical e-commerce platform [4] with distributed transactions.

The second generation used pull mode for data transport and the file system for data storage. It placed more emphasis on stability and reliability, with response times comparable to the first generation and log-collection performance comparable to Kafka.

The third generation combines pull mode with some push operations. It inherits the advantages of the first and second generations and performs well under high concurrency and with massive amounts of data.

Features

Many comparisons have been made between the various messaging solutions. In one multiple-topic stress test, the throughput of RocketMQ dropped much less than that of Kafka when the number of topics increased dramatically. [5] Because of its high performance, high reliability and real-time capability, increasing effort has gone into combining RocketMQ with other protocol components, such as MQTT, across messaging scenarios. [6]

Client SDK: Java, C/C++, Python, Go, Node.js
Protocol and Specification: Pull model; supports TCP, JMS and OpenMessaging
Ordered Message: Ensures strict ordering of messages and can scale out gracefully
Scheduled Message: Supported
Batched Message: Supported, with sync mode to avoid message loss
Broadcast Message: Supported
Message Filter: Supported; property filter expressions based on SQL92
Server Triggered Redelivery: Supported
Message Storage: High-performance, low-latency file storage
Message Retroactive: Supported, by timestamp or by offset
Message Priority: Not supported
High Availability and Failover: Supported; master-slave model, with no additional kit required
Message Track: Supported
Configuration: Works out of the box; users only need to pay attention to a few configurations
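One common way to obtain the strict ordering listed under "Ordered Message" is to send all messages that share a key to the same queue, so that each queue preserves their send order. The following is a minimal sketch of that idea, not the RocketMQ client API: the function names, the `order-N` keys and the use of CRC32 as the hash are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def select_queue(sharding_key: str, queue_count: int) -> int:
    """Map a key to a queue index with a stable hash (CRC32 is an arbitrary choice)."""
    return zlib.crc32(sharding_key.encode("utf-8")) % queue_count


def route(messages, queue_count):
    """Append each (key, body) message to the queue selected for its key."""
    queues = defaultdict(list)
    for key, body in messages:
        queues[select_queue(key, queue_count)].append((key, body))
    return queues


# Hypothetical order events, interleaved in send order.
messages = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "paid"),
    ("order-1", "shipped"),
]
queues = route(messages, queue_count=4)

# Every event of one order sits in a single queue, in the order it was sent.
order_1_queue = queues[select_queue("order-1", 4)]
order_1_events = [body for key, body in order_1_queue if key == "order-1"]
```

Ordering in this sketch holds per key rather than globally, which is what lets the number of queues grow without serializing unrelated messages.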

Architecture

[Figure: RocketMQ architecture (Rmq-structure.png)]

RocketMQ consists of four parts: name servers, brokers, producers and consumers, as shown in the figure. Each of them can be scaled horizontally without a single point of failure.

NameServer Cluster

Name servers are lightweight components for service discovery that are used to read and write routing information. Each name server records global information, and the cluster supports fast storage expansion.
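The read and write paths of such a routing registry can be sketched as follows. This is an illustrative toy, assuming that brokers report their topics to every name server and that clients may query any one of them; the class, its methods and the broker addresses are hypothetical and do not reflect the actual NameServer protocol.

```python
from collections import defaultdict


class NameServer:
    """Toy routing registry: topic -> set of broker addresses."""

    def __init__(self):
        self._routes = defaultdict(set)

    def register_broker(self, broker_addr, topics):
        """Write path: a broker reports the topics it hosts."""
        for topic in topics:
            self._routes[topic].add(broker_addr)

    def unregister_broker(self, broker_addr):
        """Remove a broker from every topic route."""
        for brokers in self._routes.values():
            brokers.discard(broker_addr)

    def lookup(self, topic):
        """Read path: a client asks which brokers serve a topic."""
        return sorted(self._routes.get(topic, ()))


# Two independent name servers; each broker registers with both (assumption),
# so either one can answer a lookup on its own.
name_servers = [NameServer(), NameServer()]
for ns in name_servers:
    ns.register_broker("broker-a:10911", ["orders", "payments"])
    ns.register_broker("broker-b:10911", ["orders"])

orders_routes = [ns.lookup("orders") for ns in name_servers]
```

Because every instance in the sketch holds the full routing table, losing one name server does not prevent clients from discovering brokers.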

Broker Cluster

Brokers use lightweight topic and queue mechanisms to manage data storage. For fault tolerance, two or three copies of the data are kept. Clients can receive messages in either the push or the pull model. Disaster recovery and rich metrics statistics are also supported.
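A rough model of the topic and queue mechanism, with data kept in more than one copy, might look like the sketch below. It assumes append-only queues addressed by offset and synchronous writes to every copy; both are simplifications chosen for the example, and all class and method names are hypothetical rather than taken from the broker implementation.

```python
class Broker:
    """Toy broker copy: each topic owns a fixed number of append-only queues."""

    def __init__(self, queues_per_topic=4):
        self.queues_per_topic = queues_per_topic
        self._topics = {}

    def _queue(self, topic, queue_id):
        queues = self._topics.setdefault(
            topic, [[] for _ in range(self.queues_per_topic)]
        )
        return queues[queue_id]

    def append(self, topic, queue_id, body):
        """Store a message and return its offset within the queue."""
        queue = self._queue(topic, queue_id)
        queue.append(body)
        return len(queue) - 1

    def pull(self, topic, queue_id, offset, max_count=32):
        """Return up to max_count messages from offset, plus the next offset."""
        queue = self._queue(topic, queue_id)
        batch = queue[offset:offset + max_count]
        return batch, offset + len(batch)


class ReplicatedBroker:
    """Writes every message to all copies (synchronous replication is an assumption)."""

    def __init__(self, copies=2):
        self.copies = [Broker() for _ in range(copies)]

    def append(self, topic, queue_id, body):
        offsets = [copy.append(topic, queue_id, body) for copy in self.copies]
        return offsets[0]

    def pull(self, topic, queue_id, offset, max_count=32, copy_index=0):
        return self.copies[copy_index].pull(topic, queue_id, offset, max_count)


broker = ReplicatedBroker(copies=2)
for body in ["m0", "m1", "m2"]:
    broker.append("orders", 0, body)

# A pull-model client tracks its own offset and asks for the next batch.
batch, next_offset = broker.pull("orders", 0, offset=0, max_count=2)
```

In this model a client that reads from the second copy with the same offset sees the same messages, which is the property that makes failover possible.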

Producer Cluster

Producers can be deployed in a distributed fashion, and messages from producers to brokers can be load-balanced across multiple paths. Fast failure and low latency are also supported.
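One plausible reading of this load balancing with fast failure is round-robin selection over the available queues, skipping brokers that recently failed. The sketch below illustrates that reading only; the `Producer` class, its methods and the broker names are hypothetical, and the real client's selection strategy may differ.

```python
import itertools


class Producer:
    """Toy producer that rotates over (broker, queue_id) pairs."""

    def __init__(self, queues):
        self._queues = list(queues)
        self._counter = itertools.count()
        self._unavailable = set()

    def mark_unavailable(self, broker):
        """Record a send failure so later sends avoid this broker."""
        self._unavailable.add(broker)

    def mark_available(self, broker):
        self._unavailable.discard(broker)

    def select_queue(self):
        """Round-robin over queues, skipping brokers marked unavailable."""
        for _ in range(len(self._queues)):
            index = next(self._counter) % len(self._queues)
            broker, queue_id = self._queues[index]
            if broker not in self._unavailable:
                return broker, queue_id
        raise RuntimeError("no available broker")


producer = Producer([("broker-a", 0), ("broker-a", 1), ("broker-b", 0), ("broker-b", 1)])
balanced = [producer.select_queue() for _ in range(4)]

# After a failure on broker-a, sends fail over to broker-b immediately.
producer.mark_unavailable("broker-a")
after_failure = [producer.select_queue() for _ in range(3)]
```

Skipping a failed broker at selection time, rather than retrying it, is what keeps the failure fast from the sender's point of view.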

Consumer Cluster

Consumers can likewise be deployed in a distributed fashion, in either the push or the pull model. They can subscribe to messages in real time and consume them as a cluster. Message broadcasting is also supported.
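The difference between consuming as a cluster and broadcasting can be sketched in terms of queue assignment. The example assumes that, in cluster mode, the queues of a topic are divided among the consumers so each message is handled once, while in broadcast mode every consumer reads every queue. The function names and the contiguous, near-equal split are illustrative choices, not a description of the client's actual allocation strategy.

```python
def allocate_clustering(queue_ids, consumer_ids):
    """Split queues among consumers in contiguous, near-equal shares."""
    consumers = sorted(consumer_ids)
    queues = sorted(queue_ids)
    base, extra = divmod(len(queues), len(consumers))
    assignment, start = {}, 0
    for index, consumer in enumerate(consumers):
        # The first `extra` consumers take one additional queue each.
        size = base + (1 if index < extra else 0)
        assignment[consumer] = queues[start:start + size]
        start += size
    return assignment


def allocate_broadcasting(queue_ids, consumer_ids):
    """Give every consumer every queue, so each one sees all messages."""
    return {consumer: sorted(queue_ids) for consumer in consumer_ids}


consumers = ["c1", "c2", "c3"]
cluster_view = allocate_clustering(range(4), consumers)
broadcast_view = allocate_broadcasting(range(4), consumers)
```

In the cluster view no queue appears twice, so adding consumers spreads the load; in the broadcast view adding consumers adds readers without reducing the work each one does.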

Applications

There are at least five aspects Apache RocketMQ could relate to:

Community Maintenance

The RocketMQ team has done much to keep the community active. Meetups, workshops, ApacheCon sessions and code marathons are regularly held in Beijing, Shenzhen and Hangzhou to attract new contributors and committers. The OpenMessaging benchmarking suites are available for RocketMQ, which helps RocketMQ keep pace with the global standard for distributed messaging. [7] For version management, a series of standardized software development processes has been adopted. This section describes the project as of the 4.2.0 release, when 4.3.0 was in development; the stable release has since reached 5.0.0. [1]

Awards

2016 China's most popular open source software award

2017 China's most popular open source software award

16th CJK (China-Japan-South Korea) open source software outstanding technology award

2018 China's most popular open source software award

2019 China's most popular open source software award

See also

Related Research Articles

The Jakarta Messaging API is a Java application programming interface (API) for message-oriented middleware. It provides generic messaging models, able to handle the producer–consumer problem, that can be used to facilitate the sending and receiving of messages between software systems. Jakarta Messaging is a part of Jakarta EE and was originally defined by a specification developed at Sun Microsystems before being guided by the Java Community Process.

In computer science, message queues and mailboxes are software-engineering components typically used for inter-process communication (IPC), or for inter-thread communication within the same process. They use a queue for messaging – the passing of control or of content. Group communication systems provide similar kinds of functionality.

Message-oriented middleware (MOM) is software or hardware infrastructure supporting sending and receiving messages between distributed systems. MOM allows application modules to be distributed over heterogeneous platforms and reduces the complexity of developing applications that span multiple operating systems and network protocols. The middleware creates a distributed communications layer that insulates the application developer from the details of the various operating systems and network interfaces. APIs that extend across diverse platforms and networks are typically provided by MOM.

The Advanced Message Queuing Protocol (AMQP) is an open standard application layer protocol for message-oriented middleware. The defining features of AMQP are message orientation, queuing, routing, reliability and security.

Message broker: Computer program module

A message broker is an intermediary computer program module that translates a message from the formal messaging protocol of the sender to the formal messaging protocol of the receiver. Message brokers are elements in telecommunication or computer networks where software applications communicate by exchanging formally-defined messages. Message brokers are a building block of message-oriented middleware (MOM) but are typically not a replacement for traditional middleware like MOM and remote procedure call (RPC).

Fault-tolerant messaging, in the context of computer systems and networks, refers to a design approach and set of techniques aimed at ensuring reliable and continuous communication between components or nodes even in the presence of errors or failures. This concept is especially critical in distributed systems, where components may be geographically dispersed and interconnected through networks, making them susceptible to various potential points of failure.

HBase is an open-source non-relational distributed database modeled after Google's Bigtable and written in Java. It is developed as part of Apache Software Foundation's Apache Hadoop project and runs on top of HDFS or Alluxio, providing Bigtable-like capabilities for Hadoop. That is, it provides a fault-tolerant way of storing large quantities of sparse data.

A reliable multicast is any computer networking protocol that provides a reliable sequence of packets to multiple recipients simultaneously, making it suitable for applications such as multi-receiver file transfer.

A message queueing service is a message-oriented middleware (MOM) deployed in a compute cloud using a software-as-a-service model. Service subscribers access queues and/or topics to exchange data using point-to-point or publish-and-subscribe patterns.

HornetQ is an open-source asynchronous messaging project from JBoss. It is an example of Message-oriented middleware. HornetQ is an open source project to build a multi-protocol, embeddable, very high performance, clustered, asynchronous messaging system. During much of its development, the HornetQ code base was developed under the name JBoss Messaging 2.0.

RabbitMQ is an open-source message-broker software that originally implemented the Advanced Message Queuing Protocol (AMQP) and has since been extended with a plug-in architecture to support Streaming Text Oriented Messaging Protocol (STOMP), MQ Telemetry Transport (MQTT), and other protocols.

ZeroMQ is an asynchronous messaging library, aimed at use in distributed or concurrent applications. It provides a message queue, but unlike message-oriented middleware, a ZeroMQ system can run without a dedicated message broker; the zero in the name is for zero broker. The library's API is designed to resemble Berkeley sockets.

Apache ActiveMQ: Software message broker

Apache ActiveMQ is an open source message broker written in Java together with a full Java Message Service (JMS) client. It provides "Enterprise Features" which in this case means fostering the communication from more than one client or server. Supported clients include Java via JMS 1.1 as well as several other "cross language" clients. The communication is managed with features such as computer clustering and ability to use any database as a JMS persistence provider besides virtual memory, cache, and journal persistency.

MQTT is a lightweight, publish-subscribe, machine-to-machine network protocol for message queue/message queuing service. It is designed for connections with remote locations that have devices with resource constraints or limited network bandwidth, such as in the Internet of Things (IoT). It must run over a transport protocol that provides ordered, lossless, bi-directional connections, typically TCP/IP, but also possibly QUIC. It is an open OASIS standard and an ISO recommendation.

DataStax, Inc. is a real-time data for AI company based in Santa Clara, California. Its product Astra DB is a cloud database-as-a-service based on Apache Cassandra. DataStax also offers DataStax Enterprise (DSE), an on-premises database built on Apache Cassandra, and Astra Streaming, a messaging and event streaming cloud service based on Apache Pulsar. As of June 2022, the company has roughly 800 customers distributed in over 50 countries.

Apache Kafka is a distributed event store and stream-processing platform. It is an open-source system developed by the Apache Software Foundation written in Java and Scala. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Kafka can connect to external systems via Kafka Connect, and provides the Kafka Streams libraries for stream processing applications. Kafka uses a binary TCP-based protocol that is optimized for efficiency and relies on a "message set" abstraction that naturally groups messages together to reduce the overhead of the network roundtrip. This "leads to larger network packets, larger sequential disk operations, contiguous memory blocks [...] which allows Kafka to turn a bursty stream of random message writes into linear writes."

Apache Spark: Open-source data analytics cluster computing framework

Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters with implicit data parallelism and fault tolerance. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.

NATS is an open-source messaging system. The NATS server is written in the Go programming language. Client libraries to interface with the server are available for dozens of major programming languages. The core design principles of NATS are performance, scalability, and ease of use. The acronym NATS stands for Neural Autonomic Transport System.

Apache Apex

Apache Apex is a YARN-native platform that unifies stream and batch processing. It processes big data-in-motion in a way that is scalable, performant, fault-tolerant, stateful, secure, distributed, and easily operable.

References

  1. "Release Notes - Apache RocketMQ - Version 5.0.0". 9 September 2022. Retrieved 27 September 2022.
  2. "apache/rocketmq". GitHub. Retrieved 2018-05-25.
  3. "From Alibaba to Apache: RocketMQ's Past, Present, and Future". InfoQ. Retrieved 2018-06-26.
  4. Liao, Jianwei; Zhuang, Xiaodan; Fan, Renyi; Peng, Xiaoning (2017). "Toward a General Distributed Messaging Framework for Online Transaction Processing Applications". IEEE Access. 5: 18166–18178. Bibcode:2017IEEEA...518166L. doi:10.1109/ACCESS.2017.2717930.
  5. Cloud, Alibaba (2018-01-04). "Kafka vs. RocketMQ- Multiple Topic Stress Test Results". Medium. Retrieved 2018-07-08.
  6. Yue, Ma; Ruiyang, Yan; Jianwei, Sun; Kaifeng, Yao (2017). "A MQTT Protocol Message Push Server Based on RocketMQ". 2017 10th International Conference on Intelligent Computation Technology and Automation (ICICTA). pp. 295–298. doi:10.1109/ICICTA.2017.72. ISBN 978-1-5386-1230-9. S2CID 28825800.
  7. "The OpenMessaging Benchmark Framework". openmessaging.cloud. Retrieved 2018-07-08.