Media Resource Control Protocol

Last updated

Media Resource Control Protocol (MRCP) is a communication protocol used by speech servers to provide various services (such as speech recognition and speech synthesis) to their clients. MRCP relies on another protocol, such as Real Time Streaming Protocol (RTSP) or Session Initiation Protocol (SIP) for establishing a control session and audio streams between the client and the server.

MRCP uses a similar style of clear-text signaling as HTTP and many other Internet protocols, in which each message contains 3 sections: a first-line, a header and a body. The first line indicates the type of message as well as information such as response codes. The header contains a number of lines, each in the format <header>: <data>. The body, whose length is specified by the header, contains the details of the message.

Like HTTP, MRCP uses a request (usually issued by the client) and response model. Responses may simply acknowledge receipt of the request or give other information regarding its processing. For example, an MRCP client may request to send some audio data for processing (say, for speech recognition), to which the server could respond with a message containing a suitable port number to send the data, since MRCP does not have support for audio data specifically as this would have to be handled by some other protocol, such as Real-time Transport Protocol (RTP).

MRCP protocol version 2 has been approved as an RFC. Version 2 uses SIP for managing sessions and audio streams between the server and the clients, whereas version 1 did not specify the underlying protocol.

MRCP has been adopted by a wide range of commercial speech servers, such as Verbio Technologies, Vernacular.ai's VIVA, Microsoft Speech Server, LumenVox Speech Engine, ReadSpeaker speechServer MRCP, Nuance Recognizer and Vocalizer, Sestek TTS, Sestek Call Steering as well as commercial Interactive Voice Response software such as Blueworx Voice Response..


Related Research Articles

Hypertext Transfer Protocol Application protocol for distributed, collaborative, hypermedia information systems

The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, where hypertext documents include hyperlinks to other resources that the user can easily access, for example by a mouse click or by tapping the screen in a web browser.

Multipurpose Internet Mail Extensions (MIME) is an Internet standard that extends the format of email messages to support text in character sets other than ASCII, as well as attachments of audio, video, images, and application programs. Message bodies may consist of multiple parts, and header information may be specified in non-ASCII character sets. Email messages with MIME formatting are typically transmitted with standard protocols, such as the Simple Mail Transfer Protocol (SMTP), the Post Office Protocol (POP), and the Internet Message Access Protocol (IMAP).

The Real Time Streaming Protocol (RTSP) is an application-level network protocol designed for multiplexing and packetizing multimedia transport streams over a suitable transport protocol. RTSP is used in entertainment and communications systems to control streaming media servers. The protocol is used for establishing and controlling media sessions between endpoints. Clients of media servers issue commands such as play, record and pause, to facilitate real-time control of the media streaming from the server to a client or from a client to the server.

The Real-time Transport Protocol (RTP) is a network protocol for delivering audio and video over IP networks. RTP is used in communication and entertainment systems that involve streaming media, such as telephony, video teleconference applications including WebRTC, television services and web-based push-to-talk features.

The Session Initiation Protocol (SIP) is a signaling protocol used for initiating, maintaining, and terminating real-time sessions that include voice, video and messaging applications. SIP is used for signaling and controlling multimedia communication sessions in applications of Internet telephony for voice and video calls, in private IP telephone systems, in instant messaging over Internet Protocol (IP) networks as well as mobile phone calling over LTE (VoLTE).

The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite. It originated in the initial network implementation in which it complemented the Internet Protocol (IP). Therefore, the entire suite is commonly referred to as TCP/IP. TCP provides reliable, ordered, and error-checked delivery of a stream of octets (bytes) between applications running on hosts communicating via an IP network. Major internet applications such as the World Wide Web, email, remote administration, and file transfer rely on TCP, which is part of the Transport Layer of the TCP/IP suite. SSL/TLS often runs on top of TCP.

In computer networking, the User Datagram Protocol (UDP) is one of the core members of the Internet protocol suite. With UDP, computer applications can send messages, in this case referred to as datagrams, to other hosts on an Internet Protocol (IP) network. Prior communications are not required in order to set up communication channels or data paths.

The File Transfer Protocol (FTP) is a standard communication protocol used for the transfer of computer files from a server to a client on a computer network. FTP is built on a client–server model architecture using separate control and data connections between the client and the server. FTP users may authenticate themselves with a clear-text sign-in protocol, normally in the form of a username and password, but can connect anonymously if the server is configured to allow it. For secure transmission that protects the username and password, and encrypts the content, FTP is often secured with SSL/TLS (FTPS) or replaced with SSH File Transfer Protocol (SFTP).

STUN is a standardized set of methods, including a network protocol, for traversal of network address translator (NAT) gateways in applications of real-time voice, video, messaging, and other interactive communications.

H.248

The Gateway Control Protocol is an implementation of the media gateway control protocol architecture for providing telecommunication services across a converged internetwork consisting of the traditional public switched telephone network (PSTN) and modern packet networks, such as the Internet. H.248 is the designation of the recommendations developed by the ITU Telecommunication Standardization Sector (ITU-T) and Megaco is a contraction of media gateway control protocol used by the earliest specifications by the Internet Engineering Task Force (IETF). The standard published in March 2013 by ITU-T is entitled H.248.1: Gateway control protocol: Version 3.

A session border controller (SBC) is a network element deployed to protect SIP based voice over Internet Protocol (VoIP) networks.

Interactive Connectivity Establishment (ICE) is a technique used in computer networking to find ways for two computers to talk to each other as directly as possible in peer-to-peer networking. This is most commonly used for interactive media such as Voice over Internet Protocol (VoIP), peer-to-peer communications, video, and instant messaging. In such applications, communicating through a central server would be slow and expensive, but direct communication between client applications on the Internet is very tricky due to network address translators (NATs), firewalls, and other network barriers.

Real-Time Messaging Protocol (RTMP) is a communication protocol for streaming audio, video, and data over the Internet. Originally developed as a proprietary protocol by Macromedia for streaming between Flash Player and a server, Adobe has released an incomplete version of the specification of the protocol for public use.

Chunked transfer encoding is a streaming data transfer mechanism available in version 1.1 of the Hypertext Transfer Protocol (HTTP). In chunked transfer encoding, the data stream is divided into a series of non-overlapping "chunks". The chunks are sent out and received independently of one another. No knowledge of the data stream outside the currently-being-processed chunk is necessary for both the sender and the receiver at any given time.

The Media Gateway Control Protocol (MGCP) is a signaling and call control communications protocol used in voice over IP (VoIP) telecommunication systems. It implements the media gateway control protocol architecture for controlling media gateways connected to the public switched telephone network (PSTN). The media gateways provide conversion of traditional electronic media to the Internet Protocol (IP) network. The protocol is a successor to the Simple Gateway Control Protocol (SGCP), which was developed by Bellcore and Cisco, and the Internet Protocol Device Control (IPDC).

The Session Initiation Protocol (SIP) is the signaling protocol selected by the 3rd Generation Partnership Project (3GPP) to create and control multimedia sessions with two or more participants in the IP Multimedia Subsystem (IMS), and therefore is a key element in the IMS framework.

JsSIP Library for JavaScript

JsSIP is a library for the programming language JavaScript. It takes advantage of SIP and WebRTC to provide a fully featured SIP endpoint in any website. JsSIP allows any website to get real-time communication features using audio and video. It makes it possible to build SIP user agents that send and receive audio and video calls as well as and text messages.