The Media Server Markup Language (MSML) is used to control and invoke many different types of services on IP Media Servers and is described in RFC 5707. [1] Clients can use it to define how multimedia sessions interact on a Media Server and to apply services to individuals or groups of users. MSML can be used, for example, to control Media Server conferencing features such as video layout and audio mixing, create sidebar conferences or personal mixes, and set the properties of media streams. Clients can also use MSML to define media processing dialogs, which may be used as parts of application interactions with users or conferences. Examples of such interactions, which are specified using MSML, include IVR dialogs and the transformation of media streams to and from users or conferences. MSML clients may also invoke dialogs with individual users or with groups of conference participants using VoiceXML.
The fundamental model of MSML is that the Media Server is an appliance specialized in controlling and manipulating media streams (usually RTP), while the application server is a separate unit that sets up and tears down call connections and controls the application (or business) logic. For example, the application server would deal with the billing engine and logging systems. The application server establishes a control 'tunnel' (through SIP or IP), which it uses to exchange requests and responses with the media server. In the case of MSML media servers, the messages are coded in MSML, a control language that uses XML syntax. MSML is designed so that an application server can interact with a number of different media servers at the same time, and these can be distributed across a wide geography as long as they are reachable via IP. The converse is also true: a media server can have more than one application server talking to it, which allows for resilience to failure.
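As a rough illustration of the kind of XML request an application server might send over the control channel, the following Python sketch builds a small MSML-style document that creates a conference and joins one call leg to it. The element and attribute names follow the general shape of RFC 5707 but should be checked against the specification. The function name `build_conference_request`, the conference name and the connection identifier are hypothetical, and the transport of the message to the media server is not shown.

```python
import xml.etree.ElementTree as ET


def build_conference_request(conf_name: str, connection_id: str) -> str:
    """Return an MSML-style request as an XML string (illustrative only)."""
    # Root element; the version value is an assumption for this sketch.
    root = ET.Element("msml", {"version": "1.1"})

    # Ask the media server to create a named conference with an audio mixer.
    conference = ET.SubElement(root, "createconference", {"name": conf_name})
    ET.SubElement(conference, "audiomix")

    # Join an existing connection (call leg) to the new conference.
    ET.SubElement(
        root,
        "join",
        {"id1": f"conn:{connection_id}", "id2": f"conf:{conf_name}"},
    )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical identifiers chosen by the application server.
    print(build_conference_request("demo", "abc123"))
```

In this sketch the application server only composes the request. Call setup, billing and logging would remain in the application server's own logic, in line with the separation described above.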
MSML was originally created by Convedia (now part of RadiSys), and is an open standard, meaning that companies can use the technology without licensing intellectual property. A number of companies have adopted MSML, including Intel (now Dialogic), NMS and AudioCodes.
MSML covers some of the same ground as the earlier MSCML markup language (originally from Snowshore), and both languages are important references for the IETF MediaCTRL (media control) working group, which aims to standardize control of media servers. MSML creator Adnan Saleem acknowledged [2] that MSCML had "shown the way" for driving media servers via scripting, and so a family line can be seen from MSCML through MSML to today's MediaCTRL [3] working group at the IETF.
The HyperText Markup Language or HTML is the standard markup language for documents designed to be displayed in a web browser. It is often assisted by technologies such as Cascading Style Sheets (CSS) and scripting languages such as JavaScript.
The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, where hypertext documents include hyperlinks to other resources that the user can easily access, for example by a mouse click or by tapping the screen in a web browser.
The Lightweight Directory Access Protocol is an open, vendor-neutral, industry standard application protocol for accessing and maintaining distributed directory information services over an Internet Protocol (IP) network. Directory services play an important role in developing intranet and Internet applications by allowing the sharing of information about users, systems, networks, services, and applications throughout the network. As examples, directory services may provide any organized set of records, often with a hierarchical structure, such as a corporate email directory. Similarly, a telephone directory is a list of subscribers with an address and a phone number.
The Session Initiation Protocol (SIP) is a signaling protocol used for initiating, maintaining, and terminating communication sessions that include voice, video and messaging applications. SIP is used in Internet telephony, in private IP telephone systems, as well as mobile phone calling over LTE (VoLTE).
VBScript is an Active Scripting language developed by Microsoft that is modeled on Visual Basic. It allows Microsoft Windows system administrators to generate powerful tools for managing computers with error handling, subroutines, and other advanced programming constructs. It can give the user complete control over many aspects of their computing environment.
The World Wide Web (WWW), commonly known as the Web, is an information system enabling documents and other web resources to be accessed over the Internet.
VoiceXML (VXML) is a digital document standard for specifying interactive media and voice dialogs between humans and computers. It is used for developing audio and voice response applications, such as banking systems and automated customer service portals. VoiceXML applications are developed and deployed in a manner analogous to how a web browser interprets and visually renders the Hypertext Markup Language (HTML) it receives from a web server. VoiceXML documents are interpreted by a voice browser and in common deployment architectures, users interact with voice browsers via the public switched telephone network (PSTN).
Universal Plug and Play (UPnP) is a set of networking protocols that permits networked devices, such as personal computers, printers, Internet gateways, Wi-Fi access points and mobile devices to seamlessly discover each other's presence on the network and establish functional network services. UPnP is intended primarily for residential networks without enterprise-class devices.
Extensible Messaging and Presence Protocol is an open communication protocol designed for instant messaging (IM), presence information, and contact list maintenance. Based on XML, it enables the near-real-time exchange of structured data between two or more network entities. Designed to be extensible, the protocol offers a multitude of applications beyond traditional IM in the broader realm of message-oriented middleware, including signalling for VoIP, video, file transfer, gaming and other uses.
In computer and telecommunications networks, presence information is a status indicator that conveys the ability and willingness of a potential communication partner (for example, a user) to communicate. A user's client provides presence information via a network connection to a presence service, where it is stored in the user's personal availability record and can be made available for distribution to other users to convey the user's availability for communication. Presence information has wide application in many communication services and is one of the innovations driving the popularity of instant messaging and recent implementations of voice over IP clients.
Apache Cocoon, usually abbreviated as Cocoon, is a web application framework built around the concepts of Pipeline, separation of concerns, and component-based web development. The framework focuses on XML and XSLT publishing and is built using the Java programming language. Cocoon's use of XML is intended to improve compatibility of publishing formats, such as HTML and PDF. The content management systems Apache Lenya and Daisy have been created on top of the framework. Cocoon is also commonly used as a data warehousing ETL tool or as middleware for transporting data between systems.
Call Control eXtensible Markup Language (CCXML) is an XML standard designed to provide asynchronous event-based telephony support to VoiceXML. Its current status is a W3C recommendation, adopted May 10, 2011. Whereas VoiceXML is designed to provide a Voice User Interface to a voice browser, CCXML is designed to inform the voice browser how to handle the telephony control of the voice channel. The two XML applications are wholly separate, and neither requires the other in order to be implemented; however, they have been designed with interoperability in mind.
Web conferencing is used as an umbrella term for various types of online conferencing and collaborative services including webinars, webcasts, and web meetings. It is sometimes also used in the narrower sense of a peer-level web meeting, to distinguish it from the other types of collaborative sessions. The terminology related to these technologies is based on the standards for web conferencing, but individual organizations also follow their own usage practices, which serve as a further reference for term usage.
A user interface markup language is a markup language that renders and describes graphical user interfaces and controls. Many of these markup languages are dialects of XML and are dependent upon a pre-existing scripting language engine, usually a JavaScript engine, for rendering of controls and extra scriptability.
JSON is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays. It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers.
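A minimal sketch of that data model, using Python's standard `json` module: a record made of attribute-value pairs and an array is serialized to text and parsed back. The `participant` record and its field names are hypothetical and chosen only for illustration.

```python
import json

# A hypothetical record: attribute-value pairs plus an array of stream types.
participant = {"name": "alice", "muted": False, "streams": ["audio", "video"]}

# Serialize to human-readable text; sort_keys makes the key order stable.
encoded = json.dumps(participant, sort_keys=True)

# Parse the text back into native Python objects.
decoded = json.loads(encoded)

if __name__ == "__main__":
    print(encoded)
    print(decoded == participant)
```

The round trip preserves the structure because every value used here (string, boolean, list) has a direct JSON counterpart.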
A media server is a computer appliance or application software that stores digital media and makes it available over a network.
SCXML stands for State Chart XML: State Machine Notation for Control Abstraction. It is an XML-based markup language that provides a generic state-machine-based execution environment based on Harel statecharts.
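The idea of a state-machine execution environment can be sketched in a few lines of Python. This is not SCXML and does not use any SCXML library. It is a flat transition table with hypothetical state and event names, and it omits the hierarchy and parallelism that Harel statecharts add.

```python
# Transition table: (current state, event) -> next state.
# The states and events are hypothetical, loosely modelled on a phone call.
TRANSITIONS = {
    ("idle", "invite"): "ringing",
    ("ringing", "answer"): "connected",
    ("ringing", "cancel"): "idle",
    ("connected", "hangup"): "idle",
}


def step(state: str, event: str) -> str:
    """Return the next state, or stay in place if the event is not handled."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="idle"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state


if __name__ == "__main__":
    print(run(["invite", "answer"]))
```

Keeping the transitions in a table rather than in nested conditionals mirrors the declarative style of a state-chart document, where the machine is described as data and interpreted by a generic engine.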
The Media Server Control Markup Language (MSCML) is a protocol used in conjunction with the Session Initiation Protocol (SIP) to enable the delivery of advanced multimedia conferencing services over IP networks. The MSCML specification has been published by the IETF under RFC 4722, now obsoleted by the newer RFC 5022. MSCML was pioneered by the media server company Snowshore, now part of the Dialogic Corporation. MSCML built on ideas from the Netann protocol, and in turn inspired MSML. An IETF working group called MediaCTRL has now embarked on a standardization of media server scripting languages, drawing on these earlier efforts. Voice scripting protocols like VoiceXML and CCXML are also sources of inspiration, and in some cases need to be integrated with what media servers require to operate in the real world.
MARIA is a universal, declarative, multiple abstraction level, XML-based user interface markup language for modelling interactive applications in ubiquitous environments.
Multimodal Architecture and Interfaces is an open standard developed by the World Wide Web Consortium since 2005. It was published as a Recommendation of the W3C on October 25, 2012. The document is a technical report specifying a multimodal system architecture and its generic interfaces to facilitate integration and multimodal interaction management in a computer system. It has been developed by the W3C's Multimodal Interaction Working Group.