Original author(s) | Justin Uberti, Peter Thatcher |
---|---|
Initial release | 2011 |
Stable release | 1.0 [1] / May 4, 2018 |
Repository | webrtc |
Written in | C++, [2] JavaScript |
Standard(s) | w3 |
License | BSD license [3] |
Website | webrtc |
WebRTC (Web Real-Time Communication) is a free and open-source project providing web browsers and mobile applications with real-time communication (RTC) via application programming interfaces (APIs). It allows audio and video communication and streaming to work inside web pages by allowing direct peer-to-peer communication, eliminating the need to install plugins or download native apps. [4]
Supported by Apple, Google, Microsoft, Mozilla, and Opera, WebRTC specifications have been published by the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF). [5] [6]
In May 2010, Google bought Global IP Solutions (GIPS), a VoIP and videoconferencing software company that had developed many components required for RTC, such as codecs and echo cancellation techniques. Google open-sourced the GIPS technology and engaged with relevant standards bodies at the IETF and W3C to ensure industry consensus. [7] [8] In May 2011, Google released an open-source project for browser-based real-time communication known as WebRTC. [9] This has been followed by ongoing work to standardize the relevant protocols in the IETF [10] and browser APIs in the W3C. [11]
In January 2011, Ericsson Labs built the first implementation of WebRTC using a modified WebKit library. [12] [13] In October 2011, the W3C published its first draft for the spec. [14] WebRTC milestones include the first cross-browser video call (February 2013) and the first cross-browser data transfers (February 2014). As of July 2014, Google Hangouts was "kind of" using WebRTC. [15]
The W3C draft API was based on preliminary work done in the WHATWG. [16] It was referred to as the ConnectionPeer API, and a pre-standards concept implementation was created at Ericsson Labs. [12] The WebRTC Working Group expects this specification to evolve significantly based on:
In November 2017, the WebRTC 1.0 specification transitioned from Working Draft to Candidate Recommendation. [20]
In January 2021, the WebRTC 1.0 specification transitioned from Candidate Recommendation to Recommendation. [5]
Major components of WebRTC include several JavaScript APIs:
getUserMedia: acquires the audio and video media (e.g., by accessing a device's camera and microphone). [21]
RTCPeerConnection: enables audio and video communication between peers. It performs signal processing, codec handling, peer-to-peer communication, security, and bandwidth management. [22]
RTCDataChannel: allows bidirectional communication of arbitrary data between peers. The data is transported using SCTP over DTLS. [23] It uses the same API as WebSockets and has very low latency. [24]
The WebRTC API also includes a statistics function:
getStats: allows the web application to retrieve a set of statistics about WebRTC sessions. These statistics are described in a separate W3C document. [25]
The WebRTC API includes no provisions for signaling, that is, discovering peers to connect to and determining how to establish connections among them. Applications use Interactive Connectivity Establishment (ICE) for connections and are responsible for managing sessions, possibly relying on any of Session Initiation Protocol (SIP), Extensible Messaging and Presence Protocol (XMPP), Message Queuing Telemetry Transport (MQTT), Matrix, or another protocol. Signaling may depend on one or more servers. [26] [27]
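Because signaling is left to the application, any message transport can carry the offer and answer between peers. The Python sketch below is a minimal in-memory stand-in for such a transport. The `SignalingRelay` class, its method names, the message layout, and the placeholder SDP strings are all assumptions made for illustration and are not part of WebRTC; a real deployment would typically use a server speaking one of the protocols named above.

```python
from collections import defaultdict, deque


class SignalingRelay:
    """Hypothetical in-memory mailbox standing in for a signaling server."""

    def __init__(self):
        # One FIFO queue of pending messages per peer id.
        self._inboxes = defaultdict(deque)

    def send(self, sender, recipient, kind, payload):
        """Queue a message (for example an offer, answer, or ICE candidate)."""
        self._inboxes[recipient].append(
            {"from": sender, "type": kind, "payload": payload}
        )

    def receive(self, peer):
        """Return the oldest pending message for `peer`, or None if empty."""
        inbox = self._inboxes[peer]
        return inbox.popleft() if inbox else None


def exchange_offer_answer(relay, caller, callee):
    """Walk through one offer/answer round trip using placeholder SDP."""
    relay.send(caller, callee, "offer", "<caller SDP placeholder>")
    offer = relay.receive(callee)
    # The callee replies to whoever sent the offer.
    relay.send(callee, offer["from"], "answer", "<callee SDP placeholder>")
    answer = relay.receive(caller)
    return offer, answer
```

The relay never inspects the payload, which mirrors the point made above: session descriptions and candidates are opaque to the signaling path, so the choice of transport is independent of the media connection.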
RFC 7874 requires implementations to provide PCMA/PCMU (RFC 3551), Telephone Event as DTMF (RFC 4733), and Opus (RFC 6716) audio codecs as minimum capabilities. The PeerConnection, data channel, and media capture browser APIs are detailed in the W3C specification.
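One way to see these minimum audio capabilities in practice is to inspect the audio section of a session description. The following Python sketch, written under the assumption that the SDP uses ordinary `m=` and `a=rtpmap:` lines, collects the audio codec names and reports which of the codecs listed above are absent. The function names and the sample SDP are hypothetical; payload types 0 and 8 are the static RFC 3551 assignments for PCMU and PCMA.

```python
# Minimum audio codecs named in the text above (compared in lowercase).
MANDATORY_AUDIO = {"opus", "pcmu", "pcma", "telephone-event"}

# Static RTP payload types from RFC 3551; these may appear without rtpmap lines.
STATIC_PAYLOAD_TYPES = {"0": "pcmu", "8": "pcma"}


def audio_codecs(sdp):
    """Return the set of lowercase codec names found in the audio section(s)."""
    codecs = set()
    in_audio = False
    for line in sdp.splitlines():
        if line.startswith("m="):
            in_audio = line.startswith("m=audio ")
            if in_audio:
                # m=audio <port> <proto> <payload type> ...
                for payload_type in line.split()[3:]:
                    if payload_type in STATIC_PAYLOAD_TYPES:
                        codecs.add(STATIC_PAYLOAD_TYPES[payload_type])
        elif in_audio and line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <name>/<clock rate>[/<channels>]
            _, encoding = line.split(" ", 1)
            codecs.add(encoding.split("/")[0].lower())
    return codecs


def missing_mandatory(sdp):
    """Return the mandatory audio codecs that the SDP does not offer."""
    return MANDATORY_AUDIO - audio_codecs(sdp)


# Hypothetical, heavily trimmed session description for illustration only.
EXAMPLE_SDP = "\r\n".join([
    "v=0",
    "m=audio 9 UDP/TLS/RTP/SAVPF 111 0 8 126",
    "a=rtpmap:111 opus/48000/2",
    "a=rtpmap:126 telephone-event/8000",
    "m=video 9 UDP/TLS/RTP/SAVPF 96",
    "a=rtpmap:96 VP8/90000",
])
```

Calling `missing_mandatory(EXAMPLE_SDP)` on this sample yields an empty set, since all four codecs appear in the audio section; the VP8 entry is ignored because it belongs to the video section.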
W3C is developing ORTC (Object Real-Time Communications) for WebRTC. [28]
WebRTC allows browsers to stream files directly to one another, reducing or entirely removing the need for server-side file hosting. WebTorrent uses a WebRTC transport to enable peer-to-peer file sharing using the BitTorrent protocol in the browser. [29] Some file-sharing websites use it to allow users to send files directly to one another in their browsers, although this requires the uploader to keep the tab open until the file has been downloaded. [30] [31] [32] A few CDNs, such as the Microsoft-owned Peer5, use the client's bandwidth to upload media to other connected peers, enabling each peer to act as an edge server. [33] [34]
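Sending a file between peers usually means cutting it into messages small enough for the transport and joining them again on the receiving side. The Python sketch below illustrates only that splitting and reassembly step; it does not use WebRTC itself. The 16 KiB chunk size is an assumed conservative value chosen for illustration, not a constant defined by WebRTC, and the function names are hypothetical.

```python
CHUNK_SIZE = 16 * 1024  # assumed message size for illustration, not a WebRTC constant


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Split `data` (bytes) into consecutive pieces of at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def reassemble(chunks):
    """Join chunks, received in order, back into the original payload."""
    return b"".join(chunks)
```

This sketch assumes the chunks arrive in order and without loss; an application using an unordered or unreliable channel would also need sequence numbers and retransmission, which are omitted here.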
Although initially developed for web browsers, WebRTC has applications for non-browser devices, including mobile platforms and IoT devices. Examples include browser-based VoIP telephony, also called cloud phones or web phones, which allow calls to be made and received from within a web browser, replacing the requirement to download and install a softphone. [35]
WebRTC is supported by the following browsers (incomplete list; oldest supported version specified):
WebRTC establishes a standard set of codecs which all compliant browsers are required to implement. Some browsers may also support other codecs. [43]
Video codecs:

Codec name | Profile | Browser compatibility |
---|---|---|
H.264 | Constrained Baseline (CB) | Chrome (52+), Firefox[1], Safari |
VP8 | - | Chrome, Firefox, Safari (12.1+) [44] |
VP9 | - | Chrome (48+), Firefox |

Audio codecs:

Codec name | Browser compatibility |
---|---|
Opus | Chrome, Firefox, Safari |
G.711 PCM (A-law) | Chrome, Firefox, Safari |
G.711 PCM (μ-law) | Chrome, Firefox, Safari |
G.722 | Chrome, Firefox, Safari |
iLBC | Chrome, Safari |
iSAC | Chrome, Safari |
In January 2015, TorrentFreak reported a serious security flaw in browsers supporting WebRTC that compromised the security of VPN tunnels by exposing a user's true IP address. [45] The IP address read requests are not visible in the browser's developer console, and they are not blocked by most ad blocking, privacy, and security add-ons, enabling online tracking despite precautions. [46]
It has been reported that the cause of the address leak is not a bug that can be patched but is foundational to the way WebRTC operates; however, several mitigations exist. WebRTC leakage can be tested for, and solutions are offered for most browsers. [47] WebRTC can be disabled, if not required, in most browsers. The uBlock Origin add-on can also prevent the leak; because some browsers now address the problem themselves, this option has been disabled on those browsers from uBlock Origin v1.38 onwards. [48]
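The addresses in question surface in the ICE candidates a browser gathers while setting up a connection: a `host` candidate carries a local interface address, and a `srflx` (server-reflexive) candidate carries the address observed by a STUN server. As a rough sketch of what a leak test looks for, the Python code below parses a candidate line and classifies its address. The function names and the sample candidate strings are hypothetical, and the classification covers only RFC 1918 IPv4 ranges for simplicity.

```python
import ipaddress

# RFC 1918 private IPv4 ranges; other special-purpose ranges are ignored here.
RFC1918_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def parse_candidate(line):
    """Extract address, port, and type from an ICE candidate line.

    Expected layout:
    candidate:<foundation> <component> <transport> <priority>
              <address> <port> typ <type> [extensions...]
    """
    if line.startswith("a="):
        line = line[2:]  # tolerate the SDP attribute prefix
    fields = line.split()
    if (len(fields) < 8 or not fields[0].startswith("candidate:")
            or fields[6] != "typ"):
        raise ValueError("not an ICE candidate line: %r" % line)
    return {"address": fields[4], "port": int(fields[5]), "type": fields[7]}


def classify_address(address):
    """Return 'private', 'other', or 'hostname' for a candidate address."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not an IP literal, for example an obfuscated name ending in .local.
        return "hostname"
    if ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS):
        return "private"
    return "other"
```

For a hypothetical line such as `candidate:1 1 udp 2122260223 192.168.1.2 46154 typ host`, `parse_candidate` returns the address `192.168.1.2` with type `host`, and `classify_address` labels it `private`. An address labeled `other` on a `srflx` candidate is the kind of value a VPN user would not expect a web page to learn.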