HTML5 audio

HTML5 Audio is a subject of the HTML5 specification, incorporating audio input, playback, and synthesis, as well as speech to text, in the browser. [1]

<audio> element

The <audio> element represents a sound, or an audio stream. [2] It is commonly used to play back a single audio file within a web page, showing a GUI widget with play/pause/volume controls.

In addition to the global attributes, the <audio> element accepts attributes including autoplay, controls, loop, muted, preload, and src.

Example: [3]
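A minimal sketch of typical markup, generated here from JavaScript so that the fallback order is explicit. The `audioMarkup` and `escapeAttr` helpers, the file names, and the fallback text are hypothetical and chosen for illustration; only the `<audio>` and `<source>` elements and the `controls` attribute come from the specification.

```javascript
// Hypothetical helper: escapes a value for use inside a double-quoted attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Hypothetical helper: builds <audio> markup with one <source> per candidate file.
// A browser walks the <source> children in order and uses the first it can play.
function audioMarkup(sources, fallbackText) {
  const children = sources.map(
    (s) => `  <source src="${escapeAttr(s.src)}" type="${escapeAttr(s.type)}">`
  );
  return ["<audio controls>", ...children, `  ${fallbackText}`, "</audio>"].join("\n");
}

// Assumed file names; an Ogg file is offered first, with MP3 as the fallback.
const markup = audioMarkup(
  [
    { src: "song.ogg", type: "audio/ogg" },
    { src: "song.mp3", type: "audio/mpeg" },
  ],
  "Your browser does not support the audio element."
);
console.log(markup);
```

The text placed after the `<source>` children is only rendered by browsers that do not recognize the `<audio>` element at all.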

Supported audio coding formats

The adoption of HTML5 audio, as with HTML5 video, has become polarized between proponents of free formats and proponents of patent-encumbered formats. In 2007, the W3C retracted the recommendation to use Vorbis from the specification, together with the recommendation to use Ogg Theora, citing the lack of a format accepted by all the major browser vendors.

Apple and Microsoft support the ISO/IEC-defined formats AAC and the older MP3. Mozilla and Opera support the free and open, royalty-free Vorbis format in Ogg and WebM containers, and criticize the patent-encumbered nature of MP3 and AAC, which are guaranteed to be “non-free”. Google has so far provided support for all common formats.

Most AAC files with finite length are wrapped in an MPEG-4 container (.mp4, .m4a), which is supported natively in Internet Explorer, Safari, and Chrome, and supported by the OS in Firefox and Opera. [5] Most AAC live streams with infinite length are wrapped in an Audio Data Transport Stream container (.aac, .adts), which is supported by Chrome, Safari, Firefox and Edge. [6] [7] [8]

Many browsers also support uncompressed PCM audio in a WAVE container. [9]

In 2012, the free and open, royalty-free Opus format was released and standardized by the IETF. It is supported by Mozilla, Google, Opera and Edge. [9] [10] [11] [12]

This table documents the current support for audio coding formats by the <audio> element.

Formats supported by different web browsers

| Format | Container | MIME type | Chrome | Internet Explorer | Edge | Firefox | Opera | Safari |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PCM | WAV | audio/wav | Yes | No | Yes | Yes, in v3.5 [citation needed] | Yes, in v11.00 | Yes, in v3.1 |
| MP3 | MP3 | audio/mpeg | Yes [13] | Yes, in IE9 | Yes | Yes, in v71 [14] | Yes [13] | Yes, in v3.1 |
| AAC | MP4 | audio/mp4 | Yes | Yes, in IE9 | Yes | From OS [note 1] | Yes | Yes |
| AAC | ADTS [note 2] | audio/aac, audio/aacp | Yes | No | Yes | From OS [note 1], in v45.0 [16] [17] | Yes | Yes |
| Vorbis | Ogg | audio/ogg | Yes, in v9 | No | In v79 [18]; in v17, with Web Media Extensions [19] | Yes, in v3.5 [4] | Yes, in v10.50 | With Xiph QuickTime Components (macOS 10.11 and earlier) |
| Vorbis | WebM | audio/webm | Yes | No | In v79 [18]; in v17, with Web Media Extensions [19] | Yes, in v4.0 [20] | Yes, in v10.60 | No |
| Opus | Ogg | audio/ogg | Yes, in v25 (in v31 for Windows) | No | In v79 [21]; in v17, with Web Media Extensions [19] | Yes, in v15.0 [22] | Yes, in v14 | No |
| Opus | WebM | audio/webm | Yes | No | In v79 [21]; in v17, with Web Media Extensions [19] | Yes, in v28.0 [23] | Yes | Yes, in Safari 15+ and macOS Monterey [24] [25] |
| Opus | CAF | audio/x-caf | No | No | No | No | No | Yes, in Safari 11 and macOS High Sierra |
| FLAC | FLAC | audio/flac | Yes, in v56 [26] | No | Yes, in v16 [27] | Yes, in v51 [28] | Yes | Yes, in v11 [29] |
| FLAC | Ogg | audio/ogg | Yes, in v56 [26] | No | In v79 [30]; in v17, with Web Media Extensions [19] | Yes, in v51 [28] | Yes | No |
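Because support differs between browsers as tabulated above, pages often probe the browser before choosing a file. The following is a minimal sketch of that idea: `pickSource`, the candidate list, and the file names are hypothetical, while the three possible answers ("", "maybe", "probably") are those returned by the standard `HTMLMediaElement.canPlayType()` method. The probe function is passed in as a parameter so the logic can be exercised outside a browser.

```javascript
// Ranks the three possible canPlayType() answers: "" (no), "maybe", "probably".
const CONFIDENCE = { "": 0, maybe: 1, probably: 2 };

// canPlayType: function with the same contract as HTMLMediaElement.canPlayType.
// candidates: array of { src, type } in order of preference (hypothetical shape).
// Returns the most confidently playable candidate, or null if none is playable.
function pickSource(canPlayType, candidates) {
  let best = null;
  let bestScore = 0;
  for (const candidate of candidates) {
    const score = CONFIDENCE[canPlayType(candidate.type)] || 0;
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}

const candidates = [
  { src: "clip.opus", type: 'audio/ogg; codecs="opus"' },
  { src: "clip.m4a", type: 'audio/mp4; codecs="mp4a.40.2"' },
  { src: "clip.mp3", type: "audio/mpeg" },
];

// In a browser: pickSource((t) => new Audio().canPlayType(t), candidates)
// Here a stub stands in for a browser that decodes MP3 and possibly AAC, but not Opus.
const stubCanPlayType = (type) =>
  type.startsWith("audio/mpeg") ? "probably" : type.startsWith("audio/mp4") ? "maybe" : "";

console.log(pickSource(stubCanPlayType, candidates).src); // clip.mp3
```

When two candidates receive the same answer, the earlier one in the list wins, so the list order expresses the author's preference.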

Web Audio API and MediaStream Processing API

The Web Audio API specification developed by W3C describes a high-level JavaScript API for processing and synthesizing audio in web applications. The primary paradigm is of an audio routing graph, where a number of AudioNode objects are connected together to define the overall audio rendering. The actual processing will primarily take place in the underlying implementation (typically optimized Assembly / C / C++ code), but direct JavaScript processing and synthesis is also supported. [31]
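The routing-graph paradigm can be illustrated with a minimal sketch that connects an oscillator to a gain node and then to the output. `buildToneGraph` and `makeFakeContext` are hypothetical names introduced here; the calls made on the context (`createOscillator`, `createGain`, `connect`, `destination`) are standard Web Audio API members. The fake context exists only so that the wiring can be checked where no browser is available.

```javascript
// Builds oscillator -> gain -> destination on any object exposing the
// AudioContext members used below (a real AudioContext in a browser).
function buildToneGraph(ctx, frequencyHz, volume) {
  const oscillator = ctx.createOscillator();
  const gain = ctx.createGain();
  oscillator.type = "sine";
  oscillator.frequency.value = frequencyHz;
  gain.gain.value = volume;
  oscillator.connect(gain);
  gain.connect(ctx.destination);
  return { oscillator, gain };
}

// Browser usage (typically after a user gesture, because of autoplay policies):
//   const ctx = new AudioContext();
//   const { oscillator } = buildToneGraph(ctx, 440, 0.2);
//   oscillator.start();
//   oscillator.stop(ctx.currentTime + 1);

// Hypothetical stand-in context that records which node each node feeds.
function makeFakeContext() {
  const node = (name) => ({
    name,
    outputs: [],
    connect(target) {
      this.outputs.push(target.name);
      return target;
    },
  });
  return {
    destination: node("destination"),
    createOscillator: () => ({ ...node("oscillator"), type: "", frequency: { value: 0 } }),
    createGain: () => ({ ...node("gain"), gain: { value: 1 } }),
  };
}

const fake = makeFakeContext();
const graph = buildToneGraph(fake, 440, 0.2);
console.log(graph.oscillator.outputs, graph.gain.outputs);
```

The JavaScript here only describes the graph; in a real browser the sample-by-sample rendering is carried out by the underlying implementation.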

Mozilla's Firefox browser has included a similar Audio Data API extension since version 4, implemented in 2010 [32] and released in 2011, but Mozilla warns that it is non-standard and deprecated, and recommends the Web Audio API instead. [33] Some JavaScript audio processing and synthesis libraries, such as Audiolet, support both APIs.

The W3C Audio Working Group is also considering the MediaStream Processing API specification developed by Mozilla. [34] In addition to audio mixing and processing, it covers more general media streaming, including synchronization with HTML elements, capture of audio and video streams, and peer-to-peer routing of such media streams. [35]

Web Speech API

The Web Speech API aims to provide an alternative input method for web applications (without using a keyboard). With this API, developers can give web apps the ability to transcribe voice to text from the computer's microphone. The recorded audio is sent to speech servers for transcription, after which the text is typed out for the user. The API itself is agnostic of the underlying speech recognition implementation and can support both server-based and embedded recognizers. [38] The HTML Speech Incubator group has proposed the implementation of audio-speech technology in browsers in the form of uniform, cross-platform APIs. The API contains both a speech input (recognition) API and a text-to-speech (synthesis) API. [39]

Google integrated this feature into Google Chrome in March 2011, letting its users search the web with their voice. [40]
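As a minimal sketch of voice input, the following uses the `SpeechRecognition` interface of the Web Speech API as later specified, which is not necessarily the form Chrome shipped in 2011. `startDictation`, the `onText` callback, and the `FakeRecognition` stub are hypothetical names introduced for illustration; the properties and events used (`lang`, `interimResults`, `onresult`, `start()`, `resultIndex`, `isFinal`, `transcript`) belong to the standard interface.

```javascript
// RecognitionCtor: a SpeechRecognition constructor, for example in a browser
//   window.SpeechRecognition || window.webkitSpeechRecognition
// onText: hypothetical callback that receives each finalized transcript.
function startDictation(RecognitionCtor, onText, lang = "en-US") {
  const recognition = new RecognitionCtor();
  recognition.lang = lang;
  recognition.interimResults = false; // deliver final results only
  recognition.onresult = (event) => {
    // Walk only the results that changed in this event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result.isFinal) onText(result[0].transcript);
    }
  };
  recognition.start();
  return recognition;
}

// Stub recognizer, used only to exercise the handler outside a browser.
// It immediately reports one final result with a single alternative.
class FakeRecognition {
  start() {
    const result = Object.assign(
      [{ transcript: "hello world", confidence: 0.9 }],
      { isFinal: true }
    );
    this.onresult({ resultIndex: 0, results: [result] });
  }
}

const heard = [];
startDictation(FakeRecognition, (text) => heard.push(text));
console.log(heard);
```

Passing the constructor in as a parameter keeps the sketch independent of vendor prefixes and of whether the recognizer runs on a server or on the device.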

Notes

  1. There is no native support for the AAC codec due to licensing reasons. Decoding of audio files requires the host OS to provide a compatible library. [15]
  2. An MPEG-4 file contains a header that includes metadata, followed by "tracks" which can include video as well as audio data, for example H.264-encoded video and AAC-encoded audio. ADTS, in contrast, is a streaming format consisting of a series of frames, each frame having a header followed by the AAC data. [7]
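The frame structure of ADTS described in note 2 can be illustrated with a minimal sketch that reads the fixed fields of a single frame header. `parseAdtsHeader` and the hand-built sample bytes are hypothetical; the bit layout follows the commonly documented ADTS header (a 12-bit syncword, then profile, sampling-frequency index, channel configuration, and a 13-bit frame length), and the sketch does not validate the optional CRC.

```javascript
// Sampling rates indexed by the 4-bit sampling_frequency_index field.
const SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                      22050, 16000, 12000, 11025, 8000, 7350];

// Reads the fixed fields of one ADTS frame header starting at `offset`.
// `bytes` is a Uint8Array. Returns null when fewer than 7 bytes remain
// or when the 12-bit syncword (0xFFF) is absent.
function parseAdtsHeader(bytes, offset = 0) {
  if (bytes.length < offset + 7) return null;
  const b = bytes.subarray(offset, offset + 7);
  if (b[0] !== 0xff || (b[1] & 0xf0) !== 0xf0) return null;
  const hasCrc = (b[1] & 0x01) === 0; // protection_absent == 0 means a CRC follows
  return {
    headerBytes: hasCrc ? 9 : 7,
    audioObjectType: (b[2] >> 6) + 1,
    sampleRate: SAMPLE_RATES[(b[2] >> 2) & 0x0f] ?? null,
    channelConfig: ((b[2] & 0x01) << 2) | (b[3] >> 6),
    // Frame length counts the header plus the AAC payload that follows it.
    frameLength: ((b[3] & 0x03) << 11) | (b[4] << 3) | (b[5] >> 5),
  };
}

// Hand-built header: AAC-LC, 44.1 kHz, stereo, 200-byte frame, no CRC.
const header = Uint8Array.from([0xff, 0xf1, 0x50, 0x80, 0x19, 0x1f, 0xfc]);
console.log(parseAdtsHeader(header));
```

A stream reader would advance by `frameLength` bytes after each header to reach the next frame, which is what allows a player to join a live stream at any frame boundary.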

References

  1. "Resources – Safari". Apple Developer. Retrieved 2022-11-18.
  2. "HTML5 audio element – W3C". Archived from the original on 2013-06-06. Retrieved 2013-07-02.
  3. "The Embed Audio element – HTML: HyperText Markup Language | MDN".
  4. "Firefox Notes - Desktop".
  5. "TechFans.net – Technology and Business News blog". TechFans.net. Retrieved 2022-11-18.
  6. "MP4 container · Issue #95 · karlheyes/icecast-kh". GitHub. Retrieved 2022-11-18.
  7. "Technical Note TN2236: High-Efficiency Advanced Audio Coding (HE-AAC)".
  8. "1224887 – Implement OpenMax IL AAC audio decoding client".
  9. "Media type and format guide: image, audio, and video content – Web media technologies | MDN". developer.mozilla.org.
  10. "September 11, 2012: Opus audio codec is now RFC6716, Opus 1.0.1 reference source released".
  11. "It's Opus, it rocks and now it's an audio codec standard! – Mozilla Hacks – the Web developer blog".
  12. "WebM, VP9 and Opus Support in Microsoft Edge – Microsoft Edge Dev Blog". blogs.windows.com. 18 April 2016. Retrieved 2017-03-22.
  13. "Enable mp3 support in Chromium". Google. Retrieved 2018-05-01.
  14. "Firefox 71.0 release notes". Mozilla. December 3, 2019.
  15. "Media type and format guide: image, audio, and video content". Mozilla Developer Network. Mozilla. Retrieved 2019-12-06.
  16. "1190341 - audio/aacp shoutcast is not supported".
  17. "1169212 - Create ADTSDemuxer, a MediaDataDemuxer".
  18. "Platform Status – Microsoft Edge Developer".
  19. "Introducing the Web Media Extension Package with OGG Vorbis and Theora support for Microsoft Edge". Microsoft Edge Dev Blog. Microsoft. December 5, 2017.
  20. "Firefox Notes - Desktop".
  21. "Platform Status – Microsoft Edge Developer". developer.microsoft.com.
  22. "Firefox Notes - Desktop".
  23. "Firefox 28.0, See All New Features, Updates and Fixes".
  24. Simmons, Jen (October 26, 2021). "New WebKit Features in Safari 15".
  25. "Apple Developer Documentation". developer.apple.com.
  26. "FLAC codec support for <audio> and WebAudio". Chrome Platform Status. Retrieved 2016-12-27.
  27. "Platform Status – Microsoft Edge Developer". developer.microsoft.com.
  28. "Firefox 51 for developers". Mozilla Developer Network. Retrieved 2016-12-27.
  29. Chaim Gartenberg (June 6, 2017). "Apple reportedly adds support for FLAC lossless audio in iOS 11". The Verge.
  30. "Platform Status – Microsoft Edge Developer".
  31. Chris Rogers (2012-03-15). "Web Audio API". W3C. Archived from the original on 2012-07-20. Retrieved 2012-07-04.
  32. "Audio Data API".
  33. "Introducing the Audio API extension". Mozilla Developer Network. Mozilla. 2012-03-05. Archived from the original on 2012-05-05. Retrieved 2012-07-04.
  34. "Audio Processing API". W3C. 2011-12-15. Archived from the original on 2012-06-14. Retrieved 2012-07-04.
  35. Robert O'Callahan (2012-05-31). "MediaStream Processing API". W3C. Retrieved 2012-07-04.
  36. "Web Audio API is now available in Chrome from Chris Rogers on 2011-02-01 (public-xg-audio@w3.org from February 2011)". lists.w3.org. Retrieved 2022-11-18.
  37. Scott Gilbertson (2011-09-19). "Chrome 14 Adds Better Audio, 'Native Client' Support". Webmonkey. Wired. Retrieved 2012-07-04.
  38. "API draft". Retrieved January 28, 2012.
  39. "HTML5 Speech API". Retrieved January 28, 2012.
  40. "Talking to your computer". Retrieved January 28, 2012.
  41. "Firefox 44 for developers – Mozilla | MDN". Retrieved March 9, 2016.
  42. "Firefox — Notes (45.0) — Mozilla". Retrieved March 9, 2016.
  43. "Web Speech API – Web APIs | MDN". Retrieved March 9, 2016.