Gapless playback

Gapless playback is the uninterrupted playback of consecutive audio tracks, such that relative time distances in the original audio source are preserved across track boundaries on playback. For this to be useful, no other artifacts (besides timing-related ones) should be introduced at track boundaries either. Gapless playback is common with compact discs, gramophone records, or tapes, but is not always available with other formats that employ compressed digital audio. The absence of gapless playback is a source of annoyance to listeners of music where tracks are meant to segue into each other, such as some classical music (opera in particular), progressive rock, concept albums, electronic music, and live recordings with audience noise between tracks.

Causes of gaps

Playback latency

Various software, firmware, and hardware components may together add up to a substantial delay when starting playback of a track. If this is not accounted for, the listener is left waiting in silence while the player fetches the next file (see hard disk access time), updates metadata, and decodes the whole first block before it has any data to feed the hardware buffer. The gap can be half a second or more, which is very noticeable in "continuous" music such as certain classical or dance genres. In extreme cases, the hardware is even reset between tracks, creating a very short "click".

To account for the whole chain of delays, the start of the next track should ideally already be decoded before the currently playing track finishes. The two decoded pieces of audio must be fed to the hardware continuously over the transition, as if the tracks were concatenated in software.
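The idea of feeding the hardware as if the tracks were concatenated can be illustrated with a minimal sketch. It assumes each track has already been decoded into a list of samples; the function name `continuous_blocks` and the fixed `block_size` are hypothetical, and a real player would decode incrementally rather than hold whole tracks in memory.

```python
from typing import Iterable, Iterator, List


def continuous_blocks(tracks: Iterable[List[int]], block_size: int) -> Iterator[List[int]]:
    """Yield fixed-size output blocks that ignore track boundaries.

    `tracks` is a sequence of already-decoded tracks (lists of samples).
    Samples left over at the end of one track are carried into the next
    block, so the output never pauses or pads at a track change.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    pending: List[int] = []
    for samples in tracks:
        pending.extend(samples)
        # Emit as many full blocks as are currently available.
        while len(pending) >= block_size:
            yield pending[:block_size]
            pending = pending[block_size:]
    if pending:
        # Only the very last block of the whole playlist may be short.
        yield pending


if __name__ == "__main__":
    # The first block spans the boundary between the two tracks.
    print(list(continuous_blocks([[1, 2, 3], [4, 5, 6, 7]], block_size=4)))
```

In this sketch the first emitted block contains the tail of one track and the head of the next, which is the property a gapless output buffer needs.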

Many older audio players on personal computers do not implement the required buffering to play gapless audio. Some of these rely on third-party gapless audio plug-ins to buffer output. Most recent players and newer versions of old players now support gapless playback directly.

Compression artifacts

Lossy audio compression schemes that are based on overlapping time/frequency transforms add a small amount of padding silence to the beginning and end of each track. These silences increase the playtime of the compressed audio data. [1] If not trimmed off upon playback, the two silences played consecutively over a track boundary will appear as a pause in the original audio content. Lossless formats are not prone to this problem.

For some audio formats (e.g. Ogg Vorbis), where the start and end are precisely defined, the padding is implicitly trimmed off in the decoding process. Other formats may require extra metadata for the player to achieve the same. The popular MP3 format defines no way to record the amount of delay or padding for later removal. [notes 1] Also, the encoder delay may vary from encoder to encoder, making automatic removal difficult. [2] Even if two tracks are decompressed and merged into a single track, a pause will usually remain between them.
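When the amount of encoder delay and end padding is known (for example from metadata written by the encoder), removing it from the decoded audio is a simple slice. The following is a minimal sketch under that assumption; the function name `trim_padding` and its parameters are hypothetical and do not correspond to any particular decoder's API.

```python
from typing import List, Sequence


def trim_padding(decoded: Sequence[int], encoder_delay: int, end_padding: int) -> List[int]:
    """Drop `encoder_delay` samples from the start and `end_padding` from the end.

    Both counts are assumed to be supplied by the caller, e.g. read from
    gapless metadata; this function does not try to detect them.
    """
    if encoder_delay < 0 or end_padding < 0:
        raise ValueError("delay and padding must be non-negative")
    if encoder_delay + end_padding > len(decoded):
        raise ValueError("delay plus padding exceeds the decoded length")
    return list(decoded[encoder_delay:len(decoded) - end_padding])


if __name__ == "__main__":
    # Two decoded tracks, each with 2 samples of delay and 1 of padding.
    first = trim_padding([0, 0, 5, 6, 7, 0], encoder_delay=2, end_padding=1)
    second = trim_padding([0, 0, 8, 9, 0], encoder_delay=2, end_padding=1)
    print(first + second)  # the trimmed tracks join without inserted silence
```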

CD recorded in TAO mode

Audio CDs can be recorded in either disc-at-once (DAO) or track-at-once (TAO) mode. The latter is more flexible, but has the drawback of inserting approximately 2 seconds of silence between tracks. DAO mode records the entire CD in one continuous session, without any pauses between tracks. [3] It is suitable when seamless playback with no interruptions between songs is wanted, and is commonly used for live recordings, DJ mixes, or concept albums where tracks blend into each other. [4]

Ways to eliminate the gaps

Precise gapless playback

As opposed to heuristic techniques, precise gapless playback usually means that playback timing is guaranteed to be identical to the source. By this definition, a precise gapless player is not allowed to introduce either gaps or overlaps (crossfading) between successive tracks, and is not allowed to use guesswork.

Apart from accounting for playback latency, the preciseness here lies in treating lossless data as-is, and removing the correct amount of padding from lossy data. This is not possible for file formats that have loosely defined encoder specifications and no metadata, and that therefore give encoders no way to record the duration of extraneous silence.

Approximate methods

Heuristics are used by some music players to detect silence between tracks and trim the audio as necessary on playback. Due to the loss of time resolution of lossy compression, this method is inexact. In particular, the silence is not exactly zero. If the silence threshold is too low, some silences go undetected. Too high, and entire sections of quiet music at the beginning or end of a track may be removed.
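The threshold trade-off can be seen in a minimal sketch of such a heuristic. It assumes a track is a list of integer samples and treats any sample whose absolute value is at or below `threshold` as silence; the function name `trim_silence` is hypothetical and real players may use more elaborate detection.

```python
from typing import List, Sequence


def trim_silence(samples: Sequence[int], threshold: int) -> List[int]:
    """Remove leading and trailing samples with |value| <= threshold."""
    start = 0
    while start < len(samples) and abs(samples[start]) <= threshold:
        start += 1
    end = len(samples)
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1
    return list(samples[start:end])


if __name__ == "__main__":
    # Near-silent padding (values 0..2) around louder content.
    track = [0, 1, -1, 200, 3, -300, 2, 0]
    print(trim_silence(track, threshold=2))    # padding removed
    print(trim_silence(track, threshold=0))    # too low: noisy padding survives
    print(trim_silence(track, threshold=500))  # too high: everything is removed
```

With a threshold of 0 only the exact zeros are removed and the low-level noise left by lossy compression stays in place, while a very high threshold discards the whole track.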

Digital signal processing (DSP) algorithms can also be used to crossfade between tracks. This eliminates gaps that some listeners find distracting, but also greatly alters the audio signal, which may have undesirable effects on the listening experience. Some listeners dislike these effects more than the gap they attempt to remove. For example, crossfading is inappropriate for files that are already gapless, in which case the transition may feel artificially short and disturb the rhythm. [5] Also, depending on the length of untrimmed silence and the particular crossfader, it may cause a large volume drop.
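As an illustration, the following sketch implements a simple linear crossfade, which is only one of many possible fade curves. The function name `crossfade` and the `overlap` parameter are hypothetical. Note that the result is shorter than the two tracks played back to back by exactly `overlap` samples, which is why crossfading files that are already gapless shortens the transition.

```python
from typing import List, Sequence


def crossfade(a: Sequence[float], b: Sequence[float], overlap: int) -> List[float]:
    """Linearly fade out the last `overlap` samples of `a` into the start of `b`."""
    if overlap < 0 or overlap > min(len(a), len(b)):
        raise ValueError("overlap must fit inside both tracks")
    head = list(a[:len(a) - overlap])
    mixed = []
    for i in range(overlap):
        # Gain for the incoming track rises from just above 0 to just below 1.
        gain_in = (i + 1) / (overlap + 1)
        gain_out = 1.0 - gain_in
        mixed.append(a[len(a) - overlap + i] * gain_out + b[i] * gain_in)
    return head + mixed + list(b[overlap:])


if __name__ == "__main__":
    first = [1.0] * 4
    second = [1.0] * 4
    out = crossfade(first, second, overlap=2)
    # The joined signal is 2 samples shorter than playing both tracks in full.
    print(len(first) + len(second), len(out))
```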

These methods defeat the purpose of intentional spacing between tracks. Not all albums are mix albums; perhaps more typically, there is an aesthetic pause between unrelated tracks. Also, the artist may intentionally leave in silences for dramatic effect, which should arguably be preserved regardless of whether there is a track boundary there.

Compared to precise gapless playback, these methods are a different approach to erroneous silence in audio files, but other required features are the same. However, this approach requires more computation. In portable digital audio players, this means a reduced playing time on batteries.

User workarounds

A common workaround is to encode consecutive tracks as one single file, relying on cue sheets (or something similar) for navigation. While this method results in gapless playback between the consecutive tracks, it can be unwieldy because of the possibly large size of the resulting compressed file. Furthermore, unless the playback software or hardware can recognize the cue sheets, navigating between tracks may be difficult.
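For navigation within such a single file, a player has to translate cue sheet positions into sample offsets. The sketch below assumes the common cue sheet timestamp form `MM:SS:FF`, where `FF` counts CD frames at 75 frames per second, and assumes CD audio at 44,100 samples per second, so that one frame is 588 samples. The function name `cue_time_to_samples` is hypothetical.

```python
SAMPLES_PER_CD_FRAME = 588   # 44,100 samples per second / 75 frames per second
FRAMES_PER_SECOND = 75


def cue_time_to_samples(timestamp: str) -> int:
    """Convert an MM:SS:FF cue sheet position to a sample offset."""
    minutes, seconds, frames = (int(part) for part in timestamp.split(":"))
    if minutes < 0 or not (0 <= seconds < 60) or not (0 <= frames < FRAMES_PER_SECOND):
        raise ValueError(f"invalid cue timestamp: {timestamp!r}")
    total_frames = (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames
    return total_frames * SAMPLES_PER_CD_FRAME


if __name__ == "__main__":
    # Two seconds into the file corresponds to 2 * 44,100 samples.
    print(cue_time_to_samples("00:02:00"))
```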

It may be possible to add gapless metadata to existing files. If the encoder is known, it is possible to guess the encoder delay. Also, if the compression was performed on CD audio, the original playback length will be an integer multiple of 588 samples, the size of one CD sector. Thus the total playback time can also be guessed. Adding such information to audio files will enable precise gapless playback in players that support this.
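The guess based on CD sector size can be sketched as follows. The sketch assumes the encoder delay is already known and that the end padding does not exceed some bound `max_padding`; both that bound and the numbers in the example are illustrative assumptions, not values taken from any specific encoder. The function name `candidate_lengths` is hypothetical.

```python
from typing import List

CD_SECTOR_SAMPLES = 588  # size of one CD sector in samples


def candidate_lengths(decoded_length: int, encoder_delay: int, max_padding: int) -> List[int]:
    """List original lengths that are whole CD sectors and fit the decoded file.

    After removing the delay, the remainder is original audio plus padding,
    so the original length lies between (remainder - max_padding) and the
    remainder, and must be a multiple of 588 samples.
    """
    upper = decoded_length - encoder_delay
    if upper < 0 or max_padding < 0:
        raise ValueError("inconsistent lengths")
    lower = max(upper - max_padding, 0)
    # Round `lower` up to the next multiple of the sector size.
    first = -(-lower // CD_SECTOR_SAMPLES) * CD_SECTOR_SAMPLES
    return list(range(first, upper + 1, CD_SECTOR_SAMPLES))


if __name__ == "__main__":
    # Illustrative numbers only: 10 blocks of 1152 samples, 576 samples of delay.
    print(candidate_lengths(10 * 1152, encoder_delay=576, max_padding=1151))
```

When the padding bound is wider than one CD sector, more than one candidate can remain, which is why the result is a guess rather than a certainty.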

Prerequisites

Format support

Since lossless data compression excludes the possibility of the introduction of padding, all lossless audio file formats are inherently gapless.

These lossy audio file formats have provisions for gapless encoding:

Some other formats do not officially support gapless encoding, but some implementations of encoders or decoders may handle gapless metadata.

Player support

Optimal solutions:

Hardware

  • Apple:
  • Archos Gmini XS202S
  • Cowon S9 supports gapless playback without software dependency since 2.31b firmware. Most newer Cowon players support gapless playback right out of the box (J3, X7, iAudio 9)
  • Linn Products DS network players
  • All players in the Logitech/Slim Devices Squeezebox range support gapless playback for all gapless formats (LAME MP3, FLAC, Vorbis, etc.). Crossfading is also optionally available.
  • Microsoft Zune supports gapless playback with Zune 2.5 or later firmware, though some bugs remain and occasionally small pops or skips can be heard. [11]
  • Panasonic RX-D55AEG-K, a portable radio recorder with CD player
  • Rio Karma gapless hardware player with no software dependency (FLAC, Ogg, MP3, WMA), first portable DAP with the feature [12]
  • Roberts Sound 48, a clock radio with CD player
  • Rockbox for various digital audio players.
  • Sony:
    • MiniDisc Walkman supports gapless playback (including non-Sony Walkman MiniDisc players)
    • CD Walkman (such as D-NE330) supports gapless playback of ATRAC-encoded CDs
    • VAIO Pocket supports gapless playback (through a firmware update) of ATRAC files
    • Network Walkman NW-HDx and NW-A (1x00, 3000, 60x, 80x) DAPs support gapless playback of ATRAC files. Walkman DAPs subsequently lost the feature when ATRAC support ceased, although it continued in Japan, where players still came with ATRAC. Gapless playback returned outside Japan 5 years later with the Walkman NWZ-F80x, through the FLAC format. [13]
  • Trekstor Vibes gapless hardware player with no software dependency
  • Victor Alneo V Series and C Series [14] [15]

Software

Alternative or partial solutions:

  • XMMS2 – has native support for gapless MP3 / Ogg Vorbis and FLAC

References

  1. Taylor, Mark (2003). "LAME Technical FAQ". Retrieved 2006-07-06.
  2. Robinson, David (2001). "lame v3.81 and 3.87 beta mp3 decoding quality test results". Retrieved 2006-08-24. Features a table of encoder delay values.
  3. Taimoor, Taimoor (2023-06-15). "Gapless Playback/Cross Fade".
  4. Hassan, Taimoor (2023-06-08). "What Is & How To Enable Gapless Playback Spotify 2023?". spotifmania.com. Retrieved 2023-06-19.
  5. "256734 – precise gapless playback". bugs.kde.org. Retrieved 7 December 2017.
  6. "Speex News". 2004-07-28. Retrieved 2008-04-25.
  7. "LAME Technical FAQ". June 2000. Retrieved 2012-01-28.
  8. "Guides and Sample Code". developer.apple.com. Retrieved 7 December 2017.
  9. "再生制御". www.project9k.jp. Retrieved 7 December 2017.
  10. "What is Gapless Playback?". Apple Inc. Archived from the original on 2008-05-08. Retrieved 2008-05-13.
  11. "Thread on gapless playback on Zune HD". 2010-02-25. Retrieved 2010-05-04.
  12. "Rio Karma 20Gb MP3 Player". 24 April 2004.
  13. "Sony NWZ-F806 Specification Guide - Page 1 of 4".
  14. Ittousai. "ビクターAlneo にギャップレス再生・AAC対応の新モデル - Engadget Japanese". Retrieved 7 December 2017.
  15. "【新製品レビュー】". av.watch.impress.co.jp. Retrieved 7 December 2017.
  16. "Thread on Gapless Playback on Amarok Mailing List". 2006-09-06. Retrieved 2007-01-19.
  17. "[Implemented] Gapless Playback". 23 December 2018.

Notes

  1. Despite this, there are encoders which store the amount of padding introduced in metadata to allow gapless playback. This can only be used if the playback software is able to interpret the metadata information.
  2. Vorbis and Speex feature gapless support through the Ogg layer. The reference implementation of Speex did not initially ship with gapless metadata support.