Digital Speech Standard

Digital Speech Standard (DSS) is a proprietary compressed digital audio file format defined by the International Voice Association, a co-operative venture by Olympus, Philips and Grundig Business Systems.

DSS was originally developed in 1994 by Grundig with the University of Nuremberg. In 1997, the Digital Speech Standard was released, based on that earlier codec. It is commonly used on digital dictation recorders. Modern psychoacoustic codecs that perform nearly as well at only slightly higher bit rates have led to DSS being used less often in modern voice recording equipment.

Operation

The DSS file format stores voice audio data in a highly compressed form that supports basic recorder functions (recording, playing, rewinding, etc.) as well as recording in either insert or overwrite mode, which makes it well suited to dictation. The file header can also carry additional information for the transcriptionist, including priority mark, author and job type.

DSS is a format designed specifically for speech, much as MP3 is designed for music. In contrast with MP3, however, the audio quality is usually set as low as possible to minimize the size of the file.

Related Research Articles

An audio file format is a file format for storing digital audio data on a computer system. The bit layout of the audio data is called the audio coding format and can be uncompressed, or compressed to reduce the file size, often using lossy compression. The data can be a raw bitstream in an audio coding format, but it is usually embedded in a container format or an audio data format with a defined storage layer.

A codec is a device or computer program which encodes or decodes a digital data stream or signal. Codec is a portmanteau of coder-decoder.

In signal processing, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.

MP3 is a coding format for digital audio developed largely by the Fraunhofer Society in Germany, with support from other digital scientists in the US and elsewhere. Originally defined as the third audio format of the MPEG-1 standard, it was retained and further extended — defining additional bit-rates and support for more audio channels — as the third audio format of the subsequent MPEG-2 standard. A third version, known as MPEG 2.5 — extended to better support lower bit rates — is commonly implemented, but is not a recognized standard.

Windows Media Audio (WMA) is a series of audio codecs and their corresponding audio coding formats developed by Microsoft. It is a proprietary technology that forms part of the Windows Media framework. WMA consists of four distinct codecs. The original WMA codec, known simply as WMA, was conceived as a competitor to the popular MP3 and RealAudio codecs. WMA Pro, a newer and more advanced codec, supports multichannel and high resolution audio. A lossless codec, WMA Lossless, compresses audio data without loss of audio fidelity. WMA Voice, targeted at voice content, applies compression using a range of low bit rates. Microsoft has also developed a digital container format called Advanced Systems Format to store audio encoded by WMA.

Digital audio

Digital audio is a representation of sound recorded in, or converted into, digital form. In digital audio, the sound wave of the audio signal is typically encoded as numerical samples in a continuous sequence. For example, in CD audio, samples are taken 44,100 times per second, each with 16-bit sample depth. Digital audio is also the name for the entire technology of sound recording and reproduction using audio signals that have been encoded in digital form. Following significant advances in digital audio technology during the 1970s and 1980s, it gradually replaced analog audio technology in many areas of audio engineering and telecommunications in the 1990s and 2000s.
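The CD audio figures above can be turned into a bit rate with simple arithmetic. The following is a minimal sketch; the function name `pcm_bit_rate` is hypothetical, and the channel count of 2 (stereo) is an assumption added here, since the text gives only the sample rate and sample depth.

```python
def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Return the bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


# 44,100 samples per second at 16 bits per sample (from the text),
# two channels (assumed: stereo).
cd_bit_rate = pcm_bit_rate(44_100, 16, 2)
print(cd_bit_rate)  # 1411200 bit/s, i.e. about 1.4 Mbit/s
```

Speech-oriented formats target bit rates far below this uncompressed figure.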

MPEG-1 Audio Layer II or MPEG-2 Audio Layer II is a lossy audio compression format defined by ISO/IEC 11172-3 alongside MPEG-1 Audio Layer I and MPEG-1 Audio Layer III (MP3). While MP3 is much more popular for PC and Internet applications, MP2 remains a dominant standard for audio broadcasting.

In telecommunications and computing, bit rate is the number of bits that are conveyed or processed per unit of time.

Hi-MD: MiniDisc-based magneto-optical media data storage format

In January 2004, Sony announced the Hi-MD media storage format as a further development of the MiniDisc format. Its release later in 2004 brought the ability to use newly developed, high-capacity 1 gigabyte Hi-MD discs with the same dimensions as regular MiniDiscs. The Hi-MD format can be considered obsolete, as the last recorder/player was discontinued in 2011. The discs themselves were withdrawn from sale in September 2012, though regular MiniDiscs are still available.

The Adaptive Multi-Rate (AMR) audio codec is an audio compression format optimized for speech coding. The AMR speech codec is a multi-rate narrowband speech codec that encodes narrowband (200–3400 Hz) signals at variable bit rates ranging from 4.75 to 12.2 kbit/s, with toll-quality speech starting at 7.4 kbit/s.
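To give a sense of scale for these AMR bit rates, the sketch below estimates the storage needed for one minute of speech. The helper name `bytes_per_minute` is hypothetical, and the estimate is an assumption-laden simplification: it counts only the encoded speech bits (1 kbit = 1000 bits) and ignores any file header or framing overhead.

```python
def bytes_per_minute(bit_rate_bps: int) -> int:
    """Approximate payload size of 60 seconds of audio, in bytes."""
    return bit_rate_bps * 60 // 8


# Lowest rate, toll-quality threshold, and highest rate quoted in the text.
for rate_bps in (4_750, 7_400, 12_200):
    print(rate_bps, "bit/s ->", bytes_per_minute(rate_bps), "bytes per minute")
```

Even at the highest rate, a minute of speech comes to under 100 kilobytes of payload by this estimate.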

Rockbox

Rockbox is a free and open-source software replacement for the OEM firmware in various forms of digital audio players (DAPs) with an original kernel. It offers an alternative to the player's operating system, in many cases without removing the original firmware, which provides a plug-in architecture for adding various enhancements and functions. Enhancements include personal digital assistant (PDA) functions, applications, utilities, and games. Rockbox can also retrofit video playback functions on players first released in mid-2000. Rockbox includes a voice-driven user-interface suitable for operation by visually impaired users.

MacSpeech, Inc. was a New Hampshire-based technology company that produced software-based speech recognition and voice dictation solutions for the Apple ecosystem. The company's products included iListen, MacSpeech Dictate, MacSpeech Dictate Medical, MacSpeech Dictate Legal, MacSpeech Dictate International, and MacSpeech Scribe. On February 12, 2010, Nuance Communications, Inc. acquired MacSpeech.

Mini-Cassette: Audio cassette format

The Mini-Cassette, often written minicassette, is a magnetic tape audio cassette format introduced by Philips in 1967. It is used primarily in dictation machines and was also employed as data storage for the Philips P2000 home computer. Unlike the Compact Cassette, also designed by Philips, and the later Microcassette, introduced by Olympus, the Mini-Cassette does not use a capstan drive system; instead, the tape is propelled past the tape head by the reels. This is mechanically simple and allows the cassette to be made smaller and easier to use, but produces a system unsuited to any task other than voice recording, as the tape speed is not constant and is prone to wow and flutter. However, the lack of a capstan and pinch roller drive means that the tape is better suited than a microcassette to being repeatedly shuttled forward and backward over short distances. This led to the Mini-Cassette's use in the first generations of telephone answering machines, and to its continuing use in the niche markets of dictation and transcription, where fidelity is not critical but robustness of storage is, and where analog media remained in use long after digital media had been introduced.

MPEG-4 Part 14 (MP4): digital format for storing video and audio

MPEG-4 Part 14 or MP4 is a digital multimedia container format most commonly used to store video and audio, but it can also be used to store other data such as subtitles and still images. Like most modern container formats, it allows streaming over the Internet. The only filename extension for MPEG-4 Part 14 files as defined by the specification is .mp4. MPEG-4 Part 14 is a standard specified as a part of MPEG-4.

Dictation machine: Sound recording device most commonly used to record speech for later playback or to be typed into print

A dictation machine is a sound recording device most commonly used to record speech for playback or to be typed into print. The term covers digital voice recorders and tape recorders.

Adaptive differential pulse-code modulation (ADPCM) is a variant of differential pulse-code modulation (DPCM) that varies the size of the quantization step, to allow further reduction of the required data bandwidth for a given signal-to-noise ratio.
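The idea of varying the quantization step can be illustrated with a toy encoder and decoder. This is a simplified sketch for illustration only, not the IMA or ITU-T ADPCM algorithms: the 4-bit code range, the adaptation thresholds, the step limits and all function names are assumptions made here.

```python
def _adapt(step: int, code: int) -> int:
    """Grow the step after large codes, shrink it after small ones (assumed rule)."""
    if abs(code) >= 6:
        return min(step * 2, 4096)
    if abs(code) <= 1:
        return max(step // 2, 1)
    return step


def encode(samples, step=16):
    """Encode integer samples as 4-bit signed codes in the range -8..7."""
    predicted, codes = 0, []
    for sample in samples:
        diff = sample - predicted                 # differential part (DPCM)
        code = max(-8, min(7, round(diff / step)))  # quantize with current step
        codes.append(code)
        predicted += code * step                  # track what the decoder will see
        step = _adapt(step, code)                 # adaptive part
    return codes


def decode(codes, step=16):
    """Rebuild an approximation of the signal from the codes."""
    predicted, out = 0, []
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code)                 # same rule keeps decoder in sync
    return out


signal = [0, 10, 40, 200, 1000, 1200, 1210, 1205]
print(decode(encode(signal)))
```

Because the decoder applies the same adaptation rule to the same codes, the step size never has to be transmitted; only the small codes are stored.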

Grundig Business Systems (GBS) is a German company with locations in Bayreuth and Nuremberg that employs 170 people. Since 2001, it has been an independent corporation that manufactures analogue and digital dictation devices featuring the "Made in Germany" quality label.

Speech Processing Solutions

Speech Processing Solutions is an international electronics company headquartered in Vienna, Austria. The company designs, develops, manufactures and markets speech processing devices, such as those used in digital dictation and speech recognition. Speech Processing Solutions was formed on 1 July 2012. Philips Speech Processing was part of the Philips Consumer Lifestyle sector. Speech Processing Solutions is now an official licensee of the Philips brand. The company has subsidiaries in the US, Canada, Australia, the United Kingdom, Belgium, France and Germany, and employs around 170 people worldwide.

Audio coding format: Digitally coded format for audio signals

An audio coding format is a content representation format for storage or transmission of digital audio. Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus. A specific software or hardware implementation capable of audio compression and decompression to/from a specific audio coding format is called an audio codec; an example of an audio codec is LAME, one of several codecs that implement encoding and decoding of audio in the MP3 audio coding format in software.