Harvard sentences

The Harvard sentences, or Harvard lines, [1] are a collection of 720 sample phrases, divided into 72 lists of 10, used for standardized testing of Voice over IP, cellular, and other telephone systems. They are phonetically balanced: they use specific phonemes at the same frequency at which those phonemes appear in English.

IEEE Recommended Practice for Speech Quality Measurements [3] sets out seventy-two lists of ten phrases each, described as the "1965 Revised List of Phonetically Balanced Sentences (Harvard Sentences)." They are widely used in research on telecommunications, speech, and acoustics, where standardized and repeatable sequences of speech are needed. The Open Speech Repository [4] provides some freely usable, prerecorded WAV files of Harvard Sentences in American and British English, in male and female voices.
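Before a prerecorded WAV file is used in a repeatable test, it can help to confirm its basic format. The following is a minimal sketch using Python's standard `wave` module. The helper names `describe_wav` and `write_silence`, the demo file name, and the 8000 Hz rate are all illustrative assumptions, not properties of the Open Speech Repository files; the silent file only stands in for a real recording.

```python
import os
import tempfile
import wave


def describe_wav(path: str) -> dict:
    """Return basic format facts for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": wav.getnframes() / rate,
        }


def write_silence(path: str, seconds: float, rate: int = 8000) -> None:
    """Write mono 16-bit silence (an assumed stand-in for a real recording)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))


# Hypothetical file name; substitute the path of a downloaded recording.
demo_path = os.path.join(tempfile.gettempdir(), "harvard_list1_demo.wav")
write_silence(demo_path, seconds=1.0)
info = describe_wav(demo_path)
print(info)
```

Pointing `describe_wav` at a real recording instead of the generated file reports that file's channel count, sample width, sample rate, and duration.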

Harvard lines are also used to observe how an actor's mouth moves while speaking, which helps in creating more realistic CGI models. [1]

Sample Harvard sentences

The first three lists are as follows: [5]

List 1

  1. The birch canoe slid on the smooth planks.
  2. Glue the sheet to the dark blue background.
  3. It's easy to tell the depth of a well.
  4. These days a chicken leg is a rare dish.
  5. Rice is often served in round bowls.
  6. The juice of lemons makes fine punch.
  7. The box was thrown beside the parked truck.
  8. The hogs were fed chopped corn and garbage.
  9. Four hours of steady work faced us.
  10. A large size in stockings is hard to sell.

List 2

  1. The boy was there when the sun rose.
  2. A rod is used to catch pink salmon.
  3. The source of the huge river is the clear spring.
  4. Kick the ball straight and follow through.
  5. Help the woman get back to her feet.
  6. A pot of tea helps to pass the evening.
  7. Smoky fires lack flame and heat.
  8. The soft cushion broke the man's fall.
  9. The salt breeze came across from the sea.
  10. The girl at the booth sold fifty bonds.

List 3

  1. The small pup gnawed a hole in the sock.
  2. The fish twisted and turned on the bent hook.
  3. Press the pants and sew a button on the vest.
  4. The swan dive was far short of perfect.
  5. The beauty of the view stunned the young boy.
  6. Two blue fish swam in the tank.
  7. Her purse was full of useless trash.
  8. The colt reared and threw the tall rider.
  9. It snowed, rained, and hailed the same morning.
  10. Read verse out loud for pleasure.
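The list structure described above (consecutive groups of ten sentences, addressed by list number and sentence number) can be sketched in a few lines of Python. This is an illustrative arrangement only: the function names `split_into_lists` and `sentence_at` are hypothetical, and the sample data covers just Lists 1 and 2 as quoted here.

```python
from typing import List, Sequence


def split_into_lists(sentences: Sequence[str], list_size: int = 10) -> List[List[str]]:
    """Group a flat sequence of sentences into consecutive lists of list_size."""
    if len(sentences) % list_size != 0:
        raise ValueError("sentence count must be a multiple of list_size")
    return [
        list(sentences[start:start + list_size])
        for start in range(0, len(sentences), list_size)
    ]


def sentence_at(lists: List[List[str]], list_number: int, sentence_number: int) -> str:
    """Look up a sentence by 1-based list number and sentence number."""
    return lists[list_number - 1][sentence_number - 1]


# Lists 1 and 2 from the article, flattened in order.
SAMPLE = [
    "The birch canoe slid on the smooth planks.",
    "Glue the sheet to the dark blue background.",
    "It's easy to tell the depth of a well.",
    "These days a chicken leg is a rare dish.",
    "Rice is often served in round bowls.",
    "The juice of lemons makes fine punch.",
    "The box was thrown beside the parked truck.",
    "The hogs were fed chopped corn and garbage.",
    "Four hours of steady work faced us.",
    "A large size in stockings is hard to sell.",
    "The boy was there when the sun rose.",
    "A rod is used to catch pink salmon.",
    "The source of the huge river is the clear spring.",
    "Kick the ball straight and follow through.",
    "Help the woman get back to her feet.",
    "A pot of tea helps to pass the evening.",
    "Smoky fires lack flame and heat.",
    "The soft cushion broke the man's fall.",
    "The salt breeze came across from the sea.",
    "The girl at the booth sold fifty bonds.",
]

lists = split_into_lists(SAMPLE)
print(len(lists))                # 2
print(sentence_at(lists, 2, 4))  # Kick the ball straight and follow through.
```

Applied to the full set of 720 sentences, the same split yields 72 lists of ten.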

References

  1. "Why it’s so hard to make CGI skin look real" (at 7m13s). Vox. 3 August 2021. Archived at Ghostarchive.org on 4 May 2022.
  2. "Examples – Opus Codec". Opus Interactive Audio Codec. IETF / Opus working group. Retrieved 21 December 2024.
  3. "IEEE Recommended Practice for Speech Quality Measurements". IEEE Transactions on Audio and Electroacoustics. 17 (3): 225–246. September 1969. doi:10.1109/TAU.1969.1162058. Retrieved 5 January 2012.
  4. "The Open Speech Repository". Retrieved 5 January 2012.
  5. "Harvard Sentences". www.cs.columbia.edu. Archived from the original on 24 February 2022. Retrieved 4 March 2022.