Electroglottographic wavegram

An electroglottographic wavegram (EGG wavegram for short) is a tool for analyzing the voice source in speech and singing. It is based on electroglottographic (EGG) signals and their first derivative (DEGG). [1]

Assessing the singing and speaking voice

Illustration: How to construct a wavegram

The wavegram, invented by Christian T. Herbst, provides an intuitive means of quickly assessing vocal fold contact phenomena and their variation over time. Vocal fold closings and openings appear here not as single incidents but as sequences of events that take place over a certain period of time and change with pitch, loudness and register. Wavegrams document systematic phenomena, indicating subtle changes of the vocal fold oscillatory regime.

Electroglottographic wavegrams are created in five steps (see illustration):

  1. consecutive glottal cycles are extracted from the EGG signal;
  2. locally normalized data values are converted into monochrome color information and plotted as strips, each representing one glottal cycle;
  3. the strips are rotated 90 degrees counter-clockwise;
  4. glottal cycle duration is normalized by scaling the individual glottal cycle plots to the same height;
  5. the resulting graphs are combined to form the final display, the EGG wavegram.
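
The five steps can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the implementation from the cited work: the function name `build_wavegram`, the `height` parameter and the use of linear interpolation for duration scaling are all hypothetical choices, and the cycle boundaries are assumed to be already known.

```python
import numpy as np

def build_wavegram(egg, cycle_starts, height=100):
    """Stack glottal cycles into a 2-D array (rows: phase in cycle, columns: cycles).

    egg          : 1-D sequence of EGG (or DEGG) samples
    cycle_starts : strictly increasing sample indices marking cycle starts
                   (assumed to be given; cycle detection is not shown here)
    height       : number of rows every cycle is resampled to (assumed value)
    """
    egg = np.asarray(egg, dtype=float)
    if len(cycle_starts) < 2:
        raise ValueError("need at least two cycle starts to form one cycle")
    strips = []
    for start, stop in zip(cycle_starts[:-1], cycle_starts[1:]):
        cycle = egg[start:stop]                  # step 1: extract one glottal cycle
        lo, hi = cycle.min(), cycle.max()
        span = hi - lo
        # step 2: local amplitude normalization to [0, 1], i.e. a grey level
        grey = (cycle - lo) / span if span > 0 else np.zeros_like(cycle)
        # step 4: duration normalization, here by linear resampling to `height` rows
        phase_in = np.linspace(0.0, 1.0, num=len(cycle))
        phase_out = np.linspace(0.0, 1.0, num=height)
        strips.append(np.interp(phase_out, phase_in, grey))
    # steps 3 and 5: each cycle becomes a vertical strip; strips are set side by side
    return np.column_stack(strips)
```

The returned array has one column per glottal cycle, with row 0 holding the start of each cycle. It could be displayed, for example, with matplotlib's `imshow` using `cmap="gray"`, `origin="lower"` and `aspect="auto"`, so that time runs from left to right and the cycle start sits at the bottom, as it does after the counter-clockwise rotation in step 3.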

Wavegram data can be influenced by:

  1. the anatomical baseline of the individual;
  2. physiological and habitual muscular patterns in phonation, e.g. degree of vocal fold adduction, register in singing and speech;
  3. organic voice disorders, i.e. pathological deviations from the norm.

Wavegrams show a potential to be used in:

To construct a wavegram, the time-varying fundamental frequency is calculated and consecutive individual glottal cycles are identified in the EGG or DEGG signal. Each cycle is locally normalized in duration and amplitude, the signal values are encoded by color intensity and the cycles are concatenated to display the entire phonation in a single image, much as in sound spectrography.
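
As a rough sketch of the cycle identification described above, the following Python snippet marks each positive DEGG peak as the start of a cycle and derives a per-cycle fundamental frequency from the cycle durations. This peak-picking heuristic is an assumption made for illustration and is not necessarily the procedure used in the cited work; the function names and the `rel_threshold` parameter are hypothetical. Real EGG recordings are noisy and would usually need filtering and a more robust detector.

```python
import numpy as np

def find_cycle_starts(egg, rel_threshold=0.5):
    """Return indices of strong positive DEGG peaks, taken here as cycle starts."""
    egg = np.asarray(egg, dtype=float)
    degg = np.diff(egg)                    # first derivative of the EGG signal (DEGG)
    mid = degg[1:-1]
    # a peak is larger than its left neighbour, not smaller than its right one,
    # and exceeds a fraction of the largest DEGG value (assumed threshold)
    is_peak = (mid > degg[:-2]) & (mid >= degg[2:]) & (mid > rel_threshold * degg.max())
    return np.flatnonzero(is_peak) + 1     # +1 maps back to indices of degg

def cycle_f0(cycle_starts, sample_rate):
    """Fundamental frequency in Hz of each cycle, from its duration in samples."""
    return sample_rate / np.diff(cycle_starts)
```

The indices returned by `find_cycle_starts` delimit consecutive glottal cycles, and `cycle_f0` gives the time-varying fundamental frequency with one value per cycle.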

The idea of wavegrams can be extended to displaying other data, such as acoustic signals or high-speed video endoscopic recordings of the vibrating vocal folds. [3] [4]

References

  1. Herbst, C. T.; Fitch, W. T.; Svec, J. G. (2010). "Electroglottographic wavegrams: a technique for visualizing vocal fold dynamics noninvasively" (PDF). J Acoust Soc Am. 128 (5): 3070–3078. doi:10.1121/1.3493423. PMID 21110602. S2CID 6208240. Archived from the original (PDF) on 2019-12-31.
  2. Herbst, C. T.; Fitch, W. T.; Schlömicher-Thier, J.; Svec, J. G. (2011). "Observing the female middle register using EGG wavegrams". 9th Pan-European Voice Conference (PEVOC). Marseilles, France.
  3. Manfredi, Claudia (2011). Models and analysis of vocal emissions for biomedical applications: 7th international workshop; August 25–27, 2011. Firenze University Press. ISBN 978-88-6655-011-2.
  4. Unger, J.; Meyer, T.; Herbst, C. T.; Döllinger, M.; Lohscheller, J. (2011). "PVG-Wavegramm: Dreidimensionale Visualisierung von Stimmlippendynamik". 28. Wissenschaftliche Jahrestagung der Deutschen Gesellschaft für Phoniatrie und Pädaudiologie e. V. Zurich, Switzerland. http://www.egms.de/static/en/meetings/dgpp2011/11dgpp40.shtml