INTSINT

Last updated October 28, 2024

INTSINT is an acronym for INternational Transcription System for INTonation.

Description

INTSINT codes the intonation of an utterance by means of an alphabet of 8 discrete symbols constituting a surface phonological representation of the intonation:

T (Top), H (Higher), U (Upstepped), S (Same), M (Mid), D (Downstepped), L (Lower), B (Bottom).

These tonal symbols are considered phonological in that they represent discrete categories, and surface in that each tonal symbol corresponds to a directly observable property of the speech signal.

Tonal alignment

The tones can be aligned with phonological constituents by means of one of the following alignment diacritics placed after the tonal symbol:

[ (initial), < (early), : (medial), > (late), ] (final)

The phonological constituent with which the tonal segments are aligned is taken to be the sequence of symbols between the following pair of slashes, /.../.

The following is an example of a transcription, using the IPA (International Phonetic Alphabet), of a possible reading of the sentence "It's time to go":

M:/ɪts/T:/taɪmtə/D<B]/ɡoʊ/

This corresponds to a Mid tone aligned with the middle of the syllable "It's", then a Top tone aligned with the middle of the unit "time to", then a Downstepped tone aligned early in the syllable "go", and finally a Bottom tone aligned with the end of the same syllable.
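
The notation above can be read mechanically. The following is a minimal sketch in Python, assuming a grammar inferred from the single example given here: one or more tone-plus-diacritic pairs followed by a constituent enclosed in slashes. The function name parse_intsint and the returned structure are illustrative only and are not part of INTSINT itself.

```python
import re

# Alignment diacritics listed above, mapped to their descriptions.
ALIGNMENTS = {"[": "initial", "<": "early", ":": "medial", ">": "late", "]": "final"}

# Assumed grammar: one or more (tone, diacritic) pairs, then /constituent/.
UNIT = re.compile(r"((?:[THUSMDLB][\[<:>\]])+)/([^/]*)/")


def parse_intsint(transcription):
    """Return a list of (constituent, [(tone, alignment), ...]) tuples."""
    units = []
    pos = 0
    for match in UNIT.finditer(transcription):
        if match.start() != pos:
            # Something between two units did not fit the assumed grammar.
            raise ValueError(f"unexpected text at position {pos}")
        tones, constituent = match.groups()
        # The tone string alternates tone symbol and alignment diacritic.
        pairs = [(tones[i], ALIGNMENTS[tones[i + 1]])
                 for i in range(0, len(tones), 2)]
        units.append((constituent, pairs))
        pos = match.end()
    if pos != len(transcription):
        raise ValueError(f"unexpected text at position {pos}")
    return units


parsed = parse_intsint("M:/ɪts/T:/taɪmtə/D<B]/ɡoʊ/")
for constituent, pairs in parsed:
    print(constituent, pairs)
```

For the example transcription, this yields three constituents, the last of which carries two tones (D aligned early and B aligned final).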

Phonetic interpretation

The phonetic interpretation of the INTSINT tonal segments can be carried out using two speaker-dependent (or even utterance-dependent) parameters:

key: like a musical key, this establishes an absolute point of reference defined by a fundamental frequency value (in hertz).
range: this determines the interval between the highest and lowest pitches of the utterance.

In the current algorithm (Hirst 2004, 2005) the tonal segments can be converted to target points, like those generated by the Momel algorithm, using the following equivalences. In the formulae below, P(i) refers to the current pitch target and P(i-1) to the preceding pitch target. Pitch targets are normally calculated on a logarithmic scale.

The targets T, M and B are defined 'absolutely', without regard to the preceding targets:

T: P(i) := key + range/2
M: P(i) := key
B: P(i) := key - range/2

Other targets are defined with respect to the preceding target:

H: P(i) := (P(i-1) + T) / 2
U: P(i) := (3*P(i-1) + T) / 4
S: P(i) := P(i-1)
D: P(i) := (3*P(i-1) + B) / 4
L: P(i) := (P(i-1) + B) / 2

Example

A sequence of tonal targets such as:

[M T L H L H D B]

assuming, for a female speaker, a key of 240 Hz and a range of 1 octave, would be converted to the following F0 targets (in Hz):

[240 340 240 286 220 273 242 170]
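
The conversion rules above can be sketched in Python as follows. This is an illustrative sketch, not the published implementation. It assumes that targets are computed in octaves (log2 of the frequency in hertz), that the range is given in octaves, and that a relative tone with no predecessor is measured from the key. The function name intsint_to_hz is hypothetical.

```python
import math


def intsint_to_hz(tones, key_hz, range_octaves):
    """Convert a sequence of INTSINT tone symbols to F0 targets in Hz."""
    key = math.log2(key_hz)            # work on a logarithmic (octave) scale
    top = key + range_octaves / 2      # T
    bottom = key - range_octaves / 2   # B
    absolute = {"T": top, "M": key, "B": bottom}

    targets = []
    prev = key  # assumption: the first relative tone is measured from the key
    for tone in tones:
        if tone in absolute:
            p = absolute[tone]
        elif tone == "H":
            p = (prev + top) / 2
        elif tone == "U":
            p = (3 * prev + top) / 4
        elif tone == "S":
            p = prev
        elif tone == "D":
            p = (3 * prev + bottom) / 4
        elif tone == "L":
            p = (prev + bottom) / 2
        else:
            raise ValueError(f"unknown INTSINT tone: {tone!r}")
        targets.append(2 ** p)         # back to hertz
        prev = p
    return targets


targets = intsint_to_hz("MTLHLHDB", key_hz=240, range_octaves=1)
print([round(f, 1) for f in targets])
```

Under these assumptions the computed targets fall within about 1 Hz of the values listed above; the small differences presumably come from rounding.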

An interesting consequence of this model is that it automatically introduces an asymptotic lowering of sequences such as H L H..., without the need for a specific downdrift or declination component. Lowering of this kind has often been described both for languages with lexical tone and for languages where tone is only introduced by the intonation system.
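
The lowering can be observed by iterating the H and L formulae alone. The sketch below is illustrative: it expresses targets in octaves relative to the key, assumes a range of 1 octave (so T = 0.5 and B = -0.5), and starts the sequence from T. The function name alternate_h_l is hypothetical.

```python
def alternate_h_l(pairs, top=0.5, bottom=-0.5, start=0.5):
    """Return (H, L) targets, in octaves relative to the key, for H L H L ..."""
    prev = start
    result = []
    for _ in range(pairs):
        h = (prev + top) / 2      # H: halfway between previous target and T
        low = (h + bottom) / 2    # L: halfway between previous target and B
        result.append((h, low))
        prev = low
    return result


sequence = alternate_h_l(10)
for h, low in sequence:
    print(f"H = {h:+.4f}   L = {low:+.4f}")
```

Each successive H is lower than the one before, and the size of the step shrinks by a factor of four each time. Solving the two formulae for a fixed point gives H = (2T + B)/3 and L = (T + 2B)/3, that is, +1/6 and -1/6 of an octave with the values assumed here.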

The particular coefficients used for calculating D and U were chosen so that, in a sequence [T D] for example, the D tone is lowered by the same amount as the H tone in the sequence [T L H]. In many phonological accounts, a Downstepped tone is analysed as a High tone that is lowered by the presence of a "floating" Low tone, so that the surface tone [D] can be considered as underlyingly [L H].
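
The equivalence can be checked directly from the formulae given earlier. In the worked derivation below, T and B stand for the Top and Bottom targets on the logarithmic scale, and the subscripted P names each target; the subscript notation is introduced here for illustration only.

```latex
\begin{align*}
[T\ D]:\quad     & P_D = \frac{3T + B}{4} \\[4pt]
[T\ L\ H]:\quad  & P_L = \frac{T + B}{2}, \qquad
                   P_H = \frac{P_L + T}{2}
                       = \frac{\tfrac{T + B}{2} + T}{2}
                       = \frac{3T + B}{4}
\end{align*}
```

Both sequences therefore end on the same target. The equality relies on the preceding target being T: after an arbitrary target P, D gives (3P + B)/4, whereas L followed by H gives (P + B + 2T)/4.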

References

Hirst, D.J. & Di Cristo, A. (eds) 1998. Intonation Systems: A Survey of Twenty Languages. Cambridge: Cambridge University Press. ISBN 0-521-39513-5 (hardback); 0-521-39550-X (paperback).
Hirst, D.J. 2004. Lexical and non-lexical tone and prosodic typology. In Proceedings of the International Symposium on Tonal Aspects of Languages, Beijing, March 2004, 81-88.
Hirst, D.J. 2005. Form and function in the representation of speech prosody. In K. Hirose, D.J. Hirst & Y. Sagisaka (eds), Quantitative prosody modeling for natural speech description and generation (= Speech Communication 46 (3-4)), 334-347.

External links

INTSINT automatic annotation can be performed with SPPAS (Automatic Annotation of Speech).
