In theoretical linguistics, underspecification is an analytic strategy in which a linguistic representation omits the value of one or more features, leaving those values to be supplied by general principles (e.g. redundancy rules, harmony, default inheritance, or constraint interaction). [1] [2] The term is used in phonology and phonetics for segmental and suprasegmental features, [1] [2] in morphology for partially specified feature bundles in inflection and syncretism (often in analyses of exponent choice), [3] [4] and in computational semantics and natural-language processing for representing ambiguities (especially scope) without committing to a fully resolved reading (“scope underspecification”). [5]
In phonology, underspecification is often used to distinguish contrastive from predictable feature values in underlying representations. Predictable values may be supplied by redundancy/defaults or by processes such as assimilation and harmony, helping capture patterns over natural classes without listing every feature value in the lexicon. [1] [6]
Restricted underspecification holds that features should be underspecified only when their values are predictable from general principles (i.e. redundant or non-contrastive in a given context). [1] For example, in many analyses of English vowel systems, lip rounding can be treated as redundant for (most) front vowels, so a representation may omit an explicit [−round] specification for those segments and derive unroundedness by redundancy. [7]
Radical underspecification allows traditionally binary features to be lexically specified for only one value (often the marked value), with the opposite value supplied by default when no specification is present. [6] A frequently discussed illustration involves laryngeal contrast: instead of [+voice] and [−voice], a system may specify only [+voice], treating voicelessness as the default for segments lacking a voicing specification. [6]
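The default-filling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-segment lexicon, the `DEFAULTS` table, and the function name `fill_defaults` are hypothetical and are not taken from any cited analysis.

```python
from typing import Dict

# Hypothetical lexical entries: only the marked value [+voice] is stored.
LEXICON: Dict[str, Dict[str, str]] = {
    "b": {"voice": "+"},  # lexically specified as [+voice]
    "p": {},              # no voicing specification in the lexicon
}

# Assumed default rule: a segment with no [voice] value surfaces as [-voice].
DEFAULTS: Dict[str, str] = {"voice": "-"}


def fill_defaults(segment: Dict[str, str]) -> Dict[str, str]:
    """Return a fully specified copy; lexical values override defaults."""
    return {**DEFAULTS, **segment}
```

Here `fill_defaults(LEXICON["p"])` supplies the default voiceless value, while the lexical [+voice] of `LEXICON["b"]` is left unchanged.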
In descriptions of Tuvan vowel harmony, a high vowel in certain suffixes alternates in backness and rounding according to its harmonic environment. In one analysis, the suffix vowel is represented as a high vowel unspecified for backness and rounding in the underlying representation, with those values supplied by harmony. [8]
| vowel (UR / surface) | height | backness | roundedness |
|---|---|---|---|
| /I/ | high | — | — |
| [i] | high | front | unrounded |
| [y] | high | front | rounded |
| [ɯ] | high | back | unrounded |
| [u] | high | back | rounded |
In this analysis, the same underlying high vowel may surface as [i], [y], [ɯ], or [u], depending on the harmony class of adjacent vowels. [8]
Harrison represents the 3rd-person possessive suffix as /-(z)I/, which surfaces with different backness and rounding values in different harmonic environments (examples and transcription conventions follow the source): [8]
| underlying form | → | surface form | gloss |
|---|---|---|---|
| -(z)I | | | '3' (third-person possessive) |
| o˚g-(z)I | → | o˚g-u˚ | 'glottis-3' |
| xol-(z)I | → | xol-u | 'hand-3' |
| suur-(z)I | → | suur-u | 'village-3' |
| er-(z)I | → | er-i | 'man-3' |
| xar-(z)I | → | xar-Ï | 'snow-3' |
| ava-(z)I | → | ava-zÏ | 'mother-3' |
These forms illustrate an underspecification analysis of suffix harmony in which a single morphological suffix is associated with an underlying vowel lacking backness and rounding specifications, while the surface values are determined by vowel harmony. [8]
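The feature-filling logic of this kind of analysis can be sketched as follows. The sketch is deliberately simplified and rests on assumptions: the harmonic trigger is passed in by the caller as a pair of feature values, the consonant alternation in the suffix is ignored, and the names `UNDERLYING_I`, `apply_harmony`, and `spell_out` are hypothetical. It is not a model of Tuvan phonology as a whole.

```python
from typing import Dict, Optional, Tuple

Features = Dict[str, Optional[str]]

# Surface high vowels from the feature table above, keyed by (backness, roundedness).
HIGH_VOWELS: Dict[Tuple[str, str], str] = {
    ("front", "unrounded"): "i",
    ("front", "rounded"): "y",
    ("back", "unrounded"): "ɯ",
    ("back", "rounded"): "u",
}

# Underlying /I/: specified for height only; None marks an unspecified feature.
UNDERLYING_I: Features = {"height": "high", "backness": None, "roundedness": None}


def apply_harmony(vowel: Features, trigger: Features) -> Features:
    """Fill only the unspecified harmonic features of `vowel` from `trigger`."""
    filled = dict(vowel)
    for feature in ("backness", "roundedness"):
        if filled[feature] is None:
            filled[feature] = trigger[feature]
    return filled


def spell_out(vowel: Features) -> str:
    """Map a fully specified high vowel to its surface symbol."""
    return HIGH_VOWELS[(vowel["backness"], vowel["roundedness"])]
```

With a back rounded trigger, `spell_out(apply_harmony(UNDERLYING_I, {"backness": "back", "roundedness": "rounded"}))` returns `"u"`; a front unrounded trigger yields `"i"`. Because harmony only fills features that are still `None`, a vowel that is already specified would pass through unchanged.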
In morphology, underspecification is used in analyses where a morpheme is not specified for the full set of morphosyntactic features that characterize the environments in which it can occur. Such partially specified morphemes are therefore compatible with multiple contexts, and the approach is often used to account for syncretism and default realization patterns. [3] [4] For example, some analyses of English treat bound-variable singular they as lacking a fixed gender specification in contexts where it can be used with a wide range of antecedents. [9]
A common illustration in inflectional morphology comes from syncretic paradigms in languages such as German. German determiners, adjectives, and nouns show extensive syncretism across case, number, and gender, and the definite-article paradigm illustrates this pattern: [10]
| Case | Masc. sg. | Fem. sg. | Neut. sg. | Plural |
|---|---|---|---|---|
| Nominative | der | die | das | die |
| Genitive | des | der | des | der |
| Dative | dem | der | dem | den |
| Accusative | den | die | das | die |
The paradigm exhibits syncretism, with the same surface form realizing multiple case–number–gender combinations: die fills the nominative and accusative cells of both the feminine singular and the plural, and der occurs in the nominative masculine, the genitive and dative feminine, and the genitive plural. [10]
Some underspecification-based accounts propose decomposing morphosyntactic categories into smaller binary features, so that rules or morphological exponents can refer to natural classes involved in agreement and syncretism (for example, “non-feminine”) rather than to individual genders. [4] [11] In one proposal, German gender is represented using the binary features [±masc] and [±fem], with neuter analyzed as [−masc, −fem] and the combination [+masc, +fem] not used for any gender in the system described. [11]
| Gender | Feature specification |
|---|---|
| Masculine | [+masc, −fem] |
| Feminine | [−masc, +fem] |
| Neuter | [−masc, −fem] |
| (unused combination) | [+masc, +fem] |
In such representations, masculine and neuter share the feature [−fem]; this shared specification has been used in analyses of syncretic exponents that are compatible with both genders without being fully specified for gender. [4]
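A minimal sketch of how an underspecified exponent can match more than one gender is given below. The gender decomposition follows the table above; everything else is an assumption made for illustration. In particular, the two dative-singular entries and the "most specific compatible entry wins" selection rule are hypothetical and are not presented as the analysis of the cited sources.

```python
from typing import Dict, List, Tuple

Spec = Dict[str, str]

# Gender decomposition from the table above ("-" stands for the minus value).
GENDERS: Dict[str, Spec] = {
    "masculine": {"masc": "+", "fem": "-"},
    "feminine": {"masc": "-", "fem": "+"},
    "neuter": {"masc": "-", "fem": "-"},
}

# Hypothetical entries for the dative singular: one specified only as [-fem],
# and one with no gender features at all, acting as the elsewhere form.
DATIVE_SG_ENTRIES: List[Tuple[Spec, str]] = [
    ({"fem": "-"}, "dem"),
    ({}, "der"),
]


def compatible(exponent_spec: Spec, context: Spec) -> bool:
    """An exponent fits if every feature it specifies matches the context."""
    return all(context.get(f) == v for f, v in exponent_spec.items())


def select_exponent(context: Spec, entries: List[Tuple[Spec, str]]) -> str:
    """Pick the compatible entry that specifies the most features."""
    candidates = [(spec, form) for spec, form in entries if compatible(spec, context)]
    _, form = max(candidates, key=lambda entry: len(entry[0]))
    return form
```

Under these assumptions the [−fem] entry is selected for both masculine and neuter contexts, and the unspecified entry surfaces only in the feminine, which matches the dative singular row of the article paradigm.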
In computational and formal-semantic work, underspecification is often used to represent ambiguities (especially scope ambiguities) in a single compact structure, so that the fully resolved readings do not all have to be enumerated up front. Frameworks such as Minimal Recursion Semantics (MRS) encode constraints on scope without forcing a choice among the resolved interpretations, which can be useful for parsing and generation. [5] Related work develops solvers and conversions among scope-underspecification formalisms (including MRS-style representations). [12]
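The idea of storing one underspecified structure and resolving it only on demand can be illustrated with a toy enumerator. This sketch is not MRS and does not use any real library's API: it assumes that a representation is just a list of quantifier labels plus optional "outscopes" constraints, and the function name `resolved_readings` is hypothetical. Actual formalisms use richer constraints over labelled pieces of formulas.

```python
from itertools import permutations
from typing import FrozenSet, List, Sequence, Tuple


def resolved_readings(
    quantifiers: Sequence[str],
    outscopes: FrozenSet[Tuple[str, str]] = frozenset(),
) -> List[Tuple[str, ...]]:
    """Enumerate scope orders (widest scope first) allowed by the constraints.

    Each pair (wide, narrow) in `outscopes` requires `wide` to take scope
    over `narrow`. With no constraints, every ordering is a reading.
    """
    readings = []
    for order in permutations(quantifiers):
        position = {q: i for i, q in enumerate(order)}
        if all(position[wide] < position[narrow] for wide, narrow in outscopes):
            readings.append(order)
    return readings
```

For two quantifiers with no constraints the function returns both orders, and adding a single constraint such as `frozenset({("every", "a")})` leaves one. The number of orderings grows factorially with the number of quantifiers, which is the motivation for keeping the representation underspecified until a resolved reading is actually needed.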
In constraint-based grammar formalisms, a related notion is feature indeterminacy, where an item can satisfy conflicting requirements because its feature value is not fully determined. This has been studied for case and agreement phenomena, and is sometimes contrasted with (or related to) underspecification depending on the formal machinery adopted. [13]