Discontinuity (linguistics)

In linguistics, a discontinuity occurs when a given word or phrase is separated from another word or phrase that it modifies in such a manner that a direct connection cannot be established between the two without incurring crossing lines in the tree structure. The terminology employed to denote discontinuities varies depending on the theory of syntax at hand. The terms discontinuous constituent, displacement, long distance dependency, unbounded dependency, and projectivity violation are largely synonymous with the term discontinuity. There are various types of discontinuities, the most prominent and widely studied being topicalization, wh-fronting, scrambling, [1] and extraposition.

Natural languages vary with respect to the types of discontinuities that they permit. The fixed word order of English allows for relatively few discontinuities compared to, for instance, the Slavic languages, which are much more permissive. English is rigid in this respect even when compared to a closely related language such as German.

Projectivity

Projectivity is a principle of tree structures by which discontinuities are identified and defined. A tree structure is said to be projective if there are no crossing dependency edges and/or projection lines. [2] If there are crossing edges or lines, however, the structure is non-projective and contains one or more projectivity violations. The concept and terminology of projectivity are associated most with the structures of dependency grammar, although the concept is just as applicable to the structures of phrase structure grammars (= constituency grammars). The discussion here considers projectivity first in terms of dependency-based structures and then in terms of constituency-based structures.

The flatter the tree structures are, the fewer projectivity violations they will contain. As structures become more layered, the number of projectivity violations can increase.

Dependency-based projectivity

The following trees illustrate projective and non-projective structures in a dependency-based analysis of sentence structure: [3]

Discontinuity.jpg

The trees show all six possible three-level structures conceivable for the three-element hierarchy circle-square-triangle. Four of these six trees are projective because they contain no crossing lines. Trees (c) and (e), however, are not projective: each contains a projectivity violation because a solid dependency edge crosses a dotted projection line. In other words, each of (c) and (e) contains a discontinuity. The trees illustrate when discontinuities occur: if a given constituent (= complete subtree) is separated from its head by an element that dominates its head, a discontinuity results.

Dependency grammars have explored the projectivity principle in great detail and have formalized it rigorously. [4] The concept is, however, a simple one. If crossing lines are obtained in the tree, projectivity has been violated, meaning a discontinuity is present.
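The crossing-lines criterion can be checked mechanically. The following is a minimal sketch, not a formalization taken from the cited literature: it assumes a sentence is encoded as a list `heads` in which `heads[i]` is the position of the head of word `i` (positions start at 1, the root has head 0, and index 0 is unused). The function names `non_projective_arcs` and `chain_heads` are illustrative only. A dependency is reported as a violation when some word between the head and the dependent is not dominated by that head, which corresponds to a dependency edge crossing a projection line.

```python
from itertools import permutations


def is_dominated_by(heads, node, ancestor):
    """Return True if `ancestor` is `node` itself or lies on its path to the root."""
    while node != 0:
        if node == ancestor:
            return True
        node = heads[node]
    return False


def non_projective_arcs(heads):
    """Return the (head, dependent) pairs that violate projectivity.

    heads[i] is the position of the head of word i; the root has head 0.
    """
    violations = []
    for dep in range(1, len(heads)):
        head = heads[dep]
        if head == 0:
            continue  # the root has no incoming dependency
        lo, hi = sorted((head, dep))
        for between in range(lo + 1, hi):
            # An intervening word that the head does not dominate
            # forces a crossing line.
            if not is_dominated_by(heads, between, head):
                violations.append((head, dep))
                break
    return violations


def chain_heads(order):
    """Encode the hierarchy circle > square > triangle for one linear order."""
    pos = {shape: i + 1 for i, shape in enumerate(order)}
    heads = [0] * (len(order) + 1)
    heads[pos["square"]] = pos["circle"]    # square depends on circle
    heads[pos["triangle"]] = pos["square"]  # triangle depends on square
    return heads


results = {
    order: non_projective_arcs(chain_heads(order))
    for order in permutations(("circle", "square", "triangle"))
}
for order, violations in results.items():
    print(order, "non-projective" if violations else "projective")
```

Under this encoding, exactly two of the six linear orders are flagged: the ones in which the circle stands between the square and the triangle. This agrees with the count of four projective and two non-projective trees given above, although the sketch makes no claim about which orders correspond to the labels (c) and (e) in the figure.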

Constituency-based projectivity

The terminology that constituency grammars (= phrase structure grammars) employ to identify and define discontinuities is different. The projectivity principle certainly exists, although it is acknowledged in terms of discontinuous constituents, [5] long distance dependencies, and/or unbounded dependencies. The constituency-based versions of the six hierarchies from the previous section are rendered as follows. The solid shapes represent phrasal categories, and the empty shapes lexical categories:

Discontinuity-3.jpg

The crossing lines again identify projectivity violations. Hence trees (c) and (e) here contain discontinuities, just as they do further above. The solid square constituents in tree (c) and tree (e) are discontinuous in a manner similar to that of the empty square constituents in the dependency-based trees (c) and (e) above.
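In constituency terms, the same idea can be stated as a condition on the words a constituent dominates: the constituent is discontinuous if those words do not occupy an unbroken stretch of the sentence. The sketch below assumes a hypothetical encoding in which a phrasal node is a pair `(label, [children])` and a lexical node is a pair `(word, position)`. The labels `CircleP` and `SquareP` are invented here to stand in for the solid shapes.

```python
def yield_positions(node):
    """Collect the linear positions of all words dominated by `node`."""
    _label, children = node
    if isinstance(children, int):  # lexical node: (word, position)
        return [children]
    positions = []
    for child in children:
        positions.extend(yield_positions(child))
    return positions


def discontinuous_constituents(node):
    """Return the labels of phrasal nodes whose yield is not contiguous."""
    label, children = node
    if isinstance(children, int):
        return []
    found = []
    positions = sorted(yield_positions(node))
    # A contiguous yield covers every position between its first and last word.
    if positions[-1] - positions[0] + 1 != len(positions):
        found.append(label)
    for child in children:
        found.extend(discontinuous_constituents(child))
    return found


# Linear order: square (1), circle (2), triangle (3).
crossing = ("CircleP", [("circle", 2),
                        ("SquareP", [("square", 1), ("triangle", 3)])])

# Linear order: circle (1), square (2), triangle (3).
continuous = ("CircleP", [("circle", 1),
                          ("SquareP", [("square", 2), ("triangle", 3)])])

print(discontinuous_constituents(crossing))
print(discontinuous_constituents(continuous))
```

In the first tree the square phrase covers positions 1 and 3 but not 2, so it is reported as discontinuous. In the second tree every phrase covers an unbroken span and nothing is reported.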

Types

Discontinuities occur in most if not all natural languages, but the types of discontinuities that a given language allows may differ from those of another language, even if the languages are closely related. It is worth noting that a large majority of sentences in most languages are projective, i.e. they do not contain discontinuities. Only about 15–25% of actual sentences contain a discontinuity, and the percentage of discontinuous dependencies is much lower still, approximately 1–2%. [6]
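The gap between the two percentages follows from how they are counted: a sentence counts as discontinuous as soon as one of its dependencies is, whereas the dependency-level figure is diluted by all the projective dependencies in every sentence. The sketch below makes the two counts explicit. The four-sentence `toy_treebank` is invented purely for illustration and is not drawn from the cited study; each sentence is a list `heads` where `heads[i]` is the position of the head of word `i` and the root has head 0.

```python
def count_non_projective(heads):
    """Count dependencies that span a word their head does not dominate."""
    def dominated(node, ancestor):
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node]
        return False

    count = 0
    for dep in range(1, len(heads)):
        head = heads[dep]
        if head == 0:
            continue  # skip the root
        lo, hi = sorted((head, dep))
        if any(not dominated(b, head) for b in range(lo + 1, hi)):
            count += 1
    return count


def discontinuity_rates(treebank):
    """Return (share of sentences, share of dependencies) that are non-projective."""
    bad_sentences = 0
    bad_arcs = 0
    total_arcs = 0
    for heads in treebank:
        n_bad = count_non_projective(heads)
        if n_bad > 0:
            bad_sentences += 1
        bad_arcs += n_bad
        total_arcs += sum(1 for h in heads[1:] if h != 0)
    return bad_sentences / len(treebank), bad_arcs / total_arcs


# Hypothetical data: three projective sentences and one with a single violation.
toy_treebank = [
    [0, 2, 0, 2],
    [0, 0, 1, 2],
    [0, 2, 0, 2, 3],
    [0, 3, 0, 2, 2],  # word 1 depends on word 3 across the root at position 2
]

sentence_rate, arc_rate = discontinuity_rates(toy_treebank)
print(sentence_rate, arc_rate)
```

In the toy data one sentence in four contains a discontinuity, but only one dependency in ten is itself discontinuous. The numbers are artificial; the point is only that the sentence-level share is necessarily at least as large as the dependency-level share.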

The following discussion briefly considers four widely acknowledged types of discontinuities: 1) topicalization, 2) wh-fronting, 3) scrambling, and 4) extraposition. English allows three of the four; it does not allow scrambling. Examples from German are therefore used to illustrate scrambling discontinuities.

Topicalization

Topicalization occurs when a constituent is fronted in order to establish it as the topic. The topicalization of argument NPs is rare in English, but adjunct prepositional phrases (PPs) that establish situational context are frequently topicalized. Topicalization is illustrated here (and further below) using both a constituency- and a dependency-based tree, the constituency-based tree on the left and the dependency-based tree on the right. The convention is employed throughout whereby the words themselves are used to label the nodes in the trees:

That idea.jpg

The crossing lines are clearly visible in these trees. Further examples of topicalization discontinuities (but without the trees):

Due to the weather, we are sure that they left.
After school, Connor has been practicing piano.
To his wife, Jim gave a fantastic present.

The fronted material in each example (Due to the weather, After school, To his wife) has been topicalized, which results in a discontinuity each time.

Wh-fronting

Wh-fronting (= wh-movement) occurs when a wh-expression appears at the front of the clause in order to focus it. Wh-fronting occurs in direct and indirect questions with interrogative words (e.g. how, what, when, where, which, who, why, etc.) and in relative clauses with relative pro-forms (that, what, where, which, when, who, whose, etc.). The constituency-based tree appears again on the left, and the dependency-based tree on the right:

Who has Sarah.jpg

These trees are merely representative of the type of analyses that one encounters in various grammars. The constituency-based tree in particular may be disputed. Nevertheless, the crossing lines are again clearly visible. Further examples of wh-fronting discontinuities in matrix and embedded clauses:

Which house does Jim plan to buy?
Why has that been annoying you?
Nobody knows what they have been saying.
the person who we have been seeking
dangers which you have been ignoring
a politician whose flip-flops nobody is willing to forget

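The wh-fronted material in each example is the wh-expression at the start of its clause: Which house, Why, what, who, which, and whose flip-flops. The first three examples show wh-fronting in direct or indirect interrogative clauses, and the last three illustrate wh-fronting in relative clauses.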

Scrambling

Scrambling can result in a discontinuity (although it need not; sometimes scrambling is accomplished by shifting alone). Scrambling discontinuities often occur in order to accommodate the informational content of constituents, old information preferring to precede new information. Scrambling is frequently visible in the leftward striving of definite pronouns. Scrambling discontinuities are absent from English, but they are very common in languages with freer word order such as German. The following trees illustrate a scrambling discontinuity in a subordinate clause of German:

Dass sich die Leute.jpg

The crossing lines identifying the discontinuity are again clearly visible. The reflexive pronoun sich strives leftward, which results in the discontinuity. Discontinuities of this sort occur frequently in German. The displaced expressions in the following examples (uns, den, ihn, and das) are separated from their heads in such a manner that scrambling discontinuities obtain:

Natürlich kann uns nichts überraschen. (naturally can us nothing surprise; 'Naturally nothing can surprise us.')
Sie haben den versucht zu lesen. (they have it tried to read; 'They tried to read it.')
dass ihn viele Leute bewundert haben (that him many people admired have; 'That many people have admired him.')
Essen werde ich das nicht. (eat shall I that not; 'I shall not eat that.')

In each of these examples, a constituent is displaced from its head, resulting in a discontinuity. An important aspect of scrambling is that it can operate in both directions. The displaced unit can precede its head, as in the first three examples, or follow it, as in the fourth example.

Extraposition

Extraposition occurs when an expression appears further to the right of where it would appear under other (but similar) circumstances. It is motivated either by the desire to focus or emphasize the extraposed expression or by the need to reduce center embedding. Reducing center embedding increases right-branching, and right-branching structures are easier to process in English than left-branching structures:

Something happened.jpg

The crossing lines identifying the discontinuity are again easily visible. In canonical cases, extraposition is optional, e.g.

Something that was unexpected then occurred.
Something then occurred that was unexpected.
Did anyone who you expected to help actually help?
Did anyone actually help who you expected to help?
They called someone to pick up the kids before school.
They called someone before school to pick up the kids.

One can also distinguish between two types of extraposition. The examples above are canonical cases where extraposition is optional. In cases of it-extraposition, the optionality disappears; extraposition is obligatory:

*It that it rained surprised us.
It surprised us that it rained.
*Did it that they had to study disturb them?
Did it disturb them that they had to study?

The star * indicates that the sentence is nonidiomatic. Apparently, when the pronoun it appears in subject or object position, it forces the modifying expression to be extraposed.

Theoretical accounts

The challenge posed by discontinuities has fascinated and vexed theories of syntax since the 1950s. Early transformational grammar, which is based on phrase structure, addressed discontinuities in terms of deep structure, surface structure, and transformations that mapped constituents out of one position in deep structure into another position in surface structure. Modern theories of transformational grammar (e.g. Government and Binding Theory [7] and the Minimalist Program [8] ) assume a movement or copying procedure that moves or copies constituents out of one position into another in the course of the derivation. Representational phrase structure grammars (e.g. Head-Driven Phrase Structure Grammar [9] and Lexical Functional Grammar [10] ), in contrast, reject movement and instead assume some sort of feature passing mechanism that passes information about the displaced constituent up and down the tree.

Traditional dependency grammars (e.g. Lucien Tesnière's Structural Syntax [11] and Igor Mel'čuk's Meaning-Text Theory [12] ) approach discontinuities quite differently. They tend to posit one or more levels of syntactic structure that abstract away from linear order and acknowledge hierarchical order alone. If linear order is taken to be secondary in this sense, discontinuities present less of a challenge and are therefore of secondary importance to the theory. Other dependency grammars, in contrast, take linear and hierarchical order to be of equal importance. These theories are likely to pursue some sort of feature passing mechanism as well, one that passes information about the displaced unit up and down the tree.

The following trees illustrate the movement/copying type of approach to discontinuities:

That idea2.jpg

The constituent that idea is seen as being first generated in its canonical position to the right of the verb likes. It is then moved out of that position to the front of the sentence, or it is copied at the front of the sentence, followed by the deletion of the lower occurrence. Due to the movement (or copying), this sort of approach to discontinuities can be called the movement approach (= derivational approach).

The alternative to the movement approach is information passing up and down the tree, e.g.

That idea3.jpg

The displaced unit is taken to appear first in its surface position (it is not moved to that position), and information (= features) about it is seen as being passed down the tree to the position where it would appear under more normal circumstances. The path of information passing is shown in red. Due to the passing of information, this sort of approach can be called the feature passing approach (= representational approach). The movement and feature passing approaches have something important in common: both assume that there are no actual discontinuities in surface syntax (for there are no crossing lines in the trees). What appears at first blush to be a discontinuity is instead an indication that movement or feature passing has occurred.

Finally, one point is worth repeating. All theories must have a means of addressing discontinuities, but the nature of this means can vary significantly based upon the amount of structure that the theory at hand posits. Relatively flat structures incur significantly fewer discontinuities than more layered structures. Thus the component that addresses discontinuities plays a larger role in theories that restrict all branching to binary branching. In theories that allow n-ary branching, the role that inversion and shifting play can be greater, which reduces the role of the component of the theory that is needed to address discontinuities.

Notes

  1. Concerning scrambling, see Grewendorf and Stechow (1990).
  2. See Groß (1999: 174–187).
  3. The dependency grammar understanding of projectivity given here follows Osborne (2019: 203–206) closely.
  4. See for instance Hays (1964), Gaifman (1965), Robinson (1970), and Mel'čuk (1988: 35ff.).
  5. Concerning the term "discontinuous constituent", see Wells (1947) and McCawley (1982).
  6. Concerning these numbers, see Nivre and Nilsson (2005: 99).
  7. See Chomsky (1981).
  8. See Chomsky (1995).
  9. See Pollard and Sag (1994).
  10. See Bresnan (2001).
  11. See Tesnière (1959).
  12. See Mel'čuk (1988).
