Heavy NP shift

Heavy NP shift is an operation that involves re-ordering (shifting) a "heavy" noun phrase (NP) to a position to the right of its canonical position under certain circumstances. The heaviness of the NP is determined by its grammatical complexity; whether or not shifting occurs can impact the grammaticality of the sentence.

Ross (1967) is thought to have been the first to systematically investigate the properties of heavy NP shift. Although the term heavy NP shift derives from the theoretical framework of transformational grammar, which describes the process in terms of movement of the NP, linguists working in other frameworks also use this term. In approaches to syntax where the NP is analyzed as a determiner phrase (DP), heavy NP shift is called heavy DP shift.

Factors contributing to the heaviness of an NP

Early observations of weight or heaviness attributed it to a rhythmical feel that unconsciously takes shape in languages, that is, a tendency of growth from shorter to longer elements within a phrase. In these early observations, it was noted that when there are two constituents differing in size, the smaller constituent usually precedes the longer one. [1] This observation has more recently been termed the "principle of end-weight". [2] The literature proposes many different definitions of weight or heaviness, such as the number of words in the NP (length of NP), the number of nodes in the NP, and more generally the number of phrasal nodes. [3] Other observations on heaviness include:

Complexity of NP

  • An NP's internal syntactic structure can make it 'complex'. A complex or heavy NP is one with a noun head and a modifying clause. [4]
ex: NP[my rich uncle from Detroit] is complex because its head noun uncle is modified by both the adjective rich and the prepositional phrase from Detroit. The internal structure is complex enough for the NP to be shifted to the right of its canonical position.
[NP [D my] [N' [AP rich] [N' [N uncle] [PP [P from] [NP Detroit]]]]]
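The weight measures mentioned above (number of words, number of nodes, number of phrasal nodes) can be computed mechanically once a structure is assumed. The sketch below is only an illustration: the nested-tuple tree format, the function names, and the particular bracketing of my rich uncle from Detroit are assumptions made for the example, not an analysis taken from the cited literature.

```python
# Hypothetical tree format: (label, child, child, ...); a leaf is (label, "word").

def is_leaf(tree):
    """A leaf node carries exactly one child, and that child is a word string."""
    return len(tree) == 2 and isinstance(tree[1], str)

def words(tree):
    """Return the words dominated by the tree, left to right."""
    if is_leaf(tree):
        return [tree[1]]
    return [w for child in tree[1:] for w in words(child)]

def count_nodes(tree, phrasal_only=False):
    """Count all nodes, or only phrasal nodes (labels such as NP, AP, PP)."""
    label = tree[0]
    is_phrasal = len(label) > 1 and label.endswith("P")
    own = 1 if (is_phrasal or not phrasal_only) else 0
    if is_leaf(tree):
        return own
    return own + sum(count_nodes(child, phrasal_only) for child in tree[1:])

# One possible bracketing of the example NP (an assumption for illustration).
np_tree = ("NP",
           ("D", "my"),
           ("N'",
            ("AP", "rich"),
            ("N'",
             ("N", "uncle"),
             ("PP", ("P", "from"), ("NP", "Detroit")))))

print(len(words(np_tree)))                      # number of words
print(count_nodes(np_tree))                     # number of nodes
print(count_nodes(np_tree, phrasal_only=True))  # number of phrasal nodes
```

Different bracketings of the same string give different node counts, which is one reason the competing definitions of weight do not always agree.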

Separability of NP

  • According to Chomsky (1975), [5] the separability of the indirect object is determined by the complexity of the direct object (NP). It is not length in words but rather the complexity of the object that determines the "naturalness of formation".
ex: "They brought all the leaders of the riot in" appears more natural than "They brought the man I saw in." According to Chomsky (1975), the constituent [the man I saw], although shorter than the constituent [all the leaders of the riot], is more complex. [6]

Relative weight of NP

  • The relative weight of post verbal constituents influences their order, not the weight of any one constituent such as NP. [7] One study found that independent of NP length and complexity, the NP did not shift unless it was at least 4 words longer than the surrounding material. [8]
ex: The canonical phrase, I gave NP[the valuable book that was extremely difficult to find] PP[to Mary], is more likely to shift the heavy NP to the end of the phrase since the NP is more than 4 words longer than the following prepositional phrase (PP). [9]
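The relative-weight observation can be expressed as a simple length comparison. The following is a toy heuristic, assuming that weight is measured in words and that the 4-word difference reported above acts as a threshold; the function names are hypothetical and this is not the model used in the cited study.

```python
def length_difference(np_words, other_words):
    """How many words longer the NP is than the neighbouring constituent."""
    return len(np_words) - len(other_words)

def likely_to_shift(np_words, other_words, threshold=4):
    """Toy heuristic: the NP is a shift candidate only if it is at least
    `threshold` words longer than the constituent that follows it."""
    return length_difference(np_words, other_words) >= threshold

heavy_np = "the valuable book that was extremely difficult to find".split()
pp = "to Mary".split()

print(length_difference(heavy_np, pp))                 # NP length minus PP length
print(likely_to_shift(heavy_np, pp))                   # heavy NP next to a short PP
print(likely_to_shift("it".split(), "to you".split())) # pronoun object
```

The comparison is deliberately relative: the same NP next to a longer PP could fall below the threshold even though its own length has not changed.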

Early versus late attachment of NP

  • A theory of Early Immediate Constituents (EIC) [10] seeks to explain word order variation at the phrase level, or more simply, takes a "save the hardest for last" [11] approach. Constituents are syntactically grouped so that they can be recognized and produced as rapidly and efficiently as possible. The parser must analyze the VP and its immediate constituents, then calculate how early the listener will be able to identify all the constituents by counting the number of words it takes until all immediate constituents (ICs) are recognizable. [12] [13]
ex: According to EIC, the phrase [I gave₁ to₂ Mary₃ the₄ valuable book that was extremely difficult to find] is easy to process because it requires only 4 words (indicated by subscript numbers) to find all ICs. Compare this to [I gave₁ the₂ valuable₃ book₄ that₅ was₆ extremely₇ difficult₈ to₉ find₁₀ to₁₁ Mary], which requires 11 words to find all ICs. According to the EIC, the first example is preferred because it is easier to comprehend. [14]
The EIC theory also proposes that word and constituent order varies cross linguistically and is dependent on whether a language is head final (such as Japanese and Korean) or head initial (such as English). [15]
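The word counts in the EIC example can be reproduced with a short sketch. It rests on a simplifying assumption that is not spelled out in the text: in a head-initial language such as English, each immediate constituent of the VP is taken to be recognizable at its first word, so counting stops at the first word of the last constituent. The function name and the list-of-lists representation are hypothetical.

```python
def words_to_recognize_all_ics(vp_constituents):
    """Count words from the verb up to the first word of the last
    immediate constituent (IC) of the VP.

    vp_constituents: list of ICs in surface order, each a list of words.
    Assumes every IC is recognized as soon as its first word is heard.
    """
    *earlier, _last = vp_constituents
    return sum(len(ic) for ic in earlier) + 1

verb = ["gave"]
pp = ["to", "Mary"]
heavy_np = "the valuable book that was extremely difficult to find".split()

shifted = [verb, pp, heavy_np]     # I gave to Mary the valuable book ...
canonical = [verb, heavy_np, pp]   # I gave the valuable book ... to Mary

print(words_to_recognize_all_ics(shifted))
print(words_to_recognize_all_ics(canonical))
```

Under this assumption the long constituent only adds to the count when it is not last, which is why placing the heavy NP at the end shortens the span needed to identify every IC.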

New information precedes old information

  • Pragmatic effects have been thought to influence heavy NP shift. According to this theory, items are sequenced so unpredictable content (new or important information) precedes predictable content (old or unimportant information). [16]

Heavy NP shift in different languages

Heavy NP shift in English

Examples 1a, 2a, and 3a all show canonical order, whereas 1b, 2b, and 3b show the NP shifted to the right of its canonical position:

1. a) I gave the book that I bought last week to Mary.
1. b) I gave to Mary the book that I bought last week.
2. a) I met my rich uncle from Detroit on the street.
2. b) I met on the street my rich uncle from Detroit.
3. a) I sent the recipes from the paper that I was talking about to you.
3. b) I sent to you the recipes from the paper that I was talking about.

If a constituent is not considered 'heavy' or 'complex', the shift would not make grammatical sense or would be considered awkward to a native speaker of English, e.g.

4. a) I sent it to you.
4. b) *I sent to you it.

In example 4, the verb sent has a personal pronoun as its direct object. Typically, nothing can intervene between a verb and its direct object in most dialects of English, e.g.

5. a) I consider the problem unsolvable.
5. b) *I consider unsolvable the problem.

In example 5b, the NP the problem is not complex enough to be shifted.

Heavy NP shift in Japanese

Japanese is an SOV (subject, object, verb) language. Particles in Japanese encode syntactic functions, marking the relationship of the subject and object (NP) to the verb. This allows Japanese speakers to 'scramble' the order of particle-marked phrases, which is important when considering word ordering biases such as heavy NP shift. Corpus studies and experiments have shown that Japanese speakers have a long-before-short bias, in contrast to the short-before-long bias of English speakers in the presence of a heavy NP. [17]

In an experiment, [18] participants had to create a sentence from a set of phrases that varied in the length of the direct object (DO) and indirect object (IO) for dative verbs. In Japanese, the order for dative verbs is subject (S), indirect object (IO), direct object (DO). For example, the 'short' phrase:

Masako wa(S) otoko ni(IO) keeki o(DO) haitatusita(V), translates to, Masako delivered the cake to the man.

In this study, participants chose this order for a short phrase the majority of the time, without rearranging the order of the IO and DO. However, when the DO and IO were lengthened, the participants were more likely to shift the longer phrase before the short phrase even if it switched the order from S-IO-DO to S-DO-IO. For example, the phrase:

Masako wa(S) otoko ni(IO) sinbun de syookaisareta keeki o(DO) haitatusita(V).

was shifted so that the long DO was before the short IO:

Masako wa(S) sinbun de syookaisareta keeki o(DO) otoko ni(IO) haitatusita(V).

If the IO was longer than the DO, the participants were less likely to switch the order, leaving it as S-IO-DO. This suggests that Japanese speakers have a bias towards placing longer phrases before shorter ones.

Heavy NP shift in German

German word order is variable and flexible and relies on a rich inflectional system. As in English, an NP and a PP can be rearranged, i.e. [NP PP V] or [PP NP V]. German does not allow a direct object NP to the right of a final verb. [19] The examples below show the rearrangement of a heavy NP and a PP. Either order is acceptable in German because word order is not as strict as it is in languages such as English.

1 a.)

Ich habe [NP meinen reichen Onkel aus Berlin] [PP auf der Straße] [V getroffen].

I have [NP my rich uncle from Berlin] [PP on the street] [V met]

'I met [NP my rich uncle from Berlin] [PP on the street].'

1 b.)

Ich habe [PP auf der Straße] [NP meinen reichen Onkel aus Berlin] [V getroffen].

I have [PP on the street] [NP my rich uncle from Berlin] [V met]

'I met [PP on the street] [NP my rich uncle from Berlin].'

Analyses of heavy NP shift

Movement analyses of heavy NP shift

For analyses that treat heavy NP shift as movement, there are two kinds of shift to consider: one involves rightward movement, the other involves leftward movement.

Heavy NP shift as rightward movement

The first, and most traditional, analysis holds that a "heavy" object NP is shifted rightward to the end of the utterance. According to one theory, [20] a complex NP will be shifted rightward if it does not cross over a verb. Another theory proposes leftward movement of a "light" indirect object. According to this theory, [21] an indirect object is moved leftward, independently of verb movement, to a higher specifier position next to the verb and before the direct object, leaving the heavy NP in situ.

This syntax tree shows an example of rightward movement of a heavy NP. The entire constituent my rich uncle from Detroit has 'shifted' from its position as a complement to the verb to an adjunct of the VP. This tree has been created using a particular style of syntax, mainly X-bar Theory.

[Figure: syntax tree of heavy NP shift as rightward movement]

Heavy NP shift as leftward movement

The syntax tree below shows an example of leftward movement of a 'light' indirect object. The constituent on the street has 'shifted' from its position in a small clause environment past the direct object to a higher functional specifier position leaving the 'heavy NP' in situ. [22] This syntax tree has been created using a particular style of syntax, mainly X-bar Theory.

[Figure: syntax tree of heavy NP shift as leftward movement]

Processing analyses of heavy NP shift

Psycholinguists want to understand when speakers resort to shifting heavy NPs to the end of a phrase. A study of printed corpus material [23] found that different verbs exhibit different word order variation. The study focused on whether speakers and writers produce the word order of heavy NP shift for their own benefit, or whether they take into account the needs of the listener or reader. In this study, two verb classes were compared to test the choice of early and late commitment (points when the speaker has 'committed' themselves to producing a VP immediately followed by NP and PP constituents). The study examined transitive verbs, which require an NP object (e.g. bring, carry, make, put, etc.), and prepositional or intransitive verbs, which do not require an NP object (e.g. add, build, call, draw, leave, etc.).

[24] Example of utterance with transitive verb:

Pat brought a box with a ribbon around it → to the party.
Pat brought → to the party a box with a ribbon around it.

Example of utterance with intransitive verb:

Pat wrote something about Chris → on the blackboard.
Pat wrote on the blackboard → something about Chris.

The arrows indicate where the speaker commits to producing a VP with an NP and PP following. They also indicate where the listener is sure the VP will contain an NP and PP.

After analyzing written and spoken corpus data it was found that heavy NP shift is more common with intransitive verbs. This suggests using heavy NP shift word order with intransitive verbs allows the producer more time to decide whether they will include a direct object or not. This reduces the amount of planning needed and gives the speaker more time to articulate their thoughts. Having this extra time should in turn reduce the chance of the producer having to correct or abort an utterance. [25]

Notes

  1. Wasow, T. and Arnold, J., 2003. Post-verbal constituent ordering in English. Topics in English Linguistics, 43, pp.119-154.
  2. Quirk, R., Greenbaum, S., Leech, G.N. and Svartvik, J., 1972. A grammar of contemporary English.
  3. Wasow, T. and Arnold, J., 2003. Post-verbal constituent ordering in English. Topics in English Linguistics, 43, pp.119-154.
  4. Ross, J. 1967. Constraints on variables in syntax. Ph.D. Dissertation, MIT.
  5. Wasow, T. and Arnold, J., 2003. Post-verbal constituent ordering in English. Topics in English Linguistics, 43, pp.119-154.
  6. Wasow, T. and Arnold, J., 2003. Post-verbal constituent ordering in English. Topics in English Linguistics, 43, pp.119-154.
  7. Wasow, T. and Arnold, J., 2003. Post-verbal constituent ordering in English. Topics in English Linguistics, 43, pp.119-154.
  8. Hawkins, John A. A performance theory of order and constituency. Vol. 73. Cambridge University Press, 1994.
  9. Hawkins, John A. A performance theory of order and constituency. Vol. 73. Cambridge University Press, 1994.
  10. Hawkins, John A. A performance theory of order and constituency. Vol. 73. Cambridge University Press, 1994.
  11. Arnold, J.E., Losongco, A., Wasow, T. and Ginstrom, R., 2000. Heaviness vs. newness: The effects of structural complexity and discourse status on constituent ordering. Language, pp.28-55.
  12. Hawkins, John A. A performance theory of order and constituency. Vol. 73. Cambridge University Press, 1994.
  13. Arnold, J.E., Losongco, A., Wasow, T. and Ginstrom, R., 2000. Heaviness vs. newness: The effects of structural complexity and discourse status on constituent ordering. Language, pp.28-55.
  14. Arnold, J.E., Losongco, A., Wasow, T. and Ginstrom, R., 2000. Heaviness vs. newness: The effects of structural complexity and discourse status on constituent ordering. Language, pp.28-55.
  15. Hawkins, John A. A performance theory of order and constituency. Vol. 73. Cambridge University Press, 1994.
  16. Hawkins, John A. A performance theory of order and constituency. Vol. 73. Cambridge University Press, 1994.
  17. Hawkins, John A. A performance theory of order and constituency. Vol. 73. Cambridge University Press, 1994.
  18. Yamashita, H. and Chang, F., 2001. "Long before short" preference in the production of a head-final language. Cognition, 81(2), pp.B45-B55.
  19. Hawkins, John A. A performance theory of order and constituency. Vol. 73. Cambridge University Press, 1994.
  20. Ross, J. 1967. Constraints on variables in syntax. Ph.D. Dissertation, MIT.
  21. Kayne, R.S., 1994. The antisymmetry of syntax (No. 25). MIT Press.
  22. Kayne, R.S., 1994. The antisymmetry of syntax (No. 25). MIT Press.
  23. Wasow, T., 1997. End-weight from the speaker's perspective. Journal of Psycholinguistic research, 26(3), pp.347-361.
  24. Wasow, T., 1997. End-weight from the speaker's perspective. Journal of Psycholinguistic research, 26(3), pp.347-361.
  25. Wasow, T., 1997. End-weight from the speaker's perspective. Journal of Psycholinguistic research, 26(3), pp.347-361.
