Syntactic ambiguity

Syntactic ambiguity, also known as structural ambiguity, [1] amphiboly, or amphibology, is the potential for a sentence to yield multiple interpretations because of its syntax. The ambiguity arises not from the range of meanings of individual words but from the relationships among the words and clauses of the sentence, so that different interpretations lie hidden behind the same word order. A sentence is therefore syntactically ambiguous when a reader or listener can reasonably assign it more than one grammatical structure.

In jurisprudence, courts may be called on to interpret syntactically ambiguous phrases in statutes or contracts. Occasionally, claims based on highly improbable interpretations of such ambiguities are dismissed as frivolous litigation without merit.[citation needed] The term parse forest refers to the collection of all possible syntactic structures, known as parse trees, that can represent the ambiguous sentence's meanings. [2] [3] The task of determining which of the possible meanings is actually intended is known as syntactic disambiguation. [4]
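To make the idea of a parse forest concrete, the following is a minimal sketch that enumerates every parse tree of a sentence under a tiny grammar. The grammar, the lexicon, and the helper names `parse_forest` and `bracket` are hypothetical and chosen only for illustration; the sentence is the Groucho Marx example quoted later in this article.

```python
from functools import lru_cache

# Hypothetical toy grammar in binary form: parent -> (left child, right child).
BINARY_RULES = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "PP": [("P", "NP")],
}

# Hypothetical lexicon: each word maps to the categories it may bear.
LEXICON = {
    "I": {"NP"}, "shot": {"V"}, "an": {"Det"}, "elephant": {"N"},
    "in": {"P"}, "my": {"Det"}, "pajamas": {"N"},
}


def parse_forest(words, root="S"):
    """Return every parse tree of `words` rooted in `root` as nested tuples."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def trees(symbol, start, end):
        found = []
        # A single word can be covered directly by a lexical category.
        if end - start == 1 and symbol in LEXICON.get(words[start], ()):
            found.append((symbol, words[start]))
        # Otherwise try every binary rule at every split point of the span.
        for left, right in BINARY_RULES.get(symbol, ()):
            for mid in range(start + 1, end):
                for left_tree in trees(left, start, mid):
                    for right_tree in trees(right, mid, end):
                        found.append((symbol, left_tree, right_tree))
        return tuple(found)

    return list(trees(root, 0, len(words)))


def bracket(tree):
    """Render a nested-tuple tree as a labelled bracketing."""
    if len(tree) == 2:
        return f"[{tree[0]} {tree[1]}]"
    return f"[{tree[0]} {bracket(tree[1])} {bracket(tree[2])}]"


forest = parse_forest("I shot an elephant in my pajamas".split())
for tree in forest:
    print(bracket(tree))
```

Under this toy grammar the forest holds two trees: one attaches the prepositional phrase to the noun phrase (the elephant is in the pajamas) and the other attaches it to the verb phrase (the shooter is in the pajamas). Syntactic disambiguation amounts to choosing one tree from such a forest.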

Different forms

Globally ambiguous

A globally ambiguous sentence is one that has at least two distinct interpretations and where reading the entire sentence does not resolve the ambiguity. Globally ambiguous sentences exist where no feature of the representation (i.e. word order) distinguishes the possible distinct interpretations. Global ambiguities often go unnoticed because readers tend to choose the interpretation they understand to be more probable. One example of a global ambiguity is "The woman held the baby in the green blanket." Here, either the woman is holding a baby who happens to be wrapped in the green blanket, or the woman is using the green blanket as an instrument to hold the baby, or the woman is herself wrapped in the green blanket while holding the baby.

Locally ambiguous

A locally ambiguous sentence is a sentence that contains an ambiguous phrase but has only one interpretation. [5] The ambiguity in a locally ambiguous sentence persists only briefly and is resolved, i.e., disambiguated, by the end of the utterance. Sometimes, local ambiguities can result in "garden path" sentences, in which a structurally correct sentence is difficult to interpret because the interpretation of the ambiguous region that initially seems most sensible turns out not to be the correct one.

Examples

The duke yet lives that Henry shall depose. Henry VI (1.4.30), by William Shakespeare
  • Henry will depose the duke.
  • The duke will depose Henry.
Amphiboly occurs frequently in poetry, sometimes owing to the alteration of the natural order of words for metrical reasons.
Eduardum occidere nolite timere bonum est. Edward II by Christopher Marlowe
Isabella of France and Roger Mortimer, 1st Earl of March supposedly plotted to murder Edward II of England in such a way as not to draw blame on themselves, sending this order in Latin which changes meaning depending on where the comma is placed.
  • Do not be afraid to kill Edward; it is good. (either Edward, killing him, or being afraid to kill him is good)
  • Do not kill Edward; it is good to fear. (either Edward or killing him is good to fear)
I'm glad I'm a man, and so is Lola. — "Lola" by the Kinks (written by Ray Davies)
  • Lola and I are both glad I'm a man.
  • I'm glad I'm a man, and Lola is also a man.
  • I'm glad I'm a man, and I'm also glad Lola is a man.
  • I'm glad I'm a man, and Lola is also glad to be a man.
The ambiguity is intentional and alludes to a cross-dresser.
John saw the man on the mountain with a telescope.
  • John, using a telescope, saw a man on a mountain.
  • John saw a man on a mountain which had a telescope on it.
  • John saw a man on a mountain who had a telescope.
  • John, on a mountain and using a telescope, saw a man.
  • John, on a mountain, saw a man who had a telescope.
The word of the Lord came to Zechariah, son of Berekiah, son of Iddo, the prophet. [6]
  • ... the prophet Zechariah, who was the son of Berekiah, who was the son of Iddo
  • ... Zechariah, who was the son of the prophet Berekiah, who was the son of Iddo
  • ... Zechariah, who was the son of Berekiah, who was the son of the prophet Iddo
  • ... the prophet Zechariah, who was the son of Berekiah and Iddo
  • ... Zechariah, who was the son of Berekiah and Iddo, the prophet
Lesbian Vampire Killers, the title of a comedy-horror film
  • Lesbians that kill vampires.
  • Killers of lesbian vampires.
  • Lesbian vampires that are killers.
"The Purple People Eater" by Sheb Wooley
  • A purple creature that eats people.
  • A creature that eats purple people. (This interpretation is confirmed in the lyrics, although whether the creature itself is also purple is never made clear.)
British Left Waffles on Falkland Islands. [7]
  • The British party of the left rambles indecisively about Falkland Island policy.
  • The British forces left behind waffles (the breakfast item) on the Falkland Islands.

Aristotle discusses how ambiguity affects arguments, and how that effect depends on whether words are combined or divided:

... if one combines the words 'to write-while-not-writing': for then it means, that he has the power to write and not to write at once; whereas if one does not combine them, it means that when he is not writing he has the power to write.

Aristotle, Sophistical refutations, Book I, Part 4
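Aristotle's contrast can be restated in modern modal notation. The following is only an illustrative gloss, not Aristotle's own formulation: the symbol W (for "he is writing") and the possibility operator (for "he has the power to") are assumptions introduced here.

```latex
% Illustrative gloss only; W and \Diamond are notation assumed for this example.
% \Diamond requires the amssymb package.
\begin{align*}
\text{combined:}\quad & \Diamond\,(W \land \lnot W) \\
\text{divided:}\quad  & \lnot W \land \Diamond W
\end{align*}
```

The combined reading attributes an impossible power, whereas the divided reading is unobjectionable; the word order alone does not indicate which scope is intended.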

In headlines

Newspaper headlines are written in a telegraphic style (headlinese) which often omits the copula, creating syntactic ambiguity. A common form is the garden path type. The name crash blossoms was proposed for these ambiguous headlines by Danny Bloom in the Testy Copy Editors discussion group in August 2009. He based this on the headline "Violinist linked to JAL crash blossoms" that Mike O'Connell had posted, asking what such a headline could be called. [8] The Columbia Journalism Review regularly reprints such headlines in its "The Lower Case" column, and has collected them in the anthologies "Squad Helps Dog Bite Victim" [9] and "Red Tape Holds Up New Bridge". [10] Language Log also has an extensive archive of crash blossoms, for example "Infant Pulled from Wrecked Car Involved in Short Police Pursuit". [11]

Many purported crash blossoms are apocryphal or recycled. [12] A celebrated example from World War I is "French push bottles up German rear"; [13] life imitated art in the Second World War headline "Eighth Army Push Bottles Up Germans". [14]

In humour and advertising

Syntactic or structural ambiguities are frequently found in humour and advertising. One of the most enduring jokes attributed to the comedian Groucho Marx relies on an ambiguous modifier: "I shot an elephant in my pajamas. How he got into my pajamas I don't know." Another sentence, which emerged from early 1960s machine translation research, is "Time flies like an arrow; fruit flies like a banana".

Structural ambiguities may also be created intentionally by someone who understands the kinds of syntactic structures that lead to ambiguity; however, for the respective interpretations to work, they must be compatible with semantic and pragmatic contextual factors. [1]

Syntactic and semantic ambiguity

In syntactic ambiguity, the same sequence of words is interpreted as having different syntactic structures. In contrast, in semantic ambiguity the structure remains the same, but the individual words are interpreted differently. [15] [16] Controlled natural languages are often designed to be unambiguous so that they can be parsed into a logical form. [17]

Kantian

Immanuel Kant employs the term "amphiboly" in a sense of his own, as he has done in the case of other philosophical words. He means it as a confusion of pure understanding with perceived experience, and an attribution to the latter of what belongs only to the former. [18]

Models

Competition-based model

Competition-based models hold that differing syntactic analyses rival each other when syntactic ambiguities are resolved. If probability and language constraints offer similar support for each one, especially strong competition occurs. On the other hand, when constraints support one analysis over the other, competition is weak and processing is easy. After van Gompel et al.'s experiments (2005), the reanalysis model has become favoured over competition-based models. [19] Convincing evidence against competition-based models includes the fact that globally ambiguous sentences are easier to process than disambiguated (clearer) sentences, showing that the analyses do not compete against each other in the former. Plausibility tends to strengthen one analysis and eliminate rivalry. However, the model has not been completely rejected. Some theories claim that competition makes processing difficult, if only briefly. [19]

Reanalysis model

According to the reanalysis model, processing becomes hard once readers realise that the syntactic structure they have adopted is wrong and must go back and recheck it. Most reanalysis models, like the unrestricted race model, are serial, which implies that only one analysis can be supported at a time.

Consider the following statements:

  1. "The dog of the woman that had the parasol was brown."
  2. "The woman with the dog that had the parasol was brown."
  3. "The dog with the woman that had the parasol was brown."

Research supports the reanalysis model as the most likely explanation of why these ambiguous sentences are hard to interpret. [19] Results of many eye-tracking experiments have demonstrated that a persistently ambiguous sentence (1) is just as hard to process as an unambiguous sentence (2 and 3), because information before the ambiguity only weakly leans towards each possible syntax. [19]

Unrestricted race model

The unrestricted race model holds that information available before the ambiguous region already influences which analysis is adopted (on the basis of probability), before multiple analyses can be entertained. Van Gompel and Pickering explicitly describe the unrestricted race model as a two-stage reanalysis model. Unlike in constraint-based theories, only one analysis is pursued at any one time, so reanalysis may sometimes be necessary if information following the first analysis proves it wrong. [19]

The name "unrestricted race" nevertheless reflects properties that the model borrows from constraint-based models. As in constraint-based theories, any source of information can support the different analyses of an ambiguous structure, hence "unrestricted". In the model, the alternative structures of an ambiguous sentence compete in a race, and the structure that is constructed fastest is adopted. The more sources of information support an analysis, and the stronger that support is, the more likely that analysis is to be constructed first. [20]

Consider the following statements:

  1. "The maid of the princess who scratched herself in public was terribly humiliated."
  2. "The son of the princess who scratched himself in public was terribly humiliated."
  3. "The son of the princess who scratched herself in public was terribly humiliated."

Research showed that people took less time to read persistently ambiguous sentences (sentence 1) than temporarily ambiguous sentences that were clarified later (sentences 2 and 3). In sentences 2 and 3, the reflexive pronouns “himself” and “herself” clarify that “who scratched” is modifying the son and the princess respectively. Thus, the readers are forced to reanalyse and their reading times will therefore rise. In sentence 1, however, the ambiguity of the reflexive pronoun “herself” fits both the maid and the princess. This means the readers do not have to reanalyse. Thus, ambiguous sentences will take a shorter time to read compared to clarified ones. [21]
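The reasoning above can be sketched as a small expected-value calculation. This is a toy illustration, not the model as specified in the cited papers: the function name `expected_reading_time`, the 50/50 race probabilities, and the millisecond values are all assumptions made up for the example.

```python
def expected_reading_time(p_first, compatible, base_ms=300.0, reanalysis_ms=150.0):
    """Expected reading time at the disambiguating word in a toy race model.

    p_first:    maps each analysis to the (assumed) probability that it wins
                the race and is adopted first.
    compatible: the analyses that remain consistent with the later material.
    Reanalysis cost is paid only when the adopted analysis is ruled out.
    """
    p_reanalysis = sum(p for analysis, p in p_first.items()
                       if analysis not in compatible)
    return base_ms + p_reanalysis * reanalysis_ms


# Assumed: attachment to the first noun ("maid"/"son") and to the second
# noun ("princess") are equally likely to win the race.
race = {"first_noun": 0.5, "second_noun": 0.5}

sentence_1 = expected_reading_time(race, {"first_noun", "second_noun"})  # "maid ... herself"
sentence_2 = expected_reading_time(race, {"first_noun"})                 # "son ... himself"
sentence_3 = expected_reading_time(race, {"second_noun"})                # "son ... herself"

print(sentence_1, sentence_2, sentence_3)  # 300.0 375.0 375.0
```

Because neither analysis is ever ruled out in sentence 1, no reanalysis cost is incurred, so under these assumed numbers the ambiguous sentence comes out faster than either disambiguated one.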

This is called the underspecification account [22] as readers do not stick to a meaning when not provided with clarifying words. The reader understands someone scratched herself but does not seek to determine whether it was the maid or the princess. This is also known as the “good-enough” approach to understanding language. [23]

The good-enough approach

The good-enough approach to understanding language claims that representations of meaning are usually incomplete and that language processing is only partial. A good-enough interpretation may arise when such a representation is not robust, not supported by context, or both, and must cope with potentially distracting information. Such information is then discarded so that understanding can succeed. [23]

Differences in processing

Children and adults

Children interpret ambiguous sentences differently from adults due to lack of experience. Children have not yet learned how the environment and contextual clues can suggest a certain interpretation of a sentence. They have also not yet developed the ability to acknowledge that ambiguous words and phrases can be interpreted multiple ways. [24] As children read and interpret syntactically ambiguous sentences, the speed at which initial syntactic commitments are made is lower in children than in adults. Furthermore, children appear to be less skilled at directing their attention back to the part of the sentence that is most informative in terms of aiding reanalysis. [25] Other evidence attributes differences in interpreting ambiguous sentences to working memory span. While adults tend to have a higher working memory span, they sometimes spend more time resolving the ambiguity but tend to be more accurate in their final interpretation. Children, in contrast, can decide quickly on an interpretation because they consider only the interpretations their working memory can hold. [26]

Low reading span vs. high reading span adults

Low reading span adults, who had the poorest verbal working memory, took longer to process sentences with a reduced relative clause than sentences with a full relative clause, and their times were similar for inanimate and animate subjects. High reading span adults, who had the best verbal working memory, were faster overall than the low reading span group. Within the high reading span group, however, readers responded faster to inanimate subjects and more slowly to animate subjects, because animate subjects were more likely to produce a garden-path sentence, greater verbal working memory notwithstanding. This suggests that low reading span readers, having fewer cognitive resources, could process only syntactic cues, whereas high reading span readers had more cognitive resources and could thus be tripped up by the garden-path sentence. [26] [27]

References

  1. Oaks, Dallin D. (2010). Structural Ambiguity in English: An Applied Grammatical Inventory. 2 vols. London: Continuum. ISBN 9781441141378.
  2. Billot, Sylvie, and Bernard Lang. "The structure of shared forests in ambiguous parsing." Proceedings of the 27th annual meeting on Association for Computational Linguistics. Association for Computational Linguistics, 1989.
  3. Kurohashi, Sadao, and Makoto Nagao. "Building a Japanese parsed corpus while improving the parsing system." Proceedings of The 1st International Conference on Language Resources & Evaluation. 1998.
  4. MacDonald, Maryellen C., Neal J. Pearlmutter, and Mark S. Seidenberg. "The lexical nature of syntactic ambiguity resolution." Psychological Review 101.4 (1994): 676.
  5. Monique Lamers; Peter de Swart (20 October 2011). Case, Word Order and Prominence: Interacting Cues in Language Production and Comprehension. Springer Science & Business Media. ISBN   978-94-007-1463-2.
  6. Zechariah 1:7 (King James Version)
  7. headline, The Guardian, April 28, 1982, as quoted by Gloria Cooper, "Mailbag", Barron's, February 19, 2001
  8. Ben Zimmer, "On Language: Crash Blossoms", New York Times Magazine, January 27, 2010
  9. Gloria Cooper, ed., Squad Helps Dog Bite Victim, and other flubs from the nation's press, Dolphin Books, 1980, ISBN   0-385-15828-9
  10. Gloria Cooper, Red tape holds up new bridge, and more flubs from the nation's press, Perigee Books, 1987. ISBN   0-399-51406-6
  11. "Language Log".
  12. 1997 Headlines at Snopes.com.
  13. Mayes, Ian (2000-04-13). "Heads you win: The readers' editor on the art of the headline writer". Guardian. London. Retrieved 2009-06-05.
  14. Fritz Spiegl, What The Papers Didn't Mean to Say Scouse Press, Liverpool, 1965
  15. Layman E. Allen "Some Uses of Symbolic Logic in Law Practice" 1962J M.U.L.L. 119, at 120;
  16. L.E. Allen & M.E. Caldwell "Modern Logic and Judicial Decision Making: A Sketch of One View" in H.W. Baade (ed.) "Jurimetrics" Basic Books Inc., New York, USA, 1963, 213, at 228
  17. Tobias Kuhn; Norbert E Fuchs (9 August 2012). Controlled Natural Language: Third International Workshop, CNL 2012, Zurich, Switzerland, August 29-31, 2012, Proceedings. Springer. ISBN   978-3-642-32612-7.
  18. adapted from Garden, Francis A Dictionary of English Philosophical Terms (1878)
  19. van Gompel, Roger P.G.; Pickering, Martin J.; Pearson, Jamie; Liversedge, Simon P.; et al. (4 January 2005). "Evidence against competition during syntactic ambiguity resolution". Journal of Memory and Language. 52 (2): 284–307. CiteSeerX 10.1.1.165.8161. doi:10.1016/j.jml.2004.11.003.
  20. van Gompel, Roger P.G.; Pickering, Martin J. (2000), Unrestricted race: A new model of syntactic ambiguity resolution, CiteSeerX 10.1.1.165.9576
  21. Traxler, Matthew J.; Pickering, Martin J.; Clifton, Charles (1998-11-01). "Adjunct Attachment Is Not a Form of Lexical Ambiguity Resolution". Journal of Memory and Language. 39 (4): 558–592. doi:10.1006/jmla.1998.2600. ISSN   0749-596X.
  22. Swets, Benjamin; Desmet, Timothy; Clifton, Charles; Ferreira, Fernanda (2008-01-01). "Underspecification of syntactic ambiguities: Evidence from self-paced reading". Memory & Cognition. 36 (1): 201–216. doi:10.3758/MC.36.1.201. ISSN 1532-5946. PMID 18323075.
  23. Ferreira, Fernanda; Bailey, Karl G.D.; Ferraro, Vittoria (February 2002). "Good-Enough Representations in Language Comprehension". Current Directions in Psychological Science. 11 (1): 11–15. doi:10.1111/1467-8721.00158. ISSN 0963-7214. S2CID 4126375.
  24. Yi Ting Huang; Jesse Snedeker. "The use of referential context in children's online interpretation of adjectives" (PDF). Archived from the original (PDF) on 10 June 2015. Retrieved 4 November 2013.
  25. Holly S. S. L. Joseph; Simon P. Liversedge (2013). "Children's and Adults' On-Line Processing of Syntactically Ambiguous Sentences during Reading". PLOS ONE. 8 (1): e54141. Bibcode:2013PLoSO...854141J. doi:10.1371/journal.pone.0054141. PMC 3547875. PMID 23349807.
  26. Maryellen C. MacDonald; Marcel A. Just (1992). "Working memory constraints on the processing of syntactic ambiguity". Cognitive Psychology. 24 (1): 56–98. doi:10.1016/0010-0285(92)90003-K. PMID 1537232. S2CID 23695158.
  27. Ferreira, Fernanda; Clifton, Charles (1986-06-01). "The independence of syntactic processing". Journal of Memory and Language. 25 (3): 348–368. doi:10.1016/0749-596X(86)90006-9. ISSN   0749-596X.