Merge is one of the basic operations in the Minimalist Program, a leading approach to generative syntax, whereby two syntactic objects are combined to form a new syntactic unit (a set). Merge also has the property of recursion, in that it may apply to its own output: the objects combined by Merge are either lexical items or sets that were themselves formed by Merge. This recursive property of Merge has been claimed to be a fundamental characteristic that distinguishes language from other cognitive faculties. As Noam Chomsky (1999) puts it, Merge is "an indispensable operation of a recursive system ... which takes two syntactic objects A and B and forms the new object G={A,B}" (p. 2). [1]
Within the Minimalist Program, syntax is derivational, and Merge is the structure-building operation. Merge is assumed to have certain formal properties that constrain syntactic structure, and it is implemented with specific mechanisms. In terms of a merge-based theory of language acquisition, complements and specifiers are simply notations for first-merge (read as "complement-of" [head-complement]) and later second-merge (read as "specifier-of" [specifier-head]), with Merge always applying to a head. First-merge establishes only a set {a, b}, not an ordered pair. In Chomsky's original 1995 formulation, Merge was defined as inherently asymmetric; Moro (2000) first proposed that Merge can generate symmetrical structures, provided that they are rescued by movement, which restores the asymmetry. [2] For example, an {N, N}-compound such as 'boat-house' would allow the ambiguous readings of either 'a kind of house' or 'a kind of boat'. It is only with second-merge that order is derived out of a set {a, {a, b}}, which yields the recursive properties of syntax. For example, 'house-boat' {boat, {house, boat}} now reads unambiguously only as 'a kind of boat'. It is this property of recursion that allows projection and labeling of a phrase to take place; [2] in this case, the noun 'boat' is the head of the compound, and 'house' acts as a kind of specifier/modifier.

External merge (first-merge) establishes substantive 'base structure' inherent to the VP, yielding theta/argument structure, and may go beyond the lexical-category VP to involve the functional-category light verb vP. Internal merge (second-merge) establishes more formal aspects related to edge properties of scope and discourse-related material pegged to CP. In a phase-based theory, this twin vP/CP distinction follows the "duality of semantics" discussed within the Minimalist Program, and is further developed into a dual distinction regarding a probe-goal relation. [3] As a consequence, at the "external/first-merge-only" stage, young children would show an inability to interpret readings from a given ordered pair, since they would only have access to the mental parsing of a non-recursive set. (See Roeper for a full discussion of recursion in child language acquisition.) [4] In addition to word-order violations, other more ubiquitous results of a first-merge stage would be that children's initial utterances lack the recursive properties of inflectional morphology, yielding a strict non-inflectional stage 1, consistent with an incremental structure-building model of child language. [5]
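This set-formation behaviour can be sketched in a few lines of Python. The sketch is a toy illustration of the definitions above (lexical items as strings, syntactic objects as frozensets), not part of the formal apparatus of the theory:

```python
# Toy model of Merge as recursive set formation (illustrative only;
# lexical items are plain strings, sets are Python frozensets).

def merge(a, b):
    """First-merge: combine two syntactic objects into the set {a, b}."""
    return frozenset({a, b})

def project(head, merged_set):
    """Second-merge/labeling: the head re-merges with the set it heads,
    yielding {head, {a, b}} and thereby introducing asymmetry."""
    return frozenset({head, merged_set})

# First-merge of two nouns: {house, boat} is symmetric and ambiguous.
compound = merge("house", "boat")
assert compound == merge("boat", "house")     # unordered: no head yet

# Second-merge: 'boat' projects, so 'house-boat' is a kind of boat.
house_boat = project("boat", compound)        # {boat, {house, boat}}
```

Because first-merge yields an unordered set, the two merge orders are identical; only second-merge, where the head re-merges and projects, introduces the asymmetry that fixes 'house-boat' as a kind of boat.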
Merge takes two objects α and β and combines them, creating a binary structure.
In some variants of the Minimalist Program, Merge is triggered by feature checking: e.g., the verb eat selects the noun cheesecake because the verb has an uninterpretable N-feature [uN] ("u" stands for "uninterpretable"), which must be checked (or deleted) to satisfy Full Interpretation. [6] Saying that this verb has a nominal uninterpretable feature rules out ungrammatical constructions such as *eat beautiful (in which the verb selects an adjective). Schematically: eat [V, uN] merges with cheesecake [N], and the [uN] feature is thereby checked and deleted.
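A minimal sketch of this checking mechanism, with illustrative class and feature names (nothing below is standard notation), might look as follows:

```python
# Toy sketch of Merge triggered by feature checking. The class and
# feature names are illustrative assumptions, not standard notation.

class LexicalItem:
    def __init__(self, form, category, selects=None):
        self.form = form          # phonological form
        self.category = category  # interpretable category feature, e.g. "V"
        self.selects = selects    # uninterpretable selectional feature [uF]

def merge_checked(selector, selectee):
    """Merge succeeds only if the selector's [uF] is checked by a
    matching category feature on the selectee."""
    if selector.selects != selectee.category:
        raise ValueError(f"*{selector.form} {selectee.form}: "
                         f"[u{selector.selects}] cannot be checked")
    return frozenset({selector, selectee})    # [uN] checked and deleted

eat = LexicalItem("eat", "V", selects="N")
cheesecake = LexicalItem("cheesecake", "N")
beautiful = LexicalItem("beautiful", "A")

merge_checked(eat, cheesecake)    # converges: [uN] checked by [N]
# merge_checked(eat, beautiful)   # raises: *eat beautiful
```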
There are three different accounts of how strong features force movement (contrasted schematically in the sketch after the list below): [7] [8]
1. Phonetic Form (PF) crash theory (Chomsky 1993) is conceptually motivated. The argument goes as follows: under the assumption that Logical Form (LF) is invariant, it must be the case that any parametric differences between languages reduce to morphological properties that are reflected at PF (Chomsky 1993:192). Two possible implementations of the PF crash theory are discussed by Chomsky:
PF crash theory: A strong feature that is not checked in overt syntax causes a derivation to crash at PF.
2. Logical Form (LF) crash theory (Chomsky 1994) is empirically motivated by VP ellipsis.
LF crash theory: A strong feature that is not checked (and eliminated) in overt syntax causes a derivation to crash at LF.
3. Immediate elimination theory (Chomsky 1995)
Virus theory: A strong feature must be eliminated (almost) immediately upon its introduction into the phrase marker; otherwise, the derivation cancels.
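On one reading, the three accounts differ chiefly in the timing of the penalty for an unchecked strong feature. The hypothetical sketch below encodes that difference as three checkpoints; the function and value names are invented for illustration:

```python
# Hypothetical sketch contrasting the three accounts: they differ in
# *when* an unchecked strong feature becomes fatal. All names here
# are illustrative, not drawn from the cited works.

def evaluate(strong_feature_checked, stage, theory):
    """stage: 'overt-syntax', 'PF' or 'LF'; theory: which account applies."""
    if strong_feature_checked:
        return "converges"
    if theory == "virus":                        # fatal immediately
        return "derivation cancelled"
    if theory == "pf-crash" and stage == "PF":   # fatal only at PF
        return "crash at PF"
    if theory == "lf-crash" and stage == "LF":   # fatal only at LF
        return "crash at LF"
    return "derivation continues (so far)"
```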
Initially, the interaction of Last Resort (LR) and the Uniformity Condition (UC) served as the indicators of the structures provided by Bare Phrase Structure, which contain labels and are constructed by Move, together with the impact of the Structure Preservation Hypothesis. [9]
When we consider the features of a word that provide the label when the word projects, we assume that the categorial feature of the word is always among the features that become the label of the newly created syntactic object. [11] In the example below, Cecchetto (2015) demonstrates how projection selects a head as the label.

In Cecchetto's example, the verb "read" unambiguously labels the structure: because "read" is a word, it is by definition a probe, and it selects "the book". The bigger constituent generated by merging the word with a syntactic object receives the label of the word itself, which allows us to label the tree as demonstrated.
In this tree, the verb "read" is the head selecting the DP "the book", which makes the constituent a VP.
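Labeling by projection can be given a toy rendering: when a word, acting as probe, merges with a phrase, the word's categorial feature projects as the label of the new constituent. The following sketch assumes simple tuple-encoded trees and conventional V/DP/VP labels:

```python
# Toy sketch of labeling: the selecting word projects its categorial
# feature as the label of the new constituent (names are illustrative).

def merge_and_label(word_category, word, phrase):
    """A word, being a probe by definition, merges with a phrase;
    the resulting constituent receives the word's category as label."""
    return (word_category + "P", (word, phrase))

the_book = ("DP", ("the", "book"))
vp = merge_and_label("V", "read", the_book)
print(vp)   # ('VP', ('read', ('DP', ('the', 'book'))))
```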
Merge operates blindly, projecting labels in all possible combinations; the subcategorization features of the head then act as a filter, admitting only labelled projections that are consistent with the head's selectional properties. All other alternatives are eliminated. Merge does nothing more than combine two syntactic objects (SOs) into a unit; it does not affect the properties of the combining elements in any way. This is called the No Tampering Condition (NTC). Therefore, if α (as a syntactic object) has some property before combining with β (likewise a syntactic object), it still has this property after it has combined with β. Because the inputs to Merge remain intact, they stay accessible to further merging, which enables structures with movement dependencies (such as wh-movement). All grammatical dependencies are established under Merge: if α and β are grammatically linked, α and β must have merged. [12]
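The NTC can be illustrated with immutable sets: merging α with β yields a new object while α and β themselves are untouched, so any property they had before Merge they retain afterwards. A minimal sketch under that reading:

```python
# Sketch of the No Tampering Condition: Merge builds a new object
# without mutating its inputs (frozensets are immutable).

alpha = frozenset({"eat"})
beta = frozenset({"cheesecake"})

gamma = frozenset({alpha, beta})    # Merge(alpha, beta) = {alpha, beta}

# alpha and beta are untouched and remain accessible inside gamma,
# which is what keeps them available for later (internal) Merge:
assert alpha in gamma and beta in gamma
```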
A major development of the Minimalist Program is Bare Phrase Structure (BPS), a theory of phrase structure (structure-building operations) developed by Noam Chomsky in 1994. [13] BPS is a representation of the structure of phrases in which syntactic units are not explicitly assigned to categories. [14] The introduction of BPS moves generative grammar towards dependency grammar (discussed below), which operates with significantly less structure than most phrase structure grammars. [15] The fundamental operation of BPS is Merge. Bare Phrase Structure attempts to: (i) eliminate unnecessary elements; (ii) generate simpler trees; (iii) account for variation across languages. [16]
Bare Phrase Structure defines projection levels according to the following features: [9]
| Projection level | Defining property |
| --- | --- |
| Maximal projection | A syntactic object that does not project any further. |
| Minimal projection | A lexical item, i.e. an element that is not itself a projection of anything. |
| Intermediate projection | A syntactic object that projects, but is not the maximal projection of its head. |
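Because these levels are defined relationally rather than stipulated, the status of a node can be computed from the tree itself. The toy classifier below (an illustrative encoding, with nodes as (head, children) tuples) also captures the BPS point that a lexical item which projects no further is simultaneously minimal and maximal:

```python
# Toy classifier computing BPS projection levels relationally from the
# tree, instead of stipulating X / X' / XP. A node is (head, children);
# a bare lexical item is just a string. Illustrative only.

def levels(node, parent_head=None):
    head = node if isinstance(node, str) else node[0]
    result = set()
    if isinstance(node, str):
        result.add("minimal")          # a lexical item, not a projection
    if parent_head != head:
        result.add("maximal")          # does not project any further
    if not result:
        result.add("intermediate")     # projects, but is not maximal
    return result

# "read" heads the VP over the DP "the book":
vp = ("read", ["read", ("the", ["the", "book"])])
print(levels(vp))                           # {'maximal'}
print(levels("book", parent_head="the"))    # {'minimal', 'maximal'}
```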
The minimalist program brings into focus four fundamental properties that govern the structure of human language: [17] [18]
Context-sensitive phrase-structure (PS) rules have the form A B C → A D C, where B is a single symbol and A, C, and D are strings of symbols (D is non-null; A and C may be null). A and C are non-null when the environment in which B is to be rewritten as D must be specified.

Context-free phrase-structure (PS) rules have the form B → D, where B is a single non-terminal symbol and D is a non-null string of symbols, into which lexical items can be inserted with their subcategorization features.
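Context-free rewriting is simple enough to mimic directly. The following sketch uses a small hypothetical grammar and treats any symbol without a rule as a terminal category into which a lexical item may be inserted:

```python
# Minimal sketch of top-down context-free rewriting with a toy grammar.
# The grammar itself is a hypothetical illustration.

rules = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"]],
}

def expand(symbol):
    """Rewrite a symbol until no rule applies (anything without a rule
    counts as a terminal category for lexical insertion)."""
    if symbol not in rules:
        return [symbol]
    expansion = rules[symbol][0]        # take the first alternative
    return [t for part in expansion for t in expand(part)]

print(expand("S"))   # ['D', 'N', 'V', 'D', 'N']
```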
Since the publication of Bare Phrase Structure in 1994, [13] other linguists have continued to build on the theory. In 2002, Chris Collins continued research on Chomsky's proposal to eliminate labels, supporting Chomsky's suggestion of a simpler theory of phrase structure. [19] Collins proposed that economy conditions, such as Minimality, govern derivations and lead to simpler representations.
In more recent work by John Lowe and Joseph Lovestrand, published in 2020, minimal phrase structure is formulated as an extension of bare phrase structure and X-bar theory; it does not, however, adopt all of the assumptions associated with the Minimalist Program (see above). Lowe and Lovestrand argue that any successful phrase structure theory should include the following seven features: [16]
Although Bare Phrase Structure includes many of these features, it does not include all of them; other theories have therefore attempted to incorporate all of these features in order to present a successful phrase structure theory.
Chomsky (2001) distinguishes between external and internal Merge: if A and B are separate objects, external Merge applies; if either of them is part of the other, internal Merge applies. [20]
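Since internal Merge applies exactly when one input is already a subterm of the other, the distinction reduces to a containment check. A sketch over nested frozensets (an illustrative encoding of syntactic objects):

```python
# Sketch: the external/internal distinction as a containment check
# over nested frozensets (illustrative representation of SOs).

def contains(whole, part):
    """True if `part` is a subterm of `whole`."""
    if whole == part:
        return True
    if isinstance(whole, frozenset):
        return any(contains(member, part) for member in whole)
    return False

def merge_type(a, b):
    return "internal Merge" if contains(a, b) or contains(b, a) else "external Merge"

dp = frozenset({"the", "book"})
vp = frozenset({"read", dp})
print(merge_type("will", vp))   # external Merge: separate objects
print(merge_type(vp, dp))       # internal Merge: dp is already part of vp
```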
As it is commonly understood, standard Merge adopts three key assumptions about the nature of syntactic structure and the faculty of language: structure building proceeds bottom-up, branching is strictly binary, and structure is constituency-based. Each of these assumptions is discussed below.
While these three assumptions are taken for granted for the most part by those working within the broad scope of the Minimalist Program, other theories of syntax reject one or more of them.
Merge is commonly seen as combining smaller constituents into greater constituents until the greatest constituent, the sentence, is reached. This bottom-up view of structure generation is rejected by representational (non-derivational) theories (e.g. Generalized Phrase Structure Grammar, Head-Driven Phrase Structure Grammar, Lexical Functional Grammar, most dependency grammars, etc.), and it is contrary to early work in Transformational Grammar: the phrase structure rules of context-free grammar, for instance, generated sentence structure top-down.
The Minimalist view that Merge is strictly binary is justified with the argument that an n-ary Merge with n > 2 would inevitably lead to both undergeneration and overgeneration, and as such Merge must be strictly binary. [21] More formally, the forms of undergeneration given in Marcolli et al. (2023) are such that for any n-ary Merge with n > 2, only strings of length k(n − 1) + 1, for some non-negative integer k, can be generated (so two-word sentences like "it rains" cannot be); furthermore, there are always strings that are structurally ambiguous when parsed with binary Merge, an ambiguity that an n-ary Merge with n > 2 cannot account for.
Further, an n-ary Merge with n > 2 is also said to necessarily lead to overgeneration. If we take a binary tree and an n-ary tree with identical sets of leaves, the binary tree will have a smaller number of accessible pairs of terms than the total number of accessible n-tuples of terms in the n-ary tree. These extra accessible tuples are responsible for the generation of ungrammatical sentences like "peanuts monkeys children will throw" (as opposed to "children will throw monkeys peanuts") under a ternary Merge. [22] Despite this, there have also been empirical arguments against strictly binary Merge, such as those coming from constituency tests, [23] and so some theories of grammar, such as Head-Driven Phrase Structure Grammar, retain n-ary branching in the syntax.
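The undergeneration half of this argument can be checked by brute force: a tree whose internal nodes each have exactly n daughters has k(n − 1) + 1 leaves, so strictly ternary Merge derives only odd string lengths. The following toy enumeration (not drawn from the cited paper) makes this concrete:

```python
# Brute-force check of the undergeneration claim: with a strictly n-ary
# Merge, derivable string lengths are exactly k*(n-1)+1.

from itertools import combinations_with_replacement

def reachable_lengths(n, max_len):
    """String lengths derivable by n-ary Merge over single words."""
    lengths = {1}                                    # a bare lexical item
    while True:
        sums = {sum(c) for c in
                combinations_with_replacement(sorted(lengths), n)}
        new = {s for s in sums if s <= max_len} - lengths
        if not new:
            return lengths
        lengths |= new

print(sorted(reachable_lengths(2, 8)))   # [1, 2, 3, 4, 5, 6, 7, 8]
print(sorted(reachable_lengths(3, 8)))   # [1, 3, 5, 7] -- no "it rains"
```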
Merge merges two constituents in such a manner that these constituents become sister constituents and are daughters of the newly created mother constituent. This understanding of how structure is generated is constituency-based (as opposed to dependency-based). Dependency grammars (e.g. Meaning-Text Theory, Functional Generative Description, Word grammar) disagree with this aspect of Merge, since they take syntactic structure to be dependency-based. [24]
In other approaches to generative syntax, such as Head-driven phrase structure grammar, Lexical functional grammar and other types of unification grammar, the analogue to Merge is the unification operation of graph theory. In these theories, operations over attribute-value matrices (feature structures) are used to account for many of the same facts. Though Merge is usually assumed to be unique to language, the linguists Jonah Katz and David Pesetsky have argued that the harmonic structure of tonal music is also a result of the operation Merge. [25]
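Unification over flat attribute-value matrices can be sketched compactly: two feature structures unify if their shared attributes carry no conflicting values, and the result pools the information of both. The toy version below ignores the nesting and reentrancy that full unification grammars require:

```python
# Toy unification of flat attribute-value matrices (no reentrancy or
# nesting); feature names are illustrative.

def unify(fs1, fs2):
    """Return the merged feature structure, or None if values clash."""
    result = dict(fs1)
    for attr, value in fs2.items():
        if attr in result and result[attr] != value:
            return None                 # conflict: unification fails
        result[attr] = value
    return result

subject = {"CAT": "NP", "NUM": "sg"}
verb_requires = {"NUM": "sg", "PER": "3"}
print(unify(subject, verb_requires))        # {'CAT': 'NP', 'NUM': 'sg', 'PER': '3'}
print(unify({"NUM": "sg"}, {"NUM": "pl"}))  # None: agreement clash
```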
This notion of 'merge' may in fact be related to Fauconnier's 'blending' notion in cognitive linguistics.
Phrase structure grammar (PSG) represents immediate constituency relations (i.e. how words group together) as well as linear precedence relations (i.e. how words are ordered). In a PSG, a constituent contains at least one member, but has no upper bound. In contrast, with Merge theory, a constituent contains at most two members. Specifically, in Merge theory, each syntactic object is a constituent.
X-bar theory is a template that claims that all lexical items project three levels of structure: X, X', and XP. Consequently, there is a three-way distinction between Head, Complement, and Specifier.
While the first application of Merge is equivalent to the Head-Complement relation, the second application of Merge is equivalent to the Specifier-Head relation. However, the two theories differ in the claims they make about the nature of the Specifier-Head-Complement (S-H-C) structure. In X-bar theory, S-H-C is a primitive; Kayne's antisymmetry theory is one example of this view. In a Merge theory, S-H-C is derivative.
In linguistics, syntax is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure (constituency), agreement, the nature of crosslinguistic variation, and the relationship between form and meaning (semantics). Diverse approaches, such as generative grammar and functional grammar, offer unique perspectives on syntax, reflecting its complexity and centrality to understanding human language.
A syntactic category is a syntactic unit that theories of syntax assume. Word classes, largely corresponding to traditional parts of speech, are syntactic categories. In phrase structure grammars, the phrasal categories are also syntactic categories. Dependency grammars, however, do not acknowledge phrasal categories.
In grammar, a phrase (called an expression in some contexts) is a group of words, or a single word, acting as a grammatical unit. For instance, the English expression "the very happy squirrel" is a noun phrase which contains the adjective phrase "very happy". Phrases can consist of a single word or a complete sentence. In theoretical linguistics, phrases are often analyzed as units of syntactic structure such as constituents. There is a difference between the common use of the term phrase and its technical use in linguistics. In common usage, a phrase is usually a group of words with some special idiomatic meaning or other significance, such as "all rights reserved", "economical with the truth", "kick the bucket", and the like. It may be a euphemism, a saying or proverb, a fixed expression, a figure of speech, etc. In linguistics, these are known as phrasemes.
Phrase structure rules are a type of rewrite rule used to describe a given language's syntax and are closely associated with the early stages of transformational grammar, proposed by Noam Chomsky in 1957. They are used to break down a natural language sentence into its constituent parts, also known as syntactic categories, including both lexical categories and phrasal categories. A grammar that uses phrase structure rules is a type of phrase structure grammar. Phrase structure rules as they are commonly employed operate according to the constituency relation, and a grammar that employs phrase structure rules is therefore a constituency grammar; as such, it stands in contrast to dependency grammars, which are based on the dependency relation.
In linguistics, transformational grammar (TG) or transformational-generative grammar (TGG) is part of the theory of generative grammar, especially of natural languages. It considers grammar to be a system of rules that generate exactly those combinations of words that form grammatical sentences in a given language and involves the use of defined operations to produce new sentences from existing ones.
A noun phrase – or NP or nominal (phrase) – is a phrase that usually has a noun or pronoun as its head, and has the same grammatical functions as a noun. Noun phrases are very common cross-linguistically, and they may be the most frequently occurring phrase type.
A parse tree or parsing tree is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term parse tree itself is used primarily in computational linguistics; in theoretical syntax, the term syntax tree is more common.
Lexical semantics, as a subfield of linguistic semantics, is the study of word meanings. It includes the study of how words structure their meaning, how they act in grammar and compositionality, and the relationships between the distinct senses and uses of a word.
In linguistics, X-bar theory is a model of phrase-structure grammar and a theory of syntactic category formation that was first proposed by Noam Chomsky in 1970, reformulating the ideas of Zellig Harris (1951), and further developed by Ray Jackendoff, along the lines of the theory of generative grammar put forth in the 1950s by Chomsky. It attempts to capture the structure of phrasal categories with a single uniform structure called the X-bar schema, based on the assumption that any phrase in natural language is an XP that is headed by a given syntactic category X. It played a significant role in resolving problems that phrase structure rules had, a representative example being the proliferation of grammatical rules, which runs against the thesis of generative grammar.
In linguistics, the minimalist program is a major line of inquiry that has been developing inside generative grammar since the early 1990s, starting with a 1993 paper by Noam Chomsky.
In linguistics, the head or nucleus of a phrase is the word that determines the syntactic category of that phrase. For example, the head of the noun phrase boiling hot water is the noun water. Analogously, the head of a compound is the stem that determines the semantic category of that compound. For example, the head of the compound noun handbag is bag, since a handbag is a bag, not a hand. The other elements of the phrase or compound modify the head, and are therefore the head's dependents. Headed phrases and compounds are called endocentric, whereas exocentric ("headless") phrases and compounds lack a clear head. Heads are crucial to establishing the direction of branching. Head-initial phrases are right-branching, head-final phrases are left-branching, and head-medial phrases combine left- and right-branching.
In linguistics, branching refers to the shape of the parse trees that represent the structure of sentences. Assuming that the language is being written or transcribed from left to right, parse trees that grow down and to the right are right-branching, and parse trees that grow down and to the left are left-branching. The direction of branching reflects the position of heads in phrases, and in this regard, right-branching structures are head-initial, whereas left-branching structures are head-final. English has both right-branching (head-initial) and left-branching (head-final) structures, although it is more right-branching than left-branching. Some languages such as Japanese and Turkish are almost fully left-branching (head-final). Some languages are mostly right-branching (head-initial).
The term phrase structure grammar was originally introduced by Noam Chomsky as the term for grammar studied previously by Emil Post and Axel Thue. Some authors, however, reserve the term for more restricted grammars in the Chomsky hierarchy: context-sensitive grammars or context-free grammars. In a broader sense, phrase structure grammars are also known as constituency grammars. The defining character of phrase structure grammars is thus their adherence to the constituency relation, as opposed to the dependency relation of dependency grammars.
Principles and parameters is a framework within generative linguistics in which the syntax of a natural language is described in accordance with general principles and specific parameters that for particular languages are either turned on or off. For example, whether a language is head-initial or head-final is regarded as a parameter that is either on or off for that language. Principles and parameters was largely formulated by the linguists Noam Chomsky and Howard Lasnik. Many linguists have worked within this framework, and for a period of time it was considered the dominant form of mainstream generative linguistics.
In generative grammar, non-configurational languages are languages characterized by a flat phrase structure, which allows syntactically discontinuous expressions, and a relatively free word order.
In theoretical linguistics, a distinction is made between endocentric and exocentric constructions. A grammatical construction is said to be endocentric if it fulfils the same linguistic function as one of its parts, and exocentric if it does not. The distinction reaches back at least to Bloomfield's work of the 1930s, who based it on terms by Pāṇini and Patañjali in Sanskrit grammar. Such a distinction is possible only in phrase structure grammars, since in dependency grammars all constructions are necessarily endocentric.
In linguistics, the projection principle is a stipulation proposed by Noam Chomsky as part of the phrase structure component of generative-transformational grammar. The projection principle is used in the derivation of phrases under the auspices of the principles and parameters theory.
In linguistics, Immediate Constituent Analysis (ICA) is a syntactic theory which focuses on the hierarchical structure of sentences by isolating and identifying the constituents. While the idea of breaking down sentences into smaller components can be traced back to early psychological and linguistic theories, ICA as a formal method was developed in the early 20th century. It was influenced by Wilhelm Wundt's psychological theories of sentence structure but was later refined and formalized within the framework of structural linguistics by Leonard Bloomfield. The method gained traction in the distributionalist tradition through the work of Zellig Harris and Charles F. Hockett, who expanded and applied it to sentence analysis. Additionally, ICA was further explored within the context of glossematics by Knud Togeby. These contributions helped ICA become a central tool in syntactic analysis, focusing on the hierarchical relationships between sentence constituents.
In linguistics, a discontinuity occurs when a given word or phrase is separated from another word or phrase that it modifies in such a manner that a direct connection cannot be established between the two without incurring crossing lines in the tree structure. The terminology that is employed to denote discontinuities varies depending on the theory of syntax at hand. The terms discontinuous constituent, displacement, long distance dependency, unbounded dependency, and projectivity violation are largely synonymous with the term discontinuity. There are various types of discontinuities, the most prominent and widely studied of these being topicalization, wh-fronting, scrambling, and extraposition.
In formal syntax, a node is a point in a tree diagram or syntactic tree that can be assigned a syntactic category label.