In syntactic analysis, a constituent is a word or a group of words that function as a single unit within a hierarchical structure. The constituent structure of sentences is identified using tests for constituents. [1] These tests apply to a portion of a sentence, and the results provide evidence about the constituent structure of the sentence. Many constituents are phrases. A phrase is a sequence of one or more words (in some theories two or more) built around a head lexical item and working as a unit within a sentence. A word sequence is shown to be a phrase/constituent if it exhibits one or more of the behaviors discussed below. The analysis of constituent structure is associated mainly with phrase structure grammars, although dependency grammars also allow sentence structure to be broken down into constituent parts.
Tests for constituents are diagnostics used to identify sentence structure. Numerous such tests are commonly used to identify the constituents of English sentences; 15 of the most commonly used are listed next: 1) coordination (conjunction), 2) pro-form substitution (replacement), 3) topicalization (fronting), 4) do-so-substitution, 5) one-substitution, 6) answer ellipsis (question test), 7) clefting, 8) VP-ellipsis, 9) pseudoclefting, 10) passivization, 11) omission (deletion), 12) intrusion, 13) wh-fronting, 14) general substitution, 15) right node raising (RNR).
The order in which these 15 tests are listed here corresponds to the frequency of use, coordination being the most frequently used of the 15 tests and RNR being the least frequently used. A general word of caution is warranted when employing these tests, since they often deliver contradictory results. The tests are merely rough-and-ready tools that grammarians employ to reveal clues about syntactic structure. Some syntacticians even arrange the tests on a scale of reliability, with less-reliable tests treated as useful to confirm constituency though not sufficient on their own. Failing to pass a single test does not mean that the test string is not a constituent, and conversely, passing a single test does not necessarily mean the test string is a constituent. It is best to apply as many tests as possible to a given string in order to prove or to rule out its status as a constituent.
The 15 tests are introduced, discussed, and illustrated below, relying mainly on the same single test sentence: [2]
By restricting the introduction and discussion of the tests for constituents below mainly to this one sentence, it becomes possible to compare the results of the tests. To aid the discussion and illustrations of the constituent structure of this sentence, the following two sentence diagrams are employed (D = determiner, N = noun, NP = noun phrase, Pa = particle, S = sentence, V = Verb, VP = verb phrase):
These diagrams show two potential analyses of the constituent structure of the sentence. A given node in a tree diagram is understood as marking a constituent, that is, a constituent is understood as corresponding to a given node and everything that that node exhaustively dominates. Hence the first tree, which shows the constituent structure according to dependency grammar, marks the following words and word combinations as constituents: Drunks, off, the, the customers, and put off the customers. [3] The second tree, which shows the constituent structure according to phrase structure grammar, marks the following words and word combinations as constituents: Drunks, could, put, off, the, customers, the customers, put off the customers, and could put off the customers. The analyses in these two tree diagrams provide orientation for the discussion of tests for constituents that now follows.
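The notion that a constituent corresponds to a node plus the words that node exhaustively dominates can be made concrete computationally. The following minimal Python sketch is illustrative only (the nested-tuple encoding and the names PS_TREE, leaves, and constituents are assumptions, not drawn from the sources cited here); it encodes the phrase structure tree just described and lists, for every node, the word string that the node dominates:

# Phrase structure tree for "Drunks could put off the customers", encoded as
# nested tuples of the form (label, child, child, ...); leaves are plain strings.
PS_TREE = ("S",
           ("NP", ("N", "Drunks")),
           ("VP", ("V", "could"),
                  ("VP", ("V", "put"),
                         ("Pa", "off"),
                         ("NP", ("D", "the"), ("N", "customers")))))

def leaves(node):
    # The words a node exhaustively dominates, in left-to-right order.
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in leaves(child)]

def constituents(node):
    # Each node marks the string of words it dominates as a constituent.
    if isinstance(node, str):
        return []
    return [" ".join(leaves(node))] + [s for child in node[1:] for s in constituents(child)]

for string in dict.fromkeys(constituents(PS_TREE)):  # remove duplicates, keep order
    print(string)

Running the sketch prints exactly the strings listed above as constituents of the phrase structure tree, plus the whole sentence, which is trivially a constituent as well.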
The coordination test assumes that only constituents can be coordinated, i.e., joined by means of a coordinator such as and, or, or but: [4] The next examples demonstrate that coordination identifies individual words as constituents:
The square brackets mark the conjuncts of the coordinate structures. Based on these data, one might assume that drunks, could, put off, and customers are constituents in the test sentence because these strings can be coordinated with bums, would, drive away, and neighbors, respectively. Coordination also identifies multi-word strings as constituents:
These data suggest that the customers, put off the customers, and could put off the customers are constituents in the test sentence.
Examples such as (a-g) are not controversial insofar as many theories of sentence structure readily view the strings tested in sentences (a-g) as constituents. However, additional data are problematic, since they suggest that certain strings are also constituents even though most theories of syntax do not acknowledge them as such, e.g.
These data suggest that could put off, put off these, and Drunks could are constituents in the test sentence. Most theories of syntax reject the notion that these strings are constituents, though. Data such as (h-j) are sometimes addressed in terms of the right node raising (RNR) mechanism.
The problem for the coordination test represented by examples (h-j) is compounded when one looks beyond the test sentence, for one quickly finds that coordination suggests that a wide range of strings are constituents that most theories of syntax do not acknowledge as such, e.g.
The strings from home on Tuesday and from home on Tuesday on his bicycle are not viewed as constituents in most theories of syntax, and concerning sentence (m), it is very difficult even to discern how one should delimit the conjuncts of the coordinate structure. The coordinate structures in (k-l) are sometimes characterized in terms of non-constituent conjuncts (NCC), and the instance of coordination in sentence (m) is sometimes discussed in terms of stripping and/or gapping.
Due to the difficulties illustrated by examples (h-m), many grammarians view coordination skeptically regarding its value as a test for constituents. The discussion of the other tests for constituents below reveals that this skepticism is warranted, since coordination identifies many more strings as constituents than the other tests do. [5]
Proform substitution, or replacement, involves replacing the test string with the appropriate proform (e.g. pronoun, pro-verb, pro-adjective, etc.). Substitution normally involves using a definite proform like it, he, there, here, etc. in place of a phrase or a clause. If such a change yields a grammatical sentence where the general structure has not been altered, then the test string is likely a constituent: [6]
These examples suggest that Drunks, the customers, and put off the customers in the test sentence are constituents. An important aspect of the proform test is the fact that it fails to identify most subphrasal strings as constituents, e.g.
These examples suggest that the individual words could, put, off, and customers should not be viewed as constituents. This suggestion is of course controversial, since most theories of syntax assume that individual words are constituents by default. The conclusion one can reach based on such examples, however, is that proform substitution using a definite proform identifies phrasal constituents only; it fails to identify sub-phrasal strings as constituents.
Topicalization involves moving the test string to the front of the sentence. It is a simple movement operation. [7] Many instances of topicalization seem only marginally acceptable when taken out of context. Hence to suggest a context, an instance of topicalization can be preceded by ...and and a modal adverb can be added as well (e.g. certainly):
These examples suggest that the customers and put off the customers are constituents in the test sentence. Topicalization is like many of the other tests in that it identifies phrasal constituents only. When the test sequence is a sub-phrasal string, topicalization fails:
These examples demonstrate that customers, could, put, off, and the fail the topicalization test. Since these strings are all sub-phrasal, one can conclude that topicalization is unable to identify sub-phrasal strings as constituents.
Do-so-substitution is a test that substitutes a form of do so (does so, did so, done so, doing so) into the test sentence for the target string. This test is widely used to probe the structure of strings containing verbs (because do is a verb). [8] The test is limited in its applicability, though, precisely because it is only applicable to strings containing verbs:
Example (a) suggests that put off the customers is a constituent in the test sentence, whereas example (b) fails to suggest that could put off the customers is a constituent, for do so cannot include the meaning of the modal verb could. To illustrate more completely how the do-so test is employed, another test sentence is now used, one that contains two post-verbal adjunct phrases:
These data suggest that met them, met them in the pub, and met them in the pub because we had time are constituents in the test sentence. Taken together, such examples seem to motivate a structure for the test sentence that has a left-branching verb phrase, because only a left-branching verb phrase can view each of the indicated strings as a constituent. There is a problem with this sort of reasoning, however, as the next example illustrates:
In this case, did so appears to stand in for the discontinuous word combination consisting of met them and because we had time. Such a discontinuous combination of words cannot be construed as a constituent. That such an interpretation of did so is indeed possible is seen in a fuller sentence such as You met them in the cafe because you had time, and we did so in the pub. In this case, the preferred reading of did so is that it indeed simultaneously stands in for both met them and because we had time.
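The left-branching structure appealed to above can likewise be sketched computationally. The following minimal Python fragment is an illustration only (it assumes the test sentence is We met them in the pub because we had time and leaves the internal structure of the two adjuncts unanalyzed); it encodes a left-branching verb phrase and shows that such a structure treats all three strings identified by do-so-substitution as constituents:

# A left-branching verb phrase for "met them in the pub because we had time"
# (adjunct-internal structure omitted; labels are illustrative).
LEFT_BRANCHING_VP = ("VP",
                     ("VP",
                      ("VP", ("V", "met"), ("NP", "them")),
                      ("PP", "in the pub")),
                     ("AdvCl", "because we had time"))

def phrase(node):
    # The word string a node dominates.
    return node if isinstance(node, str) else " ".join(phrase(child) for child in node[1:])

def constituents(node):
    # Each node marks the string it dominates as a constituent.
    if isinstance(node, str):
        return []
    return [phrase(node)] + [s for child in node[1:] for s in constituents(child)]

print(constituents(LEFT_BRANCHING_VP)[:3])
# ['met them in the pub because we had time', 'met them in the pub', 'met them']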
The one-substitution test replaces the test string with the indefinite pronoun one or ones. [9] If the result is acceptable, then the test string is deemed a constituent. Since one is a type of pronoun, one-substitution is only of value when probing the structure of noun phrases. In this regard, the test sentence from above is expanded in order to better illustrate the manner in which one-substitution is generally employed:
These examples suggest that customers, loyal customers, customers around here, loyal customers around here, and customers around here who we rely on are constituents in the test sentence. Some have pointed to a problem associated with one-substitution in this area, however. This problem is that it is impossible to produce a single constituent structure of the noun phrase the loyal customers around here who we rely on that could simultaneously view all of the indicated strings as constituents. [10] Another problem that has been pointed out concerning one-substitution as a test for constituents is the fact that it at times suggests that non-string word combinations are constituents, [11] e.g.
The word combination consisting of both loyal customers and who we rely on is discontinuous in the test sentence, a fact that should motivate one to generally question the value of one-substitution as a test for constituents.
The answer fragment test involves forming a question that contains a single wh-word (e.g. who, what, where, etc.). If the test string can then appear alone as the answer to such a question, then it is likely a constituent in the test sentence: [12]
These examples suggest that Drunks, the customers, and put off the customers are constituents in the test sentence. The answer fragment test is like most of the other tests for constituents in that it does not identify sub-phrasal strings as constituents:
These answer fragments are all grammatically unacceptable, suggesting that could, put, off, and customers are not constituents. Note as well that the latter two questions themselves are ungrammatical. It is apparently often impossible to form the question in a way that could successfully elicit the indicated strings as answer fragments. The conclusion, then, is that the answer fragment test is like most of the other tests in that it fails to identify sub-phrasal strings as constituents.
Clefting involves placing the test string X within the structure beginning with It is/was: It was X that.... [13] The test string appears as the pivot of the cleft sentence:
These examples suggest that Drunks and the customers are constituents in the test sentence. Example (c) is of dubious acceptability, suggesting that put off the customers may not be a constituent in the test sentence. Clefting is like most of the other tests for constituents in that it fails to identify most individual words as constituents:
These examples suggest that the individual words could, put, off, the, and customers are not constituents, contrary to what most theories of syntax assume. In this respect, clefting is like many of the other tests for constituents in that it succeeds only at identifying certain phrasal strings as constituents.
The VP-ellipsis test checks to see which strings containing one or more predicative elements (usually verbs) can be elided from a sentence. Strings that can be elided are deemed constituents: [14] The symbol ∅ is used in the following examples to mark the position of ellipsis:
These examples suggest that put off is not a constituent in the test sentence, but that immediately put off the customers, put off the customers when they arrive, and immediately put off the customers when they arrive are constituents. Concerning the string put off the customers in (b), its marginal acceptability makes it difficult to draw a firm conclusion about its status.
There are various difficulties associated with this test. The first of these is that it can identify too many constituents, as in this case, where it is impossible to produce a single constituent structure that could simultaneously view each of the three acceptable examples (c-e) as having elided a constituent. Another problem is that the test can at times suggest that a discontinuous word combination is a constituent, e.g.:
In this case, it appears as though the elided material corresponds to the discontinuous word combination including help and in the office.
Pseudoclefting is similar to clefting in that it puts emphasis on a certain phrase in a sentence. There are two variants of the pseudocleft test. One variant inserts the test string X into a sentence beginning with a free relative clause: What ... is/are X; the other variant places X at the start of the sentence, followed by is/are and then the free relative clause: X is/are what/who... Only the latter of these two variants is illustrated here. [15]
These examples suggest that Drunks, the customers, and put off the customers are constituents in the test sentence. Pseudoclefting fails to identify most individual words as constituents:
The pseudoclefting test is hence like most of the other tests insofar as it identifies phrasal strings as constituents, but does not suggest that sub-phrasal strings are constituents.
Passivization involves changing an active sentence to a passive sentence, or vice versa. The object of the active sentence is changed to the subject of the corresponding passive sentence: [16]
The fact that sentence (b), the passive sentence, is acceptable suggests that Drunks and the customers are constituents in sentence (a). The passivization test used in this manner is only capable of identifying subject and object words, phrases, and clauses as constituents. It does not help identify other phrasal or sub-phrasal strings as constituents. In this respect, the value of passivization as a test for constituents is very limited.
Omission checks whether the target string can be omitted without influencing the grammaticality of the sentence. In most cases, locative and temporal adverbials, attributive modifiers, and optional complements can be safely omitted and thus qualify as constituents. [17]
This sentence suggests that the definite article the is a constituent in the test sentence. Regarding the test sentence, however, the omission test is very limited in its ability to identify constituents, since the strings that one wants to check do not appear optionally. Therefore, the test sentence is adapted to better illustrate the omission test:
The ability to omit obnoxious, immediately, and when they arrive suggests that these strings are constituents in the test sentence. Omission used in this manner is of limited applicability, since it is incapable of identifying any constituent that appears obligatorily. Hence there are many target strings that most accounts of sentence structure take to be constituents but that fail the omission test because these constituents appear obligatorily, such as subject phrases.
Intrusion probes sentence structure by having an adverb "intrude" into parts of the sentence. The idea is that the strings on either side of the adverb are constituents. [18]
Example (a) suggests that Drunks and could put off the customers are constituents. Example (b) suggests that Drunks could and put off the customers are constituents. The combination of (a) and (b) suggests in addition that could is a constituent. Sentence (c) suggests that Drunks could put and off the customers are not constituents. Example (d) suggests that Drunks could put off and the customers are not constituents. And example (e) suggests that Drunks could put off the and customers are not constituents.
Those who employ the intrusion test usually use a modal adverb like definitely. This aspect of the test is problematic, though, since the results of the test can vary based upon the choice of adverb. For instance, manner adverbs distribute differently from modal adverbs and will hence suggest a constituent structure distinct from that suggested by modal adverbs.
Wh-fronting checks to see whether the test string can be questioned by a fronted wh-word. [19] This test is similar to the answer fragment test insofar as it employs just the first half of that test, disregarding the potential answer to the question.
These examples suggest that Drunks, the customers, and put off the customers are constituents in the test sentence. Wh-fronting is like a number of the other tests in that it fails to identify many subphrasal strings as constituents:
These examples demonstrate a lack of evidence for viewing the individual words could, put, off, the, and customers as constituents.
The general substitution test replaces the test string with some other word or phrase. [20] It is similar to proform substitution, the only difference being that the replacement word or phrase is not a proform, e.g.
These examples suggest that the strings Drunks, the customers, and could are constituents in the test sentence. There is a major problem with this test, for it is easily possible to find a replacement word for strings that the other tests suggest are clearly not constituents, e.g.
These examples suggest that could put, Drunks could, and could put off the are constituents in the test sentence. This is contrary to what the other tests reveal and to what most theories of sentence structure assume. The value of general substitution as a test for constituents is therefore suspect. It is like the coordination test in that it suggests that too many strings are constituents.
Right node raising, abbreviated as RNR, is a test that isolates the test string on the right side of a coordinate structure. [21] The assumption is that only constituents can be shared by the conjuncts of a coordinate structure, e.g.
These examples suggest that could put off the customers, put off the customers, and the customers are constituents in the test sentence. There are two problems with the RNR diagnostic as a test for constituents. The first is that it is limited in its applicability, since it is only capable of identifying strings as constituents if they appear on the right side of the test sentence. The second is that it can suggest strings to be constituents that most of the other tests suggest are not constituents. To illustrate this point, a different example must be used:
These examples suggest that their bicycles (his bicycle) to us to use if need be, to us to use if need be, and to use if need be are constituents in the test sentence. Most theories of syntax do not view these strings as constituents, and more importantly, most of the other tests suggest that they are not constituents. In short, the results of these tests cannot be taken for granted, since a string may pass one test yet fail many others; judgments about the constituency of any string of words must therefore weigh the combined evidence of several tests alongside speakers' intuitions.
A word of caution is warranted concerning the tests for constituents discussed above. These tests are found in textbooks on linguistics and syntax that are written mainly with the syntax of English in mind, and the examples that are discussed are mainly from English. The tests may or may not be valid and useful when probing the constituent structure of other languages. Ideally, a battery of tests for constituents can and should be developed for each language, tailored to the idiosyncrasies of the language at hand.
Constituent structure analyses of sentences are a central concern for theories of syntax. A given theory can produce an analysis of constituent structure that is quite unlike the next. This point is evident with the two tree diagrams above of the sentence Drunks could put off the customers, where the dependency grammar analysis of constituent structure looks very much unlike the phrase structure analysis. The crucial difference across the two analyses is that the phrase structure analysis views every individual word as a constituent by default, whereas the dependency grammar analysis sees only those individual words as constituents that do not dominate other words. Phrase structure grammars therefore acknowledge many more constituents than dependency grammars.
A second example further illustrates this point (D = determiner, N = noun, NP = noun phrase, Pa = particle, S = sentence, V = Verb, V' = verb-bar, VP = verb phrase):
The dependency grammar tree shows five words and word combinations as constituents: what, these, us, these diagrams, and show us. The phrase structure tree, in contrast, shows nine words and word combinations as constituents: what, do, these, diagrams, show, us, these diagrams, show us, and do these diagrams show us. The two diagrams thus disagree concerning the status of do, diagrams, show, and do these diagrams show us, the phrase structure diagram showing them as constituents and the dependency grammar diagram showing them as non-constituents. To determine which analysis is more plausible, one turns to the tests for constituents discussed above. [22]
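The disagreement between the two analyses can be stated as a simple set difference. The following minimal Python sketch is illustrative only; the two sets are copied from the inventories just listed rather than computed from the diagrams themselves:

# Constituent inventories for "What do these diagrams show us", as listed above.
dependency_constituents = {"what", "these", "us", "these diagrams", "show us"}
phrase_structure_constituents = {"what", "do", "these", "diagrams", "show", "us",
                                 "these diagrams", "show us", "do these diagrams show us"}

# Strings the phrase structure analysis treats as constituents
# but the dependency analysis does not.
print(sorted(phrase_structure_constituents - dependency_constituents))
# ['diagrams', 'do', 'do these diagrams show us', 'show']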
Within phrase structure grammars, views about constituent structure can also vary significantly. Many modern phrase structure grammars assume that syntactic branching is always binary, that is, each greater constituent is necessarily broken down into two lesser constituents. Older phrase structure analyses, however, are more likely to allow n-ary branching, that is, each greater constituent can be broken down into one, two, or more lesser constituents. The next two trees illustrate the distinction (Aux = auxiliary verb, AuxP = auxiliary verb phrase, Aux' = Aux-bar, D = determiner, N = noun, NP = noun phrase, P = preposition, PP = prepositional phrase, Pa = particle, S = sentence, t = trace, V = Verb, V' = verb-bar, VP = verb phrase):
The details of the second diagram here are not crucial to the point at hand. The point is that all branching in the second diagram is strictly binary, whereas in the first tree diagram ternary branching is present twice, for the AuxP and for the VP. Observe in this regard that strictly binary branching analyses increase the number of (overt) constituents to the maximum possible. The word combinations have sent many things to us and many things to us are shown as constituents in the second tree diagram but not in the first. Which of these two analyses is better is again at least in part a matter of what the tests for constituents can reveal.
In linguistics, syntax is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure (constituency), agreement, the nature of crosslinguistic variation, and the relationship between form and meaning (semantics). Diverse approaches, such as generative grammar and functional grammar, offer unique perspectives on syntax, reflecting its complexity and centrality to understanding human language.
A syntactic category is a syntactic unit that theories of syntax assume. Word classes, largely corresponding to traditional parts of speech, are syntactic categories. In phrase structure grammars, the phrasal categories are also syntactic categories. Dependency grammars, however, do not acknowledge phrasal categories.
In grammar, a phrase—called expression in some contexts—is a group of words or a single word acting as a grammatical unit. For instance, the English expression "the very happy squirrel" is a noun phrase which contains the adjective phrase "very happy". Phrases can consist of a single word or a complete sentence. In theoretical linguistics, phrases are often analyzed as units of syntactic structure such as a constituent. There is a difference between the common use of the term phrase and its technical use in linguistics. In common usage, a phrase is usually a group of words with some special idiomatic meaning or other significance, such as "all rights reserved", "economical with the truth", "kick the bucket", and the like. It may be a euphemism, a saying or proverb, a fixed expression, a figure of speech, etc. In linguistics, these are known as phrasemes.
Phrase structure rules are a type of rewrite rule used to describe a given language's syntax and are closely associated with the early stages of transformational grammar, proposed by Noam Chomsky in 1957. They are used to break down a natural language sentence into its constituent parts, also known as syntactic categories, including both lexical categories and phrasal categories. A grammar that uses phrase structure rules is a type of phrase structure grammar. Phrase structure rules as they are commonly employed operate according to the constituency relation, and a grammar that employs phrase structure rules is therefore a constituency grammar; as such, it stands in contrast to dependency grammars, which are based on the dependency relation.
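As a rough illustration of how phrase structure rules work as rewrite rules, the following minimal Python sketch encodes a toy grammar for the test sentence used earlier in this article; the particular rules and the function name expand are assumptions chosen for illustration, and the toy grammar overgenerates (it also produces strings no one would say):

import random

# Toy phrase structure (rewrite) rules; each category maps to its possible expansions.
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["N"], ["D", "N"]],
    "VP": [["V", "VP"], ["V", "Pa", "NP"]],
    "N":  [["Drunks"], ["customers"]],
    "D":  [["the"]],
    "V":  [["could"], ["put"]],
    "Pa": [["off"]],
}

def expand(symbol):
    # Rewrite a symbol until only terminal words remain.
    if symbol not in RULES:              # a terminal word
        return [symbol]
    expansion = random.choice(RULES[symbol])
    return [word for part in expansion for word in expand(part)]

print(" ".join(expand("S")))             # e.g. "Drunks could put off the customers"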
An adjective phrase is a phrase whose head is an adjective. Almost any grammar or syntax textbook or dictionary of linguistics terminology defines the adjective phrase in a similar way, e.g. Kesner Bland (1996:499), Crystal (1996:9), Greenbaum (1996:288ff.), Haegeman and Guéron (1999:70f.), Brinton (2000:172f.), Jurafsky and Martin (2000:362). The adjective can initiate the phrase, conclude the phrase, or appear in a medial position. The dependents of the head adjective—i.e. the other words and phrases inside the adjective phrase—are typically adverb or prepositional phrases, but they can also be clauses. Adjectives and adjective phrases function in two basic ways, attributively or predicatively. An attributive adjective (phrase) precedes the noun of a noun phrase. A predicative adjective (phrase) follows a linking verb and serves to describe the preceding subject, e.g. The man is very happy.
A parse tree or parsing tree is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term parse tree itself is used primarily in computational linguistics; in theoretical syntax, the term syntax tree is more common.
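As a brief illustration, parse trees can be written in labeled bracket notation and manipulated with the NLTK library for Python (assuming NLTK is installed; the bracketing below repeats the phrase structure analysis used earlier in this article):

from nltk import Tree

# The phrase structure analysis of the test sentence in labeled bracket notation.
t = Tree.fromstring(
    "(S (NP (N Drunks)) (VP (V could) (VP (V put) (Pa off) (NP (D the) (N customers)))))")

print(t.leaves())   # ['Drunks', 'could', 'put', 'off', 'the', 'customers']
t.pretty_print()    # renders the parse tree as a text diagram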
In linguistics, X-bar theory is a model of phrase-structure grammar and a theory of syntactic category formation that was first proposed by Noam Chomsky in 1970 reformulating the ideas of Zellig Harris (1951), and further developed by Ray Jackendoff, along the lines of the theory of generative grammar put forth in the 1950s by Chomsky. It attempts to capture the structure of phrasal categories with a single uniform structure called the X-bar schema, basing itself on the assumption that any phrase in natural language is an XP that is headed by a given syntactic category X. It played a significant role in resolving issues that phrase structure rules had, representative of which is the proliferation of grammatical rules, which is against the thesis of generative grammar.
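A common textbook formulation of the X-bar schema can be sketched as follows; the exact formulation varies across versions of the theory, and the rule set and the function instantiate below are illustrative assumptions only:

# Schematic X-bar rules: a phrase XP contains a specifier and an intermediate X' level,
# which in turn contains (optionally recursive) adjuncts and the head X with its complement.
XBAR_SCHEMA = [
    ("XP", ["Specifier", "X'"]),
    ("X'", ["X'", "Adjunct"]),
    ("X'", ["X", "Complement"]),
]

def instantiate(schema, head):
    # Substitute a concrete category (e.g. "N" or "V") for the variable X.
    return [(lhs.replace("X", head), [sym.replace("X", head) for sym in rhs])
            for lhs, rhs in schema]

for lhs, rhs in instantiate(XBAR_SCHEMA, "N"):
    print(lhs, "->", " ".join(rhs))
# NP -> Specifier N'
# N' -> N' Adjunct
# N' -> N Complement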
In linguistics, a verb phrase (VP) is a syntactic unit composed of a verb and its arguments except the subject of an independent clause or coordinate clause. Thus, in the sentence A fat man quickly put the money into the box, the words quickly put the money into the box constitute a verb phrase; it consists of the verb put and its arguments, but not the subject a fat man. A verb phrase is similar to what is considered a predicate in traditional grammars.
Theta roles are the names of the participant roles associated with a predicate: the predicate may be a verb, an adjective, a preposition, or a noun. If an object is in motion or in a steady state as the speaker perceives it, or it is the topic of discussion, it is called a theme. The participant is usually said to be an argument of the predicate. In generative grammar, a theta role or θ-role is the formal device for representing syntactic argument structure—the number and type of noun phrases—required syntactically by a particular verb. For example, the verb put requires three arguments.
In linguistics, pied-piping is a phenomenon of syntax whereby a given focused expression brings along an encompassing phrase with it when it is moved.
Topicalization is a mechanism of syntax that establishes an expression as the sentence or clause topic by having it appear at the front of the sentence or clause. This involves a phrasal movement of determiners, prepositions, and verbs to sentence-initial position. Topicalization often results in a discontinuity and is thus one of a number of established discontinuity types, the other three being wh-fronting, scrambling, and extraposition. Topicalization is also used as a constituency test; an expression that can be topicalized is deemed a constituent. The topicalization of arguments in English is rare, whereas circumstantial adjuncts are often topicalized. Most languages allow topicalization, and in some languages, topicalization occurs much more frequently and/or in a much less marked manner than in English. Topicalization in English has also received attention in the pragmatics literature.
In linguistics, an argument is an expression that helps complete the meaning of a predicate, the latter referring in this context to a main verb and its auxiliaries. In this regard, the complement is a closely related concept. Most predicates take one, two, or three arguments. A predicate and its arguments form a predicate-argument structure. The discussion of predicates and arguments is associated most with (content) verbs and noun phrases (NPs), although other syntactic categories can also be construed as predicates and as arguments. Arguments must be distinguished from adjuncts. While a predicate needs its arguments to complete its meaning, the adjuncts that appear with a predicate are optional; they are not necessary to complete the meaning of the predicate. Most theories of syntax and semantics acknowledge arguments and adjuncts, although the terminology varies, and the distinction is generally believed to exist in all languages. Dependency grammars sometimes call arguments actants, following Lucien Tesnière (1959).
In linguistics, a small clause consists of a subject and its predicate, but lacks an overt expression of tense. Small clauses have the semantic subject-predicate characteristics of a clause, and have some, but not all, properties of a constituent. Structural analyses of small clauses vary according to whether a flat or layered analysis is pursued. The small clause is related to the phenomena of raising-to-object, exceptional case-marking, accusativus cum infinitivo, and object control.
Nanosyntax is an approach to syntax where the terminal nodes of syntactic parse trees may be reduced to units smaller than a morpheme. Each unit may stand as an irreducible element and not be required to form a further "subtree." Due to its reduction to the smallest terminal possible, the terminals are smaller than morphemes. Therefore, morphemes and words cannot be itemised as a single terminal, and instead are composed of several terminals. As a result, nanosyntax can serve as a solution to phenomena that are inadequately explained by other theories of syntax.
Syntactic movement is the means by which some theories of syntax address discontinuities. Movement was first postulated by structuralist linguists who expressed it in terms of discontinuous constituents or displacement. Some constituents appear to have been displaced from the position in which they receive important features of interpretation. The concept of movement is controversial and is associated with so-called transformational or derivational theories of syntax. Representational theories, in contrast, reject the notion of movement and often instead address discontinuities with other mechanisms including graph reentrancies, feature passing, and type shifters.
In linguistics, Immediate Constituent Analysis (ICA) is a syntactic theory which focuses on the hierarchical structure of sentences by isolating and identifying the constituents. While the idea of breaking down sentences into smaller components can be traced back to early psychological and linguistic theories, ICA as a formal method was developed in the early 20th century. It was influenced by Wilhelm Wundt's psychological theories of sentence structure but was later refined and formalized within the framework of structural linguistics by Leonard Bloomfield. The method gained traction in the distributionalist tradition through the work of Zellig Harris and Charles F. Hockett, who expanded and applied it to sentence analysis. Additionally, ICA was further explored within the context of glossematics by Knud Togeby. These contributions helped ICA become a central tool in syntactic analysis, focusing on the hierarchical relationships between sentence constituents.
In linguistics, subcategorization denotes the ability/necessity for lexical items to require/allow the presence and types of the syntactic arguments with which they co-occur. For example, the word "walk" as in "X walks home" requires the noun-phrase X to be animate.
The lexical integrity hypothesis (LIH) or lexical integrity principle is a hypothesis in linguistics which states that syntactic transformations do not apply to subparts of words. It functions as a constraint on transformational grammar.
In the traditional grammar of Modern English, a phrasal verb typically constitutes a single semantic unit consisting of a verb followed by a particle, sometimes collocated with a preposition.
The lexicalist hypothesis is a hypothesis proposed by Noam Chomsky in which he claims that syntactic transformations can operate only on syntactic constituents. It says that the system of grammar that assembles words is separate and different from the system of grammar that assembles phrases out of words.