A phrasal template is a phrase-long collocation that contains one or several empty slots which may be filled by words to produce individual phrases. Often there are restrictions on the grammatical category of the words allowed to fill particular slots. Phrasal templates are akin to forms, in which blanks are to be filled with appropriate data. The term phrasal template first appeared in a linguistic study of prosody in 1983[1] but does not appear to have come into common use until the late 1990s.[2] An example is the phrase "common stocks rose <Number> to <Number>", e.g., "common stocks rose 1.72 to 340.36".[3]
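To make the slot-filling mechanics concrete, here is a minimal Python sketch; the fill_template function and the left-to-right <Slot> matching rule are illustrative assumptions, not an established notation beyond the example above.

```python
# A minimal sketch of filling a phrasal template's slots.
import re

def fill_template(template: str, values: list[str]) -> str:
    """Replace each <Slot> placeholder, left to right, with the next value."""
    slots = re.findall(r"<[^>]+>", template)
    if len(slots) != len(values):
        raise ValueError("template arity does not match the supplied values")
    for slot, value in zip(slots, values):
        template = template.replace(slot, value, 1)
    return template

print(fill_template("common stocks rose <Number> to <Number>", ["1.72", "340.36"]))
# -> common stocks rose 1.72 to 340.36
```

A fuller implementation would also check that each filler matches the grammatical category its slot demands, e.g., that <Number> receives a numeral.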
The neologism "snowclone" was introduced to refer to a special case of phrasal templates that "clone" popular clichés. For example, a misquotation of Diana Vreeland's "Pink is the navy blue of India" [4] may have given rise to the template "<color> is the new black", which in turn evolved into "<X> is the new <Y>".[ citation needed ]
An idiom is a phrase or expression that typically presents a figurative, non-literal meaning attached to the phrase; some phrases, however, become figurative idioms while retaining their literal meaning. Categorized as formulaic language, an idiom's figurative meaning is different from its literal meaning. Idioms occur frequently in all languages; in English alone there are an estimated twenty-five thousand idiomatic expressions.
Word-sense disambiguation (WSD) is an open problem in computational linguistics concerned with identifying which sense of a word is used in a sentence. Solutions to this problem bear on other areas of computational language processing, such as discourse analysis, improving the relevance of search engines, anaphora resolution, coherence, and inference.
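As a concrete illustration, the classic Lesk algorithm picks the dictionary sense whose definition overlaps most with the surrounding context; NLTK ships an implementation, used in this hedged sketch (the example sentence is invented):

```python
# Word-sense disambiguation with NLTK's Lesk implementation.
# nltk.download("wordnet") and nltk.download("punkt") may be required first.
from nltk.wsd import lesk
from nltk.tokenize import word_tokenize

context = word_tokenize("I went to the bank to deposit my money")
sense = lesk(context, "bank")   # picks the WordNet synset whose gloss
if sense:                       # overlaps most with the context words
    print(sense, "-", sense.definition())
```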
In linguistics, X-bar theory is a theory of syntactic category formation that was first proposed by Chomsky (1970) and further developed by Jackendoff (1977), along the lines of the theory of generative grammar put forth in the 1950s by Noam Chomsky. It attempts to capture the structure of phrasal categories with a single uniform structure called the X-bar schema, based on the assumption that any phrase in natural language is an XP headed by a given syntactic category X. It played a significant role in resolving problems with phrase structure rules, most notably the proliferation of grammatical rules, which runs against the thesis of generative grammar.
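As a rough illustration, the X-bar schema (XP → Specifier X′; X′ → X Complement) can be encoded as rewrite rules with NLTK's CFG tools; the toy lexicon below is an invented example, with Nbar standing in for N′:

```python
# An X-bar-style noun phrase encoded as a context-free grammar in NLTK.
import nltk

xbar = nltk.CFG.fromstring("""
  NP   -> Det Nbar
  Nbar -> N PP | N
  PP   -> P NP
  NP   -> N
  Det  -> 'the'
  N    -> 'student' | 'physics'
  P    -> 'of'
""")
parser = nltk.ChartParser(xbar)
for tree in parser.parse("the student of physics".split()):
    tree.pretty_print()  # NP headed by N, with Det as specifier and PP as complement
```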
Question answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP), which is concerned with building systems that automatically answer questions posed by humans in a natural language.
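A minimal sketch of an extractive QA system using the Hugging Face transformers pipeline; relying on the library's default model is an assumption made here for illustration:

```python
# Extractive question answering: the model selects an answer span
# from the supplied context passage.
from transformers import pipeline

qa = pipeline("question-answering")
result = qa(
    question="Who proposed X-bar theory?",
    context="X-bar theory was first proposed by Chomsky in 1970 "
            "and further developed by Jackendoff in 1977.",
)
print(result["answer"], result["score"])  # e.g. "Chomsky" with a confidence score
```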
In corpus linguistics, a collocation is a series of words or terms that co-occur more often than would be expected by chance. In phraseology, a collocation is a type of compositional phraseme, meaning that it can be understood from the words that make it up. This contrasts with an idiom, where the meaning of the whole cannot be inferred from its parts, and may be completely unrelated.
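"More often than would be expected by chance" is typically quantified with an association measure such as pointwise mutual information (PMI); the sketch below computes PMI from raw corpus counts, with the toy numbers invented for illustration:

```python
# Pointwise mutual information: log2( P(w1, w2) / (P(w1) * P(w2)) ).
# A large positive value means the pair co-occurs far above chance.
import math

def pmi(pair_count: int, w1_count: int, w2_count: int, total: int) -> float:
    p_pair = pair_count / total
    p_w1 = w1_count / total
    p_w2 = w2_count / total
    return math.log2(p_pair / (p_w1 * p_w2))

# e.g. "strong tea" in a toy corpus of one million tokens:
print(pmi(pair_count=80, w1_count=900, w2_count=600, total=1_000_000))
```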
Semantic prosody, also discourse prosody, describes the way in which certain seemingly neutral words can be perceived with positive or negative associations through frequent occurrences with particular collocations. The term was coined by analogy with linguistic prosody and was popularised by Bill Louw.
Construction grammar is a family of theories within the field of cognitive linguistics which posit that constructions, or learned pairings of linguistic patterns with meanings, are the fundamental building blocks of human language. Constructions include words, morphemes, fixed expressions and idioms, and abstract grammatical rules such as the passive voice or the ditransitive. Any linguistic pattern is considered to be a construction as long as some aspect of its form or its meaning cannot be predicted from its component parts, or from other constructions that are recognized to exist. In construction grammar, every utterance is understood to be a combination of multiple different constructions, which together specify its precise meaning and form.
In linguistics and pedagogy, an interlinear gloss is a gloss placed between lines, such as between a line of original text and its translation into another language. When glossed, each line of the original text acquires one or more corresponding lines of transcription known as an interlinear text or interlinear glossed text (IGT)—interlinear for short. Such glosses help the reader follow the relationship between the source text and its translation, and the structure of the original language. In its simplest form, an interlinear gloss is simply a literal, word-for-word translation of the source text.
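The word-by-word alignment can be produced mechanically, as in this small Python sketch; the German example sentence and its glosses are invented, loosely following common IGT conventions:

```python
# Print a source line and its gloss line with columns padded so that
# each word sits directly above its gloss.
def interlinear(source: list[str], glosses: list[str]) -> str:
    widths = [max(len(s), len(g)) for s, g in zip(source, glosses)]
    line1 = "  ".join(s.ljust(w) for s, w in zip(source, widths))
    line2 = "  ".join(g.ljust(w) for g, w in zip(glosses, widths))
    return line1 + "\n" + line2

print(interlinear(
    ["Ich", "habe", "das", "Buch", "gelesen"],
    ["I", "have.1SG", "the", "book", "read.PTCP"],
))
```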
A snowclone is a cliché and phrasal template that can be used and recognized in multiple variants. The term was coined in 2004, derived from journalistic clichés that referred to the number of Eskimo words for snow.
In lexicography, a lexical item is a single word, a part of a word, or a chain of words (catena) that forms the basic elements of a language's lexicon (≈ vocabulary). Examples are cat, traffic light, take care of, by the way, and it's raining cats and dogs. Lexical items can be generally understood to convey a single meaning, much as a lexeme, but are not limited to single words. Lexical items are like semes in that they are "natural units" translating between languages, or in learning a new language. In this last sense, it is sometimes said that language consists of grammaticalized lexis, and not lexicalized grammar. The entire store of lexical items in a language is called its lexis.
Sentiment analysis is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice-of-the-customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine. With the rise of deep language models such as RoBERTa, more difficult data domains can also be analyzed, e.g., news texts, where authors typically express their opinions and sentiments less explicitly.
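A minimal sketch using the Hugging Face transformers sentiment pipeline; letting the library load its default fine-tuned model, rather than a specific RoBERTa checkpoint, is an assumption made here for illustration:

```python
# Classify the polarity of a short text with a pretrained model.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # loads a default fine-tuned model
print(classifier("The room was spotless and the staff were wonderful."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```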
In linguistics, statistical semantics applies the methods of statistics to the problem of determining the meaning of words or phrases, ideally through unsupervised learning, to a degree of precision at least sufficient for the purpose of information retrieval.
Ontology learning is the automatic or semi-automatic creation of ontologies, including extracting the corresponding domain's terms and the relationships between the concepts that these terms represent from a corpus of natural language text, and encoding them with an ontology language for easy retrieval. As building ontologies manually is extremely labor-intensive and time-consuming, there is great motivation to automate the process.
Janet Dean Fodor is Distinguished Professor of Linguistics at the City University of New York. Her primary field is psycholinguistics, and her research interests include human sentence processing, prosody, learnability theory and L1 (first-language) acquisition.
Collocation extraction is the task of using a computer to extract collocations automatically from a corpus.
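For instance, NLTK provides ready-made finders that rank candidate word pairs by an association measure; this hedged sketch uses NLTK's bundled Genesis corpus and PMI scoring:

```python
# Extract the highest-scoring bigram collocations from a corpus.
# nltk.download("genesis") may be required first.
import nltk
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

words = nltk.corpus.genesis.words("english-web.txt")
finder = BigramCollocationFinder.from_words(words)
finder.apply_freq_filter(3)                       # drop rare pairs
print(finder.nbest(BigramAssocMeasures.pmi, 10))  # top 10 collocations by PMI
```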
In natural language processing, semantic role labeling is the process that assigns labels to words or phrases in a sentence to indicate their semantic role in the sentence, such as that of an agent, goal, or result.
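Since the labels attach to spans rather than to the sentence as a whole, an SRL output can be pictured as a predicate plus a span-to-role mapping; the hand-labeled example below is purely illustrative and invokes no actual SRL system:

```python
# An illustrative, hand-labeled semantic role annotation
# (simplified, loosely PropBank-style role names).
srl_annotation = {
    "sentence": "Mary sold the book to John",
    "predicate": "sold",
    "roles": {
        "Mary": "agent (seller)",
        "the book": "theme (thing sold)",
        "to John": "recipient (buyer)",
    },
}
for span, role in srl_annotation["roles"].items():
    print(f"{span!r} -> {role}")
```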
Norbert Schmitt is an American linguist and a Professor of Applied Linguistics at the University of Nottingham in the United Kingdom. He is known for his work on second language vocabulary acquisition and second language vocabulary teaching. He has published numerous books and papers on vocabulary acquisition.
In natural language processing (NLP), word embedding is a term used for the representation of words for text analysis, typically in the form of a real-valued vector that encodes the meaning of the word such that words that are closer in the vector space are expected to be similar in meaning. Word embeddings can be obtained using a set of language modeling and feature learning techniques where words or phrases from the vocabulary are mapped to vectors of real numbers. Conceptually, it involves a mathematical embedding from a space with one dimension per word to a continuous vector space with a much lower dimension.
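A minimal sketch of training such vectors with gensim's Word2Vec; the two-sentence toy corpus is invented and far too small to yield meaningful embeddings:

```python
# Train tiny word vectors and query the resulting vector space.
from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=50)
print(model.wv["cat"][:5])                # first dimensions of the learned vector
print(model.wv.similarity("cat", "dog"))  # cosine similarity in the space
```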
Prosodic bootstrapping in linguistics refers to the hypothesis that learners of a primary language (L1) use prosodic features such as pitch, tempo, rhythm, amplitude, and other auditory aspects of the speech signal as cues to identify other properties of grammar, such as syntactic structure. Acoustically signaled prosodic units in the stream of speech may provide critical perceptual cues by which infants initially discover syntactic phrases in their language. Although these features by themselves are not enough to help infants learn the entire syntax of their native language, they provide various cues about different grammatical properties of the language, such as identifying the ordering of heads and complements in the language using stress prominence, and indicating the location of phrase boundaries and word boundaries. It is argued that the prosody of a language plays an initial role in the acquisition of the first language, helping children to uncover its syntax, mainly because children are sensitive to prosodic cues at a very young age.
Semantic parsing is the task of converting a natural language utterance to a logical form: a machine-understandable representation of its meaning. Semantic parsing can thus be understood as extracting the precise meaning of an utterance. Applications of semantic parsing include machine translation, question answering, ontology induction, automated reasoning, and code generation. The phrase was first used in the 1970s by Yorick Wilks as the basis for machine translation programs working with only semantic representations.
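A toy rule-based parser makes the utterance-to-logical-form mapping concrete; the single question pattern and the Prolog-like logical-form notation are illustrative assumptions, far simpler than real semantic parsers:

```python
# Map one question pattern to a logical form via a hand-written rule.
import re

def parse(utterance: str) -> str:
    m = re.match(r"what is the capital of (\w+)\??", utterance, re.IGNORECASE)
    if m:
        return f"answer(x) :- capital({m.group(1).lower()}, x)"
    raise ValueError("utterance not covered by this toy grammar")

print(parse("What is the capital of France?"))
# -> answer(x) :- capital(france, x)
```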