FrameNet

Mission statement: Building a lexical database based on a theory of meaning called frame semantics.
Commercial? No (freely available for download)
Type of project: Lexical database (containing frames, frame elements (FEs), lexical units (LUs), example sentences, and frame relations)
Location: International Computer Science Institute in Berkeley, California
Owner: Collin Baker (current project manager)
Founder: Charles J. Fillmore
Established: 1997
Website: framenet.icsi.berkeley.edu

FrameNet is a group of online lexical databases based upon frame semantics, a theory of meaning developed by linguist Charles J. Fillmore. The project's fundamental notion is simple: the meanings of most words are best understood in terms of a semantic frame, a description of a certain kind of event, relation, or entity and the participants in it.


As an illustration, the act of cooking usually requires the following: a cook, the food being cooked, a container to hold the food while it is being cooked, and a heating instrument. [1] Within FrameNet, this act is represented by a frame named Apply_heat, and its components (Cook, Food, Container, and Heating_instrument) are referred to as frame elements (FEs). The Apply_heat frame also lists a number of words that evoke it, known as lexical units (LUs), such as fry, bake, boil, and broil.
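The frame–FE–LU structure can be pictured as plain data. The following is a minimal hand-written sketch for illustration only, not FrameNet's actual data format or API:

```python
# Toy sketch of a FrameNet-style frame: a name, its frame elements (FEs),
# and the lexical units (LUs) that evoke it.
apply_heat = {
    "name": "Apply_heat",
    "frame_elements": ["Cook", "Food", "Container", "Heating_instrument"],
    "lexical_units": ["fry.v", "bake.v", "boil.v", "broil.v"],
}

def evokes(lu: str, frame: dict) -> bool:
    """Return True if the lemma.pos string is a lexical unit of the frame."""
    return lu in frame["lexical_units"]

print(evokes("bake.v", apply_heat))    # True
print(evokes("devour.v", apply_heat))  # False
```

In the real database, each LU additionally carries a definition and a set of annotated example sentences.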

Other frames are simpler. Placing, for example, has only an Agent (or Cause), a Theme (the thing that is placed), and the location where it is placed. Some frames are more complex, like Revenge, which contains more FEs (Offender, Injury, Injured_party, Avenger, and Punishment). As in the examples of Apply_heat and Revenge, FrameNet's role is to define the frames and to annotate sentences that demonstrate how the FEs fit syntactically around the word that evokes the frame. [1]

Concepts

Frames

A frame is a schematic representation of a situation involving various participants, props, and other conceptual roles. Examples of frame names are Being_born and Locative_relation. A frame in FrameNet contains a textual description of what it represents (a frame definition), associated frame elements, lexical units, example sentences, and frame-to-frame relations.

Frame elements

Frame elements (FEs) provide additional information about the semantic structure of a sentence. Each frame has a number of core and non-core FEs, which can be thought of as semantic roles. Core FEs are essential to the meaning of the frame, while non-core FEs are generally descriptive (expressing, for example, time, place, or manner). [2]

FrameNet includes shallow data on the syntactic roles that frame elements play in the example sentences. For a sentence like "She was born about AD 460", FrameNet marks She as a noun phrase realizing the Child frame element and "about AD 460" as a phrase realizing the Time frame element. How frame elements are realized in a sentence matters because it reveals a verb's subcategorization frames as well as its possible diathesis alternations (e.g. "John broke the window" vs. "The window broke").
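A sketch of such an annotation, using character spans over the example sentence (the span format here is illustrative, not FrameNet's export format):

```python
sentence = "She was born about AD 460"

# Hypothetical annotation: the frame-evoking target word plus
# (FE name, start, end) character spans for each frame element.
annotation = {
    "frame": "Being_born",
    "target": (8, 12),  # "born"
    "frame_elements": [
        ("Child", 0, 3),    # "She"
        ("Time", 13, 25),   # "about AD 460"
    ],
}

for fe, start, end in annotation["frame_elements"]:
    print(f"{fe}: {sentence[start:end]}")
```

Running the loop prints `Child: She` and `Time: about AD 460`.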

Lexical units

Lexical units (LUs) are lemmas, paired with a part of speech, that evoke a specific frame. In other words, when an LU is identified in a sentence, it can be associated with its specific frame(s). Many LUs may be associated with a single frame, and a single LU may be associated with many frames; the latter is typically the case for LUs with multiple word senses. [2] Alongside the frame, each lexical unit is associated with specific frame elements by means of the annotated example sentences.
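The many-to-many relation between LUs and frames can be pictured as a simple mapping. The frame lists below are illustrative only; consult the FrameNet database itself for a lemma's actual frames:

```python
# Illustrative LU-to-frame mapping: a polysemous LU such as bake.v
# can evoke more than one frame.
lu_to_frames = {
    "bake.v": ["Apply_heat", "Cooking_creation"],
    "fry.v": ["Apply_heat"],
    "complain.v": ["Complaining"],
}

def frames_for(lu: str) -> list:
    """Return the frames an LU evokes (empty list if the LU is unknown)."""
    return lu_to_frames.get(lu, [])

print(frames_for("bake.v"))  # ['Apply_heat', 'Cooking_creation']
```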

For example, the lexical units that evoke the Complaining frame (or, more precisely, more specific perspectivized versions of it) include the verbs complain, grouse, and lament, among others. [5]

Example sentences

Frames are associated with example sentences and frame elements are marked within the sentences. Thus, the sentence

She was born about AD 460

is associated with the frame Being_born, while She is marked as the frame element Child and "about AD 460" is marked as Time. [3]

From the start, the FrameNet project has been committed to looking at evidence from actual language use as found in text collections like the British National Corpus. Based on such example sentences, automatic semantic role labeling tools are able to determine frames and mark frame elements in new sentences.

Valences

FrameNet also exposes statistics on the valence of each frame; that is, the number and position of the frame elements within example sentences. The sentence

She was born about AD 460

falls under the valence pattern

NP Ext, INI, NP Dep

(a noun phrase serving as External argument, a null-instantiated frame element, and a noun phrase serving as Dependent), which occurs twice in FrameNet's annotation report for the born.v lexical unit, [3] namely:

She was born about AD 460, daughter and granddaughter of Roman and Byzantine emperors, whose family had been prominent in Roman politics for over 700 years.
He was soon posted to north Africa, and never met their only child, a daughter born 8 June 1941.
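The valence statistics above amount to counting realization patterns across a lexical unit's annotated sentences. A toy version (hand-written pattern tuples, not FrameNet's report format) might look like:

```python
from collections import Counter

# Each annotated sentence yields a tuple of (phrase type, grammatical
# function) realizations for the target's frame elements.
patterns = [
    ("NP Ext", "INI", "NP Dep"),  # "She was born about AD 460 ..."
    ("NP Ext", "INI", "NP Dep"),  # "... a daughter born 8 June 1941."
    ("NP Ext", "PP Dep"),         # hypothetical further annotation
]

valence_counts = Counter(patterns)
print(valence_counts[("NP Ext", "INI", "NP Dep")])  # 2
```

Aggregating these counts over all of an LU's annotated sentences yields the valence table FrameNet reports for that LU.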

Frame relations

FrameNet additionally captures relationships between different frames using frame-to-frame relations. These include Inheritance (a child frame is a subtype of its parent, as Revenge inherits from Rewards_and_punishments), Perspective_on (a frame presents a particular point of view on a neutral frame, as Commerce_sell and Commerce_buy do on Commerce_goods-transfer [4]), Subframe (a frame is a stage of a larger complex event), Precedes, Causative_of, Inchoative_of, Using, and See_also.

Applications

FrameNet has proven to be useful in a number of computational applications, because computers need additional knowledge in order to recognize that "John sold a car to Mary" and "Mary bought a car from John" describe essentially the same situation, despite using two quite different verbs, different prepositions and a different word order. FrameNet has been used in applications like question answering, paraphrasing, recognizing textual entailment, and information extraction, either directly or by means of Semantic Role Labeling tools. The first automatic system for Semantic Role Labeling (SRL, sometimes also referred to as "shallow semantic parsing") was developed by Daniel Gildea and Daniel Jurafsky based on FrameNet in 2002. [6] Semantic Role Labeling has since become one of the standard tasks in natural language processing, with the latest version (1.7) of FrameNet now fully supported in the Natural Language Toolkit. [7]
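The buy/sell example can be made concrete with a small sketch: if both verbs are mapped to the roles of a shared commercial-transaction frame, the two sentences receive identical role assignments. The role mappings below are hand-written for illustration, not taken from FrameNet's data:

```python
# Hand-written mappings from each verb's grammatical slots to shared
# commercial-transaction roles (Seller, Buyer, Goods).
role_maps = {
    "sell": {"subject": "Seller", "object": "Goods", "oblique": "Buyer"},
    "buy":  {"subject": "Buyer",  "object": "Goods", "oblique": "Seller"},
}

def roles(verb, subject, obj, oblique):
    """Map a verb's grammatical arguments onto shared frame roles."""
    m = role_maps[verb]
    return {m["subject"]: subject, m["object"]: obj, m["oblique"]: oblique}

sold = roles("sell", "John", "a car", "Mary")   # "John sold a car to Mary"
bought = roles("buy", "Mary", "a car", "John")  # "Mary bought a car from John"
print(sold == bought)  # True
```

Because both sentences resolve to `{"Seller": "John", "Goods": "a car", "Buyer": "Mary"}`, a downstream application can treat them as describing the same situation.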

Since frames are essentially semantic descriptions, they are similar across languages, and several projects have arisen over the years that have relied on the original FrameNet as the basis for additional non-English FrameNets, for Spanish, Japanese, German, and Polish, among others.


Related Research Articles


A semantic network, or frame network is a knowledge base that represents semantic relations between concepts in a network. This is often used as a form of knowledge representation. It is a directed or undirected graph consisting of vertices, which represent concepts, and edges, which represent semantic relations between concepts, mapping or connecting semantic fields. A semantic network may be instantiated as, for example, a graph database or a concept map. Typical standardized semantic networks are expressed as semantic triples.

Case roles, according to the work by Fillmore (1967), are the semantic roles of noun phrases in relation to the syntactic structures that contain these noun phrases. The term case role is most widely used for purely semantic relations, including theta roles and thematic roles, that can be independent of the morpho-syntax. The concept of case roles is related to the larger notion of Case which is defined as a system of marking dependent nouns for the type of semantic or syntactic relationship they bear to their heads. Case traditionally refers to inflectional marking.

Theta roles are the names of the participant roles associated with a predicate: the predicate may be a verb, an adjective, a preposition, or a noun. If an object is in motion or in a steady state as the speaker perceives the state, or if it is the topic of discussion, it is called a theme. The participant is usually said to be an argument of the predicate. In generative grammar, a theta role or θ-role is the formal device for representing syntactic argument structure—the number and type of noun phrases—required syntactically by a particular verb. For example, the verb put requires three arguments.

Construction grammar is a family of theories within the field of cognitive linguistics which posit that constructions, or learned pairings of linguistic patterns with meanings, are the fundamental building blocks of human language. Constructions include words, morphemes, fixed expressions and idioms, and abstract grammatical rules such as the passive voice or the ditransitive. Any linguistic pattern is considered to be a construction as long as some aspect of its form or its meaning cannot be predicted from its component parts, or from other constructions that are recognized to exist. In construction grammar, every utterance is understood to be a combination of multiple different constructions, which together specify its precise meaning and form.

Shallow parsing is an analysis of a sentence which first identifies constituent parts of sentences and then links them to higher order units that have discrete grammatical meanings. While the most elementary chunking algorithms simply link constituent parts on the basis of elementary search patterns, approaches that use machine learning techniques can take contextual information into account and thus compose chunks in such a way that they better reflect the semantic relations between the basic constituents. That is, these more advanced methods get around the problem that combinations of elementary constituents can have different higher level meanings depending on the context of the sentence.

In linguistics, valency or valence is the number and type of arguments controlled by a predicate, content verbs being typical predicates. Valency is related, though not identical, to subcategorization and transitivity, which count only object arguments – valency counts all arguments, including the subject. The linguistic meaning of valency derives from the definition of valency in chemistry. As in chemistry, valency involves the binding of specific elements: in the grammatical theory of valency, verbs organize sentences by binding specific elements such as complements and actants. Although the term originates from valence in chemistry, linguistic valency has a close analogy in mathematics under the term arity.

Cognitive semantics is part of the cognitive linguistics movement. Semantics is the study of linguistic meaning. Cognitive semantics holds that language is part of a more general human cognitive ability, and can therefore only describe the world as people conceive of it. It is implicit that different linguistic communities conceive of simple things and processes in the world differently, not necessarily some difference between a person's conceptual world and the real world.


Charles J. Fillmore was an American linguist and Professor of Linguistics at the University of California, Berkeley. He received his Ph.D. in Linguistics from the University of Michigan in 1961. Fillmore spent ten years at Ohio State University and a year as a Fellow at the Center for Advanced Study in the Behavioral Sciences at Stanford University before joining Berkeley's Department of Linguistics in 1971. Fillmore was extremely influential in the areas of syntax and lexical semantics.

Frame semantics is a theory of linguistic meaning developed by Charles J. Fillmore that extends his earlier case grammar. It relates linguistic semantics to encyclopedic knowledge. The basic idea is that one cannot understand the meaning of a single word without access to all the essential knowledge that relates to that word. For example, one would not be able to understand the word "sell" without knowing anything about the situation of commercial transfer, which also involves, among other things, a seller, a buyer, goods, money, the relation between the money and the goods, the relations between the seller and the goods and the money, the relation between the buyer and the goods and the money and so on. Thus, a word activates, or evokes, a frame of semantic knowledge relating to the specific concept to which it refers.

Case grammar is a system of linguistic analysis, focusing on the link between the valence, or number of subjects, objects, etc., of a verb and the grammatical context it requires. The system was created by the American linguist Charles J. Fillmore in the context of Transformational Grammar (1968). This theory analyzes the surface syntactic structure of sentences by studying the combination of deep cases which are required by a specific verb. For instance, the verb "give" in English requires an Agent (A), an Object (O), and a Beneficiary (B); e.g. "Jones (A) gave money (O) to the school (B)."

In lexicography, a lexical item is a single word, a part of a word, or a chain of words (catena) that forms the basic elements of a language's lexicon (≈ vocabulary). Examples are cat, traffic light, take care of, by the way, and it's raining cats and dogs. Lexical items can be generally understood to convey a single meaning, much as a lexeme, but are not limited to single words. Lexical items are like semes in that they are "natural units" translating between languages, or in learning a new language. In this last sense, it is sometimes said that language consists of grammaticalized lexis, and not lexicalized grammar. The entire store of lexical items in a language is called its lexis.

In linguistics, an argument is an expression that helps complete the meaning of a predicate, the latter referring in this context to a main verb and its auxiliaries. In this regard, the complement is a closely related concept. Most predicates take one, two, or three arguments. A predicate and its arguments form a predicate-argument structure. The discussion of predicates and arguments is associated most with (content) verbs and noun phrases (NPs), although other syntactic categories can also be construed as predicates and as arguments. Arguments must be distinguished from adjuncts. While a predicate needs its arguments to complete its meaning, the adjuncts that appear with a predicate are optional; they are not necessary to complete the meaning of the predicate. Most theories of syntax and semantics acknowledge arguments and adjuncts, although the terminology varies, and the distinction is generally believed to exist in all languages. Dependency grammars sometimes call arguments actants, following Lucien Tesnière (1959).

PropBank is a corpus that is annotated with verbal propositions and their arguments—a "proposition bank". Although "PropBank" refers to a specific corpus produced by Martha Palmer et al., the term propbank is also coming to be used as a common noun referring to any corpus that has been annotated with propositions and their arguments.

In frame semantics, a theory of linguistic meaning, null instantiation is the name of a category used to annotate, or tag, absent semantic constituents or frame elements. Frame semantics, best exemplified by the FrameNet project, views words as evoking frames of knowledge and frames as typically involving multiple components, called frame elements. The term null refers to the fact that the frame element in question is absent. The logical object of the term instantiation refers to the frame element itself. So, null instantiation is an empty instantiation of a frame element. Ruppenhofer and Michaelis postulate an implicational regularity tying the interpretation type of an omitted argument to the frame membership of its predicator: "If a particular frame element role is lexically omissible under a particular interpretation for one LU [lexical unit] in a frame, then for any other LUs in the same frame that allow the omission of this same FE [frame element], the interpretation of the missing FE is the same."

Frames are an artificial intelligence data structure used to divide knowledge into substructures by representing "stereotyped situations". They were proposed by Marvin Minsky in his 1974 article "A Framework for Representing Knowledge". Frames are the primary data structure used in artificial intelligence frame languages; they are stored as ontologies of sets.

In certain theories of linguistics, thematic relations, also known as semantic roles, are the various roles that a noun phrase may play with respect to the action or state described by a governing verb, commonly the sentence's main verb. For example, in the sentence "Susan ate an apple", Susan is the doer of the eating, so she is an agent; an apple is the item that is eaten, so it is a patient.

Meaning–text theory (MTT) is a theoretical linguistic framework, first put forward in Moscow by Aleksandr Žolkovskij and Igor Mel’čuk, for the construction of models of natural language. The theory provides a large and elaborate basis for linguistic description and, due to its formal character, lends itself particularly well to computer applications, including machine translation, phraseology, and lexicography.

In natural language processing, semantic role labeling is the process that assigns labels to words or phrases in a sentence that indicates their semantic role in the sentence, such as that of an agent, goal, or result.

In linguistics, subcategorization denotes the ability/necessity for lexical items to require/allow the presence and types of the syntactic arguments with which they co-occur. For example, the word "walk" as in "X walks home" requires the noun-phrase X to be animate.

Syntactic bootstrapping is a theory in developmental psycholinguistics and language acquisition which proposes that children learn word meanings by recognizing syntactic categories and the structure of their language. It is proposed that children have innate knowledge of the links between syntactic and semantic categories and can use these observations to make inferences about word meaning. Learning words in one's native language can be challenging because the extralinguistic context of use does not give specific enough information about word meanings. Therefore, in addition to extralinguistic cues, conclusions about syntactic categories are made which then lead to inferences about a word's meaning. This theory aims to explain the acquisition of lexical categories such as verbs, nouns, etc. and functional categories such as case markers, determiners, etc.

References

  1. "What is FrameNet?". FrameNet. Archived from the original on 2023-08-03. Retrieved 2023-09-09.
  2. "Glossary". FrameNet. Archived from the original on 2023-08-03. Retrieved 2023-09-09.
  3. "Being_born.born.v (Annotation)". FrameNet. Archived from the original on 2023-09-09. Retrieved 2023-09-09.
  4. "Commerce_goods-transfer". FrameNet. Archived from the original on 2023-09-09. Retrieved 2023-09-09.
  5. "Complaining". FrameNet. Archived from the original on 2023-09-09. Retrieved 2023-09-09.
  6. Gildea, Daniel; Jurafsky, Daniel (2002). "Automatic Labeling of Semantic Roles" (PDF). Computational Linguistics. 28 (3): 245–288. doi:10.1162/089120102760275983. S2CID 207747200.
  7. Schneider, Nathan; Wooters, Chuck (2017). "The NLTK FrameNet API: Designing for Discoverability with a Rich Linguistic Resource". EMNLP 2017: Conference on Empirical Methods in Natural Language Processing. arXiv:1703.07438.
