
An idiom is a phrase or expression that typically carries a figurative, non-literal meaning, although some phrases become figurative idioms while retaining their literal meaning. Idioms are categorized as formulaic language, and an idiom's figurative meaning differs from its literal meaning. [1] Idioms occur frequently in all languages; in English alone there are an estimated twenty-five thousand idiomatic expressions. [2]



Many idiomatic expressions were meant literally in their original use, but sometimes the attribution of the literal meaning changed and the phrase itself grew away from its original roots—typically leading to a folk etymology. For instance, the phrase "spill the beans" (meaning to reveal a secret) is first attested in 1919, but has been said to originate from an ancient method of voting by depositing beans in jars, which could be spilled, prematurely revealing the results. [3]

Other idioms are deliberately figurative. For example, "break a leg" is an ironic expression to wish a person good luck just prior to their giving a performance or presentation. It may have arisen from the superstition that one ought not utter the words "good luck" to an actor because it is believed that doing so will cause the opposite result. [4]


[Figure: Donato Barcaglia, Liebe macht blind (detail). Love is blind: an idiom meaning a person who is in love can see no faults or imperfections in the person whom they love.]

In linguistics, idioms are usually presumed to be figures of speech contradicting the principle of compositionality. That compositionality is the key notion for the analysis of idioms is emphasized in most accounts of idioms. [6] [7] This principle states that the meaning of a whole should be constructed from the meanings of the parts that make up the whole. In other words, one should be in a position to understand the whole if one understands the meanings of each of the parts that make up the whole. The following example is widely employed to illustrate the point:

Fred kicked the bucket.

Understood compositionally, Fred has literally kicked an actual, physical bucket. The much more likely idiomatic reading, however, is non-compositional: Fred is understood to have died. Arriving at the idiomatic reading from the literal reading is unlikely for most speakers. What this means is that the idiomatic reading is, rather, stored as a single lexical item that is now largely independent of the literal reading.
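The contrast between the two readings can be sketched in a few lines of Python. This is only a toy illustration: the glosses in `WORD_MEANINGS`, the `IDIOM_MEANINGS` table, and both function names are assumptions made for the example, not a real lexicon or an established analysis. The literal reading builds the meaning of the whole from the parts, while the idiomatic reading looks up the fixed word sequence as a single stored item.

```python
# Toy word meanings; the glosses are illustrative assumptions, not a real lexicon.
WORD_MEANINGS = {
    "Fred": "FRED",
    "kicked": "KICK",
    "the": "THE",
    "bucket": "BUCKET",
}

# Idioms stored as single lexical items, keyed by their fixed word sequence.
IDIOM_MEANINGS = {
    ("kicked", "the", "bucket"): "DIE",
}


def literal_reading(words):
    """Compose the meaning of the whole from the meanings of the parts."""
    return [WORD_MEANINGS[w] for w in words]


def idiomatic_reading(words):
    """Replace any stored idiom with its single meaning; compose the rest."""
    result = []
    i = 0
    while i < len(words):
        for idiom, meaning in IDIOM_MEANINGS.items():
            if tuple(words[i:i + len(idiom)]) == idiom:
                result.append(meaning)
                i += len(idiom)
                break
        else:
            # No stored idiom starts here, so fall back to the word's own meaning.
            result.append(WORD_MEANINGS[words[i]])
            i += 1
    return result


sentence = ["Fred", "kicked", "the", "bucket"]
print(literal_reading(sentence))    # ['FRED', 'KICK', 'THE', 'BUCKET']
print(idiomatic_reading(sentence))  # ['FRED', 'DIE']
```

In the sketch, nothing in the entries for kicked, the, or bucket leads to DIE; that meaning is available only through the separate idiom entry, which mirrors the claim that the idiomatic reading is stored independently of the literal one.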

In phraseology, idioms are defined as a sub-type of phraseme, the meaning of which is not the regular sum of the meanings of its component parts. [8] John Saeed defines an idiom as collocated words that became affixed to each other until metamorphosing into a fossilised term. [9] This collocation of words redefines each component word in the word-group and becomes an idiomatic expression. Idioms usually do not translate well; in some cases, when an idiom is translated directly word-for-word into another language, either its meaning is changed or it is meaningless.

When two or three words are conventionally used together in a particular sequence, they form an irreversible binomial. For example, a person may be left "high and dry", but never "dry and high". Not all irreversible binomials are idioms, however: "chips and dip" is irreversible, but its meaning is straightforwardly derived from its components.


Idioms possess varying degrees of mobility. Whereas some idioms are used only in a routine form, others can undergo syntactic modifications such as passivization, raising constructions, and clefting, demonstrating separable constituencies within the idiom. [10] Mobile idioms, allowing such movement, maintain their idiomatic meaning where fixed idioms do not:

I spilled the beans on our project. → The beans were spilled on our project.
The old man kicked the bucket. → The bucket was kicked (by the old man).

Many fixed idioms lack semantic composition, meaning that the idiom contains the semantic role of a verb, but not of any object. This is true of kick the bucket, which means die. By contrast, the semantically composite idiom spill the beans, meaning reveal a secret, contains both a semantic verb and object, reveal and secret. Semantically composite idioms have a syntactic similarity between their surface and semantic forms. [10]

The types of movement allowed for certain idioms also relate to the degree to which the literal reading of the idiom has a connection to its idiomatic meaning. This is referred to as motivation or transparency. While most idioms that do not display semantic composition generally do not allow non-adjectival modification, those that are also motivated allow lexical substitution. [11] For example, oil the wheels and grease the wheels show variation between words that evoke a similar literal meaning. [12] These types of changes can occur only when speakers can easily recognize a connection between what the idiom is meant to express and its literal meaning; thus an idiom like kick the bucket cannot occur as kick the pot.

From the perspective of dependency grammar, idioms are represented as a catena which cannot be interrupted by non-idiomatic content. Although syntactic modifications introduce disruptions to the idiomatic structure, this continuity is only required for idioms as lexical entries. [13]

Certain idioms, allowing unrestricted syntactic modification, can be said to be metaphors. Expressions such as jump on the bandwagon, pull strings, and draw the line all represent their meaning independently in their verbs and objects, making them compositional. In the idiom jump on the bandwagon, jump on involves joining something and a 'bandwagon' can refer to a collective cause, regardless of context. [10]


A word-by-word translation of an opaque idiom will most likely not convey the same meaning in other languages. The English idiom kick the bucket has a variety of equivalents in other languages, such as kopnąć w kalendarz ("kick the calendar") in Polish, casser sa pipe ("to break his pipe") in French [14] and tirare le cuoia ("pulling the leathers") in Italian. [15]

Some idioms are transparent. [16] Much of their meaning gets through if they are taken (or translated) literally. For example, lay one's cards on the table means to reveal previously unknown intentions or to reveal a secret. Transparency is a matter of degree; spill the beans (to let secret information become known) and leave no stone unturned (to do everything possible in order to achieve or find something) are not entirely literally interpretable but involve only a slight metaphorical broadening. Another category of idioms is a word having several meanings, sometimes simultaneously, sometimes discerned from the context of its usage. This is seen in the (mostly uninflected) English language in polysemes: the same word is commonly used for an activity, for those engaged in it, for the product used, for the place or time of the activity, and sometimes as a verb.

Idioms tend to confuse those unfamiliar with them; students of a new language must learn its idiomatic expressions as vocabulary. Many natural language words have idiomatic origins but are assimilated and so lose their figurative senses. For example, in Portuguese, the expression saber de coração 'to know by heart', with the same meaning as in English, was shortened to 'saber de cor', and, later, to the verb decorar, meaning memorize.

In 2015, TED collected 40 examples of bizarre idioms that cannot be translated literally. They include the Swedish saying "to slide in on a shrimp sandwich", which refers to those who did not have to work to get where they are. [17]

Conversely, idioms may be shared between multiple languages. For example, the Arabic phrase في نفس المركب (fi nafs al-markeb) is translated as "in the same boat," and it carries the same figurative meaning as the equivalent idiom in English.

According to the German linguist Elisabeth Piirainen, the idiom "to get on one's nerves" has the same figurative meaning in 57 European languages. She also says that the phrase "to shed crocodile tears," meaning to express insincere sorrow, is similarly widespread in European languages but is also used in Arabic, Swahili, Persian, Chinese, Mongolian, and several others. [citation needed]

The origin of cross-language idioms is uncertain. One theory is that cross-language idioms are a language contact phenomenon, resulting from a word-for-word translation called a calque. Piirainen says that this may happen as a result of lingua franca usage in which speakers incorporate expressions from their own native tongue, which exposes them to speakers of other languages. Other theories suggest they come from a shared ancestor language or that humans are naturally predisposed to develop certain metaphors. [citation needed]

Dealing with non-compositionality

The non-compositionality of meaning of idioms challenges theories of syntax. The fixed words of many idioms do not qualify as constituents in any sense. For example:

How do we get to the bottom of this situation?

The fixed words of this idiom (get to the bottom of) do not form a constituent in any theory's analysis of syntactic structure because the object of the preposition (here this situation) is not part of the idiom; rather, it is an argument of the idiom. One can tell that it is not part of the idiom because it is variable; for example, How do we get to the bottom of this situation / the claim / the phenomenon / her statement / etc. This means that theories of syntax that take the constituent to be the fundamental unit of syntactic analysis are challenged. The manner in which units of meaning are assigned to units of syntax remains unclear. This problem has motivated a great deal of discussion and debate in linguistics circles, and it is a primary motivator behind the Construction Grammar framework. [18]

A relatively recent development in the syntactic analysis of idioms departs from a constituent-based account of syntactic structure, preferring instead the catena-based account. The catena unit was introduced to linguistics by William O'Grady in 1998. Any word or any combination of words that are linked together by dependencies qualifies as a catena. [19] The words constituting idioms are stored as catenae in the lexicon, and as such, they are concrete units of syntax. The dependency grammar trees of a few sentences containing non-constituent idioms illustrate the point:

[Figure: dependency trees of sentences containing non-constituent idioms (Idiom trees 1)]

The fixed words of the idiom (in orange) in each case are linked together by dependencies; they form a catena. The material that is outside of the idiom (in normal black script) is not part of the idiom. The following two trees illustrate proverbs:

[Figure: dependency trees of two proverbs (Idiom trees 2)]

The fixed words of the proverbs (in orange) again form a catena each time. The adjective nitty-gritty and the adverb always are not part of the respective proverb and their appearance does not interrupt the fixed words of the proverb. A caveat concerning the catena-based analysis of idioms concerns their status in the lexicon. Idioms are lexical items, which means they are stored as catenae in the lexicon. In the actual syntax, however, some idioms can be broken up by various functional constructions.

The catena-based analysis of idioms provides a basis for an understanding of meaning compositionality. The Principle of Compositionality can in fact be maintained. Units of meaning are being assigned to catenae, whereby many of these catenae are not constituents.
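The difference between a catena and a constituent can be made concrete with a small Python sketch. The dependency parse in `HEADS` is an assumed analysis of the earlier example sentence, written down for illustration only, and the function names are hypothetical. The sketch treats a catena as a set of words that is connected by dependencies, and a constituent as a word together with everything it dominates. It relies on each word occurring only once in the sentence.

```python
# Head of each word in "How do we get to the bottom of this situation?".
# None marks the root. This parse is an assumed analysis for illustration.
HEADS = {
    "do": None,
    "How": "get",
    "we": "do",
    "get": "do",
    "to": "get",
    "bottom": "to",
    "the": "bottom",
    "of": "bottom",
    "situation": "of",
    "this": "situation",
}


def is_catena(words, heads):
    """True if the words form one connected piece of the dependency tree."""
    words = set(words)
    # In a tree, a node set is connected exactly when one single member
    # has its head outside the set (or has no head at all).
    tops = [w for w in words if heads[w] not in words]
    return len(tops) == 1


def subtree(word, heads):
    """Return the word together with every word it dominates."""
    result = {word}
    for dependent, head in heads.items():
        if head == word:
            result |= subtree(dependent, heads)
    return result


def is_constituent(words, heads):
    """True if the words are exactly one word plus all that it dominates."""
    words = set(words)
    return any(subtree(w, heads) == words for w in words)


idiom = ["get", "to", "the", "bottom", "of"]
print(is_catena(idiom, HEADS))       # True
print(is_constituent(idiom, HEADS))  # False
```

Under this assumed parse, the fixed words of the idiom are linked by dependencies, so they count as a catena, but they are not a constituent because this situation hangs below of without belonging to the idiom. A complete subtree such as this situation passes both checks, while a disconnected pair such as get and bottom passes neither.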

Various studies have investigated methods to develop the ability to interpret idioms in children with various diagnoses, including autism, [20] moderate learning difficulties, [21] and developmental language disorder, [22] as well as in typically developing weak readers. [23]




  1. The Oxford companion to the English language (1992:495f.)
  2. Jackendoff (1997).
  3. "The Mavens' Word of the Day: Spill the Beans". Random House. 23 February 2001. Archived from the original on 25 April 2011. Retrieved 28 July 2021.
  4. Gary Martin. "Break a leg". The Phrase Finder. Archived from the original on 2015-04-02. Retrieved 2015-03-29.
  5. Elizabeth Knowles, ed. (2006). The Oxford Dictionary of Phrase and Fable. Oxford University Press. pp. 302–3. ISBN 9780191578564. "the saying is generally used to mean that a person is often unable to see faults in the one they love."
  6. Radford (2004:187f.)
  7. Portner (2005:33f).
  8. Mel'čuk (1995:167–232).
  9. For Saeed's definition, see Saeed (2003:60).
  10. Horn, George (2003). "Idioms, Metaphors, and Syntactic Mobility". Journal of Linguistics. 39 (2): 245–273. doi:10.1017/s0022226703002020.
  11. Keizer, Evelien (2016). "Idiomatic expressions in Functional Discourse Grammar". Linguistics. 54 (5): 981–1016. doi:10.1515/ling-2016-0022. S2CID 151574119.
  12. Mostafa, Massrura (2010). "Variation in V+the+N idioms". English Today. 26 (4): 37–43. doi:10.1017/s0266078410000325. S2CID 145266570.
  13. O'Grady, William (1998). "The Syntax of Idioms". Natural Language and Linguistic Theory. 16 (2): 279–312. doi:10.1023/a:1005932710202. S2CID 170903210.
  14. "Translation of the idiom kick the bucket in French". Archived from the original on 2022-05-19. Retrieved 2018-01-06.
  15. "Translation of the idiom kick the bucket in Italian". Archived from the original on 2018-01-07. Retrieved 2018-01-06.
  16. Gibbs, R. W. (1987)
  17. "40 brilliant idioms that simply can't be translated literally". TED Blog. Archived from the original on 2016-04-09. Retrieved 2016-04-08.
  18. Culicover and Jackendoff (2005:32ff.)
  19. Osborne and Groß (2012:173ff.)
  20. Mashal and Kasirer, 2011
  21. Ezell and Goldstein, 1992
  22. Benjamin, Ebbels and Newton, 2020
  23. Lundblom and Woods, 2012


  • Benjamin, L.; Ebbels, S.; Newton, C. (2020). "Investigating the effectiveness of idiom intervention for 9-16 year olds with developmental language disorder". International Journal of Language and Communication Disorders. 55 (2): 266–286. doi:10.1111/1460-6984.12519. PMID 31867833.
  • Crystal, A dictionary of linguistics and phonetics, 4th edition. Oxford, UK: Blackwell Publishers.
  • Culicover, P. and R. Jackendoff. 2005. Simpler syntax. Oxford, UK: Oxford University Press.
  • Ezell, H.; Goldstein, H. (1992). "Teaching Idiom Comprehension To Children with Mental Retardation". Journal of Applied Behavior Analysis. 25 (1): 181–191. doi:10.1901/jaba.1992.25-181. PMC 1279665. PMID 1582965.
  • Gibbs, R (1987). "Linguistic factors in children's understanding of idioms". Journal of Child Language. 14 (3): 569–586. doi:10.1017/s0305000900010291. PMID 2447110. S2CID 6544015.
  • Jackendoff, R. 1997. The architecture of the language faculty. Cambridge, Massachusetts: MIT Press.
  • Jurafsky, D. and J. Martin. 2008. Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. Dorling Kindersley (India): Pearson Education, Inc.
  • Leaney, C. 2005. In the know: Understanding and using idioms. New York: Cambridge University Press.
  • Lundblom, E.; Woods, J. (2012). "Working in the Classroom: Improving Idiom Comprehension Through Classwide Peer Tutoring". Communication Disorders Quarterly. 33 (4): 202–219. doi:10.1177/1525740111404927. S2CID 143858683.
  • Mel’čuk, I. 1995. "Phrasemes in language and phraseology in linguistics". In M. Everaert, E.-J. van der Linden, A. Schenk and R. Schreuder (eds.), Idioms: Structural and psychological perspectives, 167–232. Hillsdale, N.J.: Lawrence Erlbaum.
  • Mashal, Nira; Kasirer, Anat (2011). "Thinking maps enhance metaphoric competence in children with autism and learning disabilities". Research in Developmental Disabilities. 32 (6): 2045–2054. doi:10.1016/j.ridd.2011.08.012. PMID 21985987.
  • O'Grady, W (1998). "The syntax of idioms". Natural Language and Linguistic Theory. 16 (2): 279–312. doi:10.1023/A:1005932710202. S2CID 170903210.
  • Osborne, T.; Groß, T. (2012). "Constructions are catenae: Construction Grammar meets Dependency Grammar". Cognitive Linguistics. 23 (1): 163–214. doi:10.1515/cog-2012-0006.
  • Portner, P. 2005. What is meaning?: Fundamentals of formal semantics. Malden, MA: Blackwell Publishing.
  • Radford, A. 2004. English syntax: An introduction. Cambridge, UK: Cambridge University Press.
  • Saeed, J. 2003. Semantics. 2nd edition. Oxford: Blackwell.
