Historical Thesaurus of the Oxford English Dictionary, two-volume set

In general usage, a thesaurus is a reference work that lists words grouped together according to similarity of meaning (containing synonyms and sometimes antonyms), in contrast to a dictionary, which provides definitions for words, and generally lists them in alphabetical order. The main purpose of such reference works is for users "to find the word, or words, by which [an] idea may be most fitly and aptly expressed," quoting Peter Mark Roget, author of Roget's Thesaurus. [1]

Reference work Publication to which one can refer for confirmed facts

A reference work is a book or periodical to which one can refer for information. The information is intended to be found quickly when needed. Reference works are usually referred to for particular pieces of information, rather than read beginning to end. The writing style used in these works is informative; the authors avoid use of the first person, and emphasize facts. Many reference works are compiled by a team of contributors whose work is coordinated by one or more editors rather than by an individual author. Indices are commonly provided in many types of reference work. Updated editions are usually published as needed, in some cases annually. Reference works include dictionaries, thesauruses, encyclopedias, almanacs, bibliographies, and catalogs. Many reference works are available in electronic form and can be obtained as application software, CD-ROMs, DVDs, or online through the Internet.

Dictionary collection of words and their meanings

A dictionary, sometimes known as a wordbook, is a collection of words in one or more specific languages, often arranged alphabetically, which may include information on definitions, usage, etymologies, pronunciations, translation, etc., or a book of words in one language with their equivalents in another, sometimes known as a lexicon. It is a lexicographical reference that shows inter-relationships among the data.

Peter Mark Roget British physician, philologist

Peter Mark Roget was a British physician, natural theologian and lexicographer. He is best known for publishing, in 1852, the Thesaurus of English Words and Phrases, a classified collection of related words.


Although it includes synonyms, a thesaurus should not be taken as a complete list of all the synonyms for a particular word. The entries are also designed for drawing distinctions between similar words and assisting in choosing exactly the right word. Unlike a dictionary entry, a thesaurus entry does not give the definitions of words.

In library science and information science, thesauri have been widely used to specify domain models. Recently, thesauri have been implemented with the Simple Knowledge Organization System (SKOS). [2]

Library science is an interdisciplinary or multidisciplinary field that applies the practices, perspectives, and tools of management, information technology, education, and other areas to libraries; the collection, organization, preservation, and dissemination of information resources; and the political economy of information. Martin Schrettinger, a Bavarian librarian, coined the term for the discipline in his work Versuch eines vollständigen Lehrbuchs der Bibliothek-Wissenschaft oder Anleitung zur vollkommenen Geschäftsführung eines Bibliothekars (1808–1828). Rather than classifying information based on nature-oriented elements, as was previously done in his Bavarian library, Schrettinger organized books in alphabetical order. The first American school for library science was founded by Melvil Dewey at Columbia University in 1887.

Information science field primarily concerned with the analysis, collection, classification, manipulation, storage, retrieval and dissemination of information

Information science is a field primarily concerned with the analysis, collection, classification, manipulation, storage, retrieval, movement, dissemination, and protection of information. Practitioners within and outside the field study application and usage of knowledge in organizations along with the interaction between people, organizations, and any existing information systems with the aim of creating, replacing, improving, or understanding information systems. Historically, information science is associated with computer science, psychology, technology and intelligence agencies. However, information science also incorporates aspects of diverse fields such as archival science, cognitive science, commerce, law, linguistics, museology, management, mathematics, philosophy, public policy, and social sciences.

Simple Knowledge Organization System (SKOS) is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data.
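As a rough illustration of how a thesaurus entry might map onto SKOS, the sketch below serializes one concept as Turtle using only the Python standard library. The properties `skos:prefLabel`, `skos:altLabel`, `skos:broader`, and `skos:related` belong to the SKOS vocabulary; the `ex:` namespace, the sample concept, and the `Concept` and `to_turtle` helpers are hypothetical names chosen for this example, not part of any library.

```python
from dataclasses import dataclass, field
from typing import List

SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
EX_NS = "http://example.org/thesaurus/"  # hypothetical namespace for the example


@dataclass
class Concept:
    """One thesaurus concept (illustrative structure, not a standard API)."""
    ident: str                                            # local name, e.g. "joy"
    pref_label: str                                       # preferred term
    alt_labels: List[str] = field(default_factory=list)   # synonyms
    broader: List[str] = field(default_factory=list)      # more general concepts
    related: List[str] = field(default_factory=list)      # associated concepts


def to_turtle(concept: Concept, lang: str = "en") -> str:
    """Render the concept as a Turtle document.

    Assumes labels contain no double quotes or backslashes,
    since no escaping is performed in this sketch.
    """
    statements = ["a skos:Concept",
                  f'skos:prefLabel "{concept.pref_label}"@{lang}']
    statements += [f'skos:altLabel "{label}"@{lang}' for label in concept.alt_labels]
    statements += [f"skos:broader ex:{name}" for name in concept.broader]
    statements += [f"skos:related ex:{name}" for name in concept.related]
    header = f"@prefix skos: <{SKOS_NS}> .\n@prefix ex: <{EX_NS}> .\n\n"
    body = " ;\n    ".join(statements)
    return f"{header}ex:{concept.ident} {body} .\n"


if __name__ == "__main__":
    joy = Concept("joy", "joy", alt_labels=["delight", "gladness"],
                  broader=["emotion"], related=["happiness"])
    print(to_turtle(joy))
```

Synonyms become alternative labels of a single concept, while hierarchical and associative links between concepts are kept as separate properties. A production system would more likely use an RDF library than hand-built strings.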


The word "thesaurus" is derived from 16th-century New Latin, in turn from Latin thēsaurus, which is the Latinisation of the Greek θησαυρός (thēsauros), "treasure, treasury, storehouse". [3] The word thēsauros is of uncertain etymology. Douglas Harper derives it from the root of the Greek verb τιθέναι tithenai, "to put, to place." [3] Robert Beekes rejected an Indo-European derivation and suggested a Pre-Greek suffix *-arwo-. [4]

New Latin Form of the Latin language between c. 1375 and c. 1900

New Latin was a revival in the use of Latin in original, scholarly, and scientific works between c. 1375 and c. 1900. Modern scholarly and technical nomenclature, such as in zoological and botanical taxonomy and international scientific vocabulary, draws extensively from New Latin vocabulary. In such use, New Latin is subject to new word formation. As a language for full expression in prose or poetry, however, it is often distinguished from its successor, Contemporary Latin.

Latin Indo-European language of the Italic family

Latin is a classical language belonging to the Italic branch of the Indo-European languages. The Latin alphabet is derived from the Etruscan and Greek alphabets and ultimately from the Phoenician alphabet.

Ancient Greek Version of the Greek language used from roughly the 9th century BC to the 6th century AD

The ancient Greek language includes the forms of Greek used in Ancient Greece and the ancient world from around the 9th century BC to the 6th century AD. It is often roughly divided into the Archaic period, Classical period, and Hellenistic period. It is antedated in the second millennium BC by Mycenaean Greek and succeeded by Medieval Greek.

From the 16th to the 19th centuries, the term "thesaurus" was applied to any dictionary or encyclopedia, as in the Thesaurus Linguae Latinae (Dictionary of the Latin Language, 1532), and the Thesaurus Linguae Graecae (Dictionary of the Greek Language, 1572). The meaning "collection of words arranged according to sense" is first attested in 1852 in Roget's title; thesaurer is attested in Middle English for "treasurer". [3]

Encyclopedia type of reference work

An encyclopedia or encyclopaedia is a reference work or compendium providing summaries of knowledge either from all branches or from a particular field or discipline. Encyclopedias are divided into articles or entries that are often arranged alphabetically by article name and sometimes by thematic categories. Encyclopedia entries are longer and more detailed than those in most dictionaries. Generally speaking, unlike dictionary entries—which focus on linguistic information about words, such as their etymology, meaning, pronunciation, use, and grammatical forms—encyclopedia articles focus on factual information concerning the subject named in the article's title.

Thesaurus Linguae Latinae organization

The Thesaurus Linguae Latinae is a monumental dictionary of Latin founded on historical principles. It encompasses the Latin language from the time of its origin to the time of Isidore of Seville.

The Thesaurus Linguae Graecae (TLG) is a research center at the University of California, Irvine. The TLG was founded in 1972 by Marianne McDonald with the goal of creating a comprehensive digital collection of all surviving texts written in Greek from antiquity to the present era. Since 1972, the TLG has collected and digitized most surviving literary texts written in Greek from Homer to the fall of Constantinople in 1453 CE, and beyond. Theodore Brunner (1934–2007) directed the project from 1972 until his retirement from the University of California in 1998. Maria Pantelia, also a classics professor at UC Irvine, succeeded Brunner in 1998 and has directed the TLG since. The TLG's name is shared with its online database, the full title of which is Thesaurus Linguae Graecae: A Digital Library of Greek Literature.


Peter Mark Roget, author of the first modern thesaurus.

In antiquity, Philo of Byblos authored the first text that could now be called a thesaurus. In Sanskrit, the Amarakosha is a thesaurus in verse form, written in the 4th century. The Amarakosha mentions 18 prior works, but they have all been lost. [citation needed]

Philo of Byblos, also known as Herennius Philon, was an antiquarian writer of grammatical, lexical and historical works in Greek. He is chiefly known for his Phoenician history assembled from the writings of Sanchuniathon.

Sanskrit language of ancient Indian subcontinent

Sanskrit is a language of ancient India with a 3,500-year history. It is the primary liturgical language of Hinduism and the predominant language of most works of Hindu philosophy as well as some of the principal texts of Buddhism and Jainism. Sanskrit, in its variants and numerous dialects, was the lingua franca of ancient and medieval India. In the early 1st millennium CE, along with Buddhism and Hinduism, Sanskrit migrated to Southeast Asia, parts of East Asia and Central Asia, emerging as a language of high culture and of local ruling elites in these regions.

Amarakosha thesaurus of Sanskrit written by the ancient Indian scholar Amarasimha

The Amarakosha is the popular name for Namalinganushasanam, a thesaurus in Sanskrit written by the ancient Indian scholar Amarasimha. It may be the oldest extant kosha. The author himself mentions 18 prior works, but they have all been lost. There have been more than 40 commentaries on the Amarakosha.

The first modern thesaurus was Roget's Thesaurus, first compiled in 1805 by Peter Mark Roget and first published in 1852. Since its publication, it has never been out of print and is still a widely used work across the English-speaking world. [5] Entries in Roget's Thesaurus are listed conceptually rather than alphabetically. Roget described his thesaurus in the foreword to the first edition:

It is now nearly fifty years since I first projected a system of verbal classification similar to that on which the present work is founded. Conceiving that such a compilation might help to supply my own deficiencies, I had, in the year 1805, completed a classed catalogue of words on a small scale, but on the same principle, and nearly in the same form, as the Thesaurus now published. [6]
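The conceptual arrangement described above can be pictured as a set of numbered heads, each grouping related words, with an alphabetical index pointing back into those heads. The following is a minimal sketch under that assumption; the head numbers, head names, word lists, and the `build_index` and `lookup` helpers are invented for illustration and are not transcribed from Roget's Thesaurus.

```python
from collections import defaultdict

# Hypothetical heads: head number -> (head name, words listed under it).
HEADS = {
    1: ("Joy", ["joy", "delight", "pleasure"]),
    2: ("Movement", ["motion", "movement", "travel"]),
    3: ("Journey", ["journey", "travel", "voyage"]),
}


def build_index(heads):
    """Build the alphabetical index: word -> head numbers that list it."""
    index = defaultdict(list)
    for number in sorted(heads):
        _name, words = heads[number]
        for word in words:
            index[word].append(number)
    return dict(index)


def lookup(word, heads, index):
    """Return {head name: other words under that head} for each head listing `word`."""
    return {
        heads[number][0]: [w for w in heads[number][1] if w != word]
        for number in index.get(word, [])
    }


if __name__ == "__main__":
    index = build_index(HEADS)
    for head_name, companions in lookup("travel", HEADS, index).items():
        print(head_name, companions)
```

A word that belongs to several concepts, such as "travel" in this toy data, appears under each relevant head, which is what lets a reader compare its neighbours in different senses.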

Thesauri have been used to perform automatic word-sense disambiguation [7] and text simplification for machine translation systems. [8]
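To make the idea of thesaurus-based word-sense disambiguation concrete, here is a deliberately simplified sketch: each thesaurus category that lists an ambiguous word is treated as one candidate sense, and the category sharing the most words with the surrounding context wins. This is an illustrative overlap heuristic, not the method of the cited work; the categories, word lists, and the `tokenize` and `disambiguate` helpers are all hypothetical.

```python
import re

# Toy thesaurus: category name -> words listed under it (invented data).
THESAURUS = {
    "ANIMAL": {"crane", "heron", "bird", "nest", "wing", "fly", "feathers"},
    "MACHINE": {"crane", "hoist", "lift", "load", "construction", "steel", "operator"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(word, context, thesaurus=THESAURUS):
    """Pick the category listing `word` that overlaps most with the context.

    Returns None if no category lists the word. Ties go to the
    alphabetically first category.
    """
    context_words = tokenize(context) - {word}
    best, best_score = None, -1
    for category in sorted(thesaurus):
        members = thesaurus[category]
        if word not in members:
            continue
        score = len(context_words & members)
        if score > best_score:
            best, best_score = category, score
    return best


if __name__ == "__main__":
    print(disambiguate("crane", "The crane lifted the steel load at the construction site"))
    print(disambiguate("crane", "A crane spread its wing and left the nest"))
```

The sketch matches exact word forms only, so "lifted" does not count toward "lift". A realistic system would at least normalize word forms and weight category words by how informative they are.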

