Index (publishing)

An index (plural: usually indexes, more rarely indices; see below) is a list of words or phrases ('headings') and associated pointers ('locators') to where useful material relating to that heading can be found in a document or collection of documents. Examples are an index in the back matter of a book and an index that serves as a library catalog.


In a traditional back-of-the-book index, the headings will include names of people, places, events, and concepts selected by the indexer as being relevant and of interest to a possible reader of the book. The indexer may be the author, the editor, or a professional indexer working as a third party. The pointers are typically page numbers, paragraph numbers or section numbers.

In a library catalog the words are authors, titles, subject headings, etc., and the pointers are call numbers. Internet search engines (such as Google) and full-text searching help provide access to information but are not as selective as an index, as they provide non-relevant links, and may miss relevant information if it is not phrased in exactly the way they expect. [1]

Perhaps the most advanced investigation of problems related to book indexes has taken place in the development of topic maps, which started as a way of representing the knowledge structures inherent in traditional back-of-the-book indexes. The concept embodied by book indexes lent its name to database indexes, which similarly provide an abridged way to look up information in a larger collection, albeit for computer rather than human use.

Earliest examples in English

In the English language, indexes have been referred to as early as 1593, as can be seen from lines in Christopher Marlowe's Hero and Leander of that year:

Therefore, even as an index to a book
So to his mind was young Leander's look.

A similar reference to indexes is in Shakespeare's lines from Troilus and Cressida (I.3.344), written nine years later:

And in such indexes, although small pricks
To their subsequent volumes, there is seen
The baby figure of the giant mass
Of things to come at large.

[Figure: Beginning of the table of contents of My Secret Life (memoir).]

But according to G. Norman Knight, "at that period, as often as not, by an 'index to a book' was meant what we should now call a table of contents." [2] Until about the end of the nineteenth century, books, fiction as well as non-fiction, sometimes had very detailed chapter titles, which could be several sentences long.

Among the first indexes – in the modern sense – to a book in the English language was Leonard Mascall's [3] "A booke of the arte and maner how to plant and graffe all sortes of trees" printed in 1575. Another was one in Plutarch's Parallel Lives, in Sir Thomas North's 1595 translation. [2] A section entitled "An Alphabetical Table of the most material contents of the whole book" may be found in Henry Scobell's Acts and Ordinances of Parliament of 1658. This section comes after "An index of the general titles comprised in the ensuing Table". [2] Both of these indexes predate the index to Alexander Cruden's Concordance (1737), which is erroneously held to be the earliest index found in an English book. [2]

Etymology and plural

The word is derived from Latin, in which index means "one who points out", an "indication", or a "forefinger".

In Latin, the plural form of the word is indices. In English, the plural "indices" is commonly used in mathematical and computing contexts, and sometimes in bibliographical contexts – for example, in the 17-volume Women in World History: A Biographical Encyclopedia (1999–2002). [4] However, this form is now seen as an archaism by many writers and commentators, who prefer the anglicised plural "indexes". "Indexes" is widely used in the publishing industry; in the International Standard ISO 999, Information and documentation – Guidelines for the content, organization and presentation of indexes; and is preferred by the Oxford Style Manual. [5] The Chicago Manual of Style allows both forms. [6]

G. Norman Knight quotes Shakespeare's lines from Troilus and Cressida (I.3.344) – "And in such indexes ..." – and comments:

"But the real importance of this passage is that it establishes for all time the correct literary plural; we can leave the Latin form "indices" to the mathematicians (and similarly "appendices" to the anatomists)." [2]

Indexing process

[Figure: The first page of the index of Novus Atlas Sinensis by Martino Martini (published as a section of Volume 10 of Joan Blaeu's Atlas Maior in 1655).]

Conventional indexing

The indexer reads through the text, identifying indexable concepts (those for which the text provides useful information and which will be of relevance for the text's readership). The indexer creates index headings to represent those concepts, which are phrased such that they can be found when in alphabetical order (so, for example, one would write 'indexing process' rather than 'how to create an index'). These headings and their associated locators (indicators to position in the text) are entered into specialist indexing software which handles the formatting of the index and facilitates the editing phase. The index is then edited to impose consistency throughout the index.
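The entry and formatting steps described above can be pictured with a minimal Python sketch. It does not reproduce how any particular indexing program works; the function names `collapse_pages` and `build_index` and the sample entries are hypothetical. It groups (heading, page) pairs by heading, sorts the headings alphabetically without regard to case, and merges consecutive pages into ranges.

```python
from collections import defaultdict


def collapse_pages(pages):
    """Turn page numbers into locator strings, merging consecutive runs."""
    ordered = sorted(set(pages))
    locators = []
    start = prev = ordered[0]
    for page in ordered[1:]:
        if page == prev + 1:
            # Still inside a run of consecutive pages.
            prev = page
            continue
        locators.append(str(start) if start == prev else f"{start}–{prev}")
        start = prev = page
    locators.append(str(start) if start == prev else f"{start}–{prev}")
    return locators


def build_index(entries):
    """Group (heading, page) pairs by heading and return formatted index lines."""
    pages_by_heading = defaultdict(list)
    for heading, page in entries:
        pages_by_heading[heading].append(page)
    lines = []
    for heading in sorted(pages_by_heading, key=str.casefold):
        locators = ", ".join(collapse_pages(pages_by_heading[heading]))
        lines.append(f"{heading}, {locators}")
    return lines


# Hypothetical entries, in the order an indexer might record them.
entries = [("sunflower", 47), ("sage", 42), ("myrtle", 46), ("sage", 41)]
index_lines = build_index(entries)
for line in index_lines:
    print(line)
```

The sketch covers only the mechanical part of the work. Choosing which concepts deserve a heading, and how to phrase them, remains the indexer's judgment.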

Indexers must analyze the text to enable presentation of concepts and ideas in the index that may not be named within the text. The index is intended to help the reader, researcher, or information professional, rather than the author, find information, so the professional indexer must act as a liaison between the text and its ultimate user.

In the United States, by tradition, the index for a non-fiction book is the responsibility of the author, but most authors do not actually compile it. Most indexing is done by freelancers hired by authors, publishers, or book packagers (independent businesses that manage the production of a book). [7] Some publishers and database companies employ indexers.

Before indexing software existed, indexes were created using slips of paper or, later, index cards. After hundreds of such slips or cards were filled out (as the indexer worked through the pages of the book proofs), they could then be shuffled by hand into alphabetical order, at which point they served as manuscript to be typeset into the printed index.

Indexing software

Dedicated indexing software is available to help the indexer build a book index. [8] [9] Such programs assist with the special sorting and copying needs involved in index preparation.

Embedded indexing

Embedded indexing involves including the index headings in the midst of the text itself, but surrounded by codes so that they are not normally displayed. A usable index is then generated automatically from the embedded text using the position of the embedded headings to determine the locators. Thus, when the pagination is changed the index can be regenerated with the new locators.
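As a rough illustration of the idea, the following Python sketch treats each page as a string and uses a made-up marker syntax, `{{idx:heading}}`, for embedded headings. The marker syntax and the function names `visible_text` and `generate_index` are assumptions made for this example and do not correspond to any real tool's format. The locators are derived purely from where the markers sit, so regenerating the index after repagination requires no change to the markers themselves.

```python
import re
from collections import defaultdict

# Hypothetical marker syntax used only in this sketch: {{idx:heading}}
MARKER = re.compile(r"\{\{idx:(.+?)\}\}")


def visible_text(page):
    """Return the page as a reader would see it, with markers hidden."""
    return MARKER.sub("", page)


def generate_index(pages):
    """Map each embedded heading to the sorted page numbers where it occurs."""
    index = defaultdict(set)
    for number, page in enumerate(pages, start=1):
        for heading in MARKER.findall(page):
            index[heading].add(number)
    return {heading: sorted(nums) for heading, nums in sorted(index.items())}


pages = [
    "Sage{{idx:sage}} grows well in full sun.",
    "Hostas{{idx:hosta}}{{idx:shade plants}} prefer shade.",
]
original_index = generate_index(pages)
print(original_index)

# Repagination: a page is inserted at the front; no marker is edited.
repaginated = ["A new preface."] + pages
updated_index = generate_index(repaginated)
print(updated_index)
```

Note that the heading "shade plants" never appears in the visible text; it exists only in the marker, which is how an embedded index can still carry grouping terms chosen by the indexer.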

LaTeX documents support embedded indexes primarily through the MakeIndex program. Several widely used XML DTDs, including DocBook and TEI, have elements that allow index creation directly in the XML files. Most word processing software (such as StarWriter/Writer, Microsoft Word, and WordPerfect), some desktop publishing software (for example, FrameMaker and InDesign), and other tools (for example, MadCap Software's Flare) also have some facility for embedded indexing. TExtract and IndexExploit support embedded indexing of Microsoft Word documents. [8]

An embedded index requires more time to create than a conventional static index; however, an embedded index can save time in the long run when the material is updated or repaginated. This is because, with a static index, if even a few pages change, the entire index must be revised or recreated while, with an embedded index, only the pages that changed need updating or indexing.


Indexes are also designed to help the reader find information quickly and easily. A complete and truly useful index is not simply a list of the words and phrases used in a publication (which is properly called a concordance), but an organized map of its contents, including cross-references, grouping of like concepts, and other useful intellectual analysis.
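The contrast can be made concrete with a small Python sketch of the purely mechanical word list that the paragraph above distinguishes from a true index. The function name `concordance` and the sample pages are hypothetical, and a real concordance would normally also show each word's immediate context. Every word is listed with every page on which it occurs, with no selection, grouping, or cross-referencing.

```python
import re
from collections import defaultdict


def concordance(pages):
    """List every page on which each word occurs, with no selection or grouping."""
    occurrences = defaultdict(list)
    for number, page in enumerate(pages, start=1):
        # Each distinct word on the page is recorded once for that page.
        for word in sorted(set(re.findall(r"[a-z']+", page.lower()))):
            occurrences[word].append(number)
    return dict(sorted(occurrences.items()))


sample_pages = ["The sage and the myrtle.", "The sunflower."]
word_list = concordance(sample_pages)
for word, page_numbers in word_list.items():
    print(word, page_numbers)
```

Inconsequential words such as "the" and "and" receive entries just like the plant names, and nothing gathers the plants under a shared heading; supplying that analysis is what turns a word list into an index.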

Sample back-of-the-book index excerpt:

sage, 41–42. See also Herbs ← directing the reader to related terms
Scarlet Sages. See Salvia coccinea ← redirecting the reader to term used in the text
shade plants ← grouping term (may not appear in the text; may be generated by indexer)
    hosta, 93 ← subentries
    myrtle, 46
sunflower, 47 ← regular entry

In books, indexes are usually placed near the end (this is commonly known as "BoB" or back-of-book indexing). They complement the table of contents by enabling access to information by specific subject, whereas contents listings enable access through broad divisions of the text arranged in the order they occur. It has been remarked that, while "[a]t first glance the driest part of the book, on closer inspection the index may provide both interest and amusement from time to time." [10]

Index quality

Some principles of good indexing include: [11]

Indexing pitfalls:

Indexer roles

Some indexers specialize in specific formats, such as scholarly books, microforms, web indexing (the application of a back-of-book-style index to a website or intranet), search engine indexing, database indexing (the application of a pre-defined controlled vocabulary such as MeSH to articles for inclusion in a database), and periodical indexing [12] (indexing of newspapers, journals, magazines).

Some indexers with expertise in controlled vocabularies also work as taxonomists and ontologists.

Some indexers specialize in particular subject areas, such as anthropology, business, computers, economics, education, government documents, history, law, mathematics, medicine, psychology, and technology. An indexer can be found for any subject.

In "The Library of Babel", a short story by Jorge Luis Borges, there is an index of indexes that catalogues all of the books in the library, which contains all possible books.

Kurt Vonnegut's novel Cat's Cradle includes a character who is a professional indexer and believes that "indexing [is] a thing that only the most amateurish author [undertakes] to do for his own book." She claims to be able to read an author's character through the index he created for his own history text, and warns the narrator, an author, "Never index your own book."

Vladimir Nabokov's novel Pale Fire includes a parody of an index, reflecting the insanity of the narrator.

Mark Danielewski's novel House of Leaves contains an exhaustive 41-page index of words in the novel, including even large listings for inconsequential words such as "the", "and", and "in".

J. G. Ballard's "The Index" is a short story told through the form of an index to an "unpublished and perhaps suppressed" autobiography. [13]



The American Society for Indexing, Inc. (ASI) is a national association founded in 1968 to promote excellence in indexing and increase awareness of the value of well-designed indexes. ASI serves indexers, librarians, abstractors, editors, publishers, database producers, data searchers, product developers, technical writers, academic professionals, researchers and readers, and others concerned with indexing. It is the only professional organization in the United States devoted solely to the advancement of indexing, abstracting and related methods of information retrieval.

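Other similar societies include ASAIB, [14] the Australian and New Zealand Society of Indexers, [15] the British Record Society, [16] the China Society of Indexers (中国索引学会), [17] the German Network of Indexers, [18] the Indexing Society of Canada, [19] the Nederlands Indexers Netwerk (NIN), [20] and the Society of Indexers. [21]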

References

  1. "Search vs Index". 2013-04-05. Retrieved 2019-02-17.
  2. Knight, G. Norman (1979). Indexing, the Art of: A Guide to the Indexing of Books and Periodicals. HarperCollins. pp. 17–18.
  3. Mascall, Leonard (1575). A booke of the arte and maner how to plant and graffe all sortes of trees. London: John Wight.
  4. Commire, Anne, ed. (1999–2002). Women in World History: A Biographical Encyclopedia. Detroit: Yorkin Publications. ISBN 0-7876-3736-X.
  5. Ritter, R. M., ed. (2003). The Oxford Style Manual. Oxford: Oxford University Press. p. 772.
  6. "7.6: Alternative plural forms". The Chicago Manual of Style (16th ed.). Chicago: University of Chicago Press. 2010. ISBN 978-0-226-10420-1.
  7. "Frequently Asked Questions". The American Society for Indexing. Retrieved 2019-07-10.
  8. "Software". The American Society for Indexing. Retrieved 2016-12-21.
  9. "Equipment, technology and software". Society of Indexers. Retrieved 2019-07-10.
  10. Collison, Robert L. (1957). Book Collecting. London. p. 121.
  11. "Creating Online Help (Part 2): Strategies and Implementation". Archived from the original on 2009-04-19.
  12. Weaver, Carolyn. "The Gist of Journal Indexing". Key Words 10.1 (Jan./Feb. 2002), 16–22. Archived 2008-10-29 at the Wayback Machine.
  13. "The Index".
  14. "ASAIB – Home".
  15. "Home – Australian and New Zealand Society of Indexers".
  16. "Home – British Record Society".
  17. "中国索引学会". Retrieved 2014-02-23.
  18. "German Network of Indexers: Welcome".
  19. "Home Accueil – Indexing Society of Canada".
  20. "NIN – Nederlands Indexers Netwerk".
  21. "Home :: The Society of Indexers".
