Bradford's law

Visual representation of Bradford's law.

Bradford's law is a pattern first described by Samuel C. Bradford in 1934 that estimates the exponentially diminishing returns of searching for references in science journals. One formulation is that if journals in a field are sorted by number of articles into three groups, each with about one-third of all articles, then the number of journals in each group will be proportional to 1:n:n². [1] There are a number of related formulations of the principle.
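The three-group formulation can be illustrated with a minimal sketch. The function name bradford_zones and the journal counts below are hypothetical, chosen only so that the zones come out evenly; real bibliographic data will only approximate the 1:n:n² ratio.

```python
from typing import List


def bradford_zones(article_counts: List[int], zones: int = 3) -> List[int]:
    """Count the journals in each zone (hypothetical helper).

    Journals are sorted from most to least productive and assigned to
    zones so that each zone holds about an equal share of all articles.
    """
    counts = sorted(article_counts, reverse=True)
    total = sum(counts)
    sizes = [0] * zones
    running = 0  # articles assigned so far
    zone = 0
    for count in counts:
        # Move on once the zones filled so far hold their share of articles.
        # Integer arithmetic avoids floating-point comparison issues.
        while zone < zones - 1 and running * zones >= total * (zone + 1):
            zone += 1
        sizes[zone] += 1
        running += count
    return sizes


# Synthetic data: 3 journals with 36 articles, 9 with 12, and 27 with 4,
# so each group of journals carries 108 of the 324 articles.
synthetic = [36] * 3 + [12] * 9 + [4] * 27
print(bradford_zones(synthetic))  # [3, 9, 27], i.e. 1:n:n² with n = 3
```

Here the core zone holds 3 journals, the second 9 and the third 27, so each zone needs n = 3 times as many journals as the previous one to supply the same number of articles.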

In many disciplines, this pattern is called a Pareto distribution. As a practical example, suppose that a researcher has five core scientific journals for his or her subject. Suppose that in a month there are 12 articles of interest in those journals. Suppose further that in order to find another dozen articles of interest, the researcher would have to go to an additional 10 journals. Then that researcher's Bradford multiplier b_m is 2 (i.e. 10/5). For each new dozen articles, that researcher will need to look in b_m times as many journals. After looking in 5, 10, 20, 40, etc. journals, most researchers quickly realize that there is little point in looking further.
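The arithmetic of this example can be sketched in a few lines. The function names zone_sizes and cumulative_journals are illustrative assumptions, not part of any library; the inputs (5 core journals, multiplier 2) are the ones from the example above.

```python
from itertools import accumulate
from typing import List


def zone_sizes(core_journals: int, multiplier: int, zones: int) -> List[int]:
    """Journals needed for each successive dozen articles (hypothetical helper).

    Zone k holds core_journals * multiplier**k journals.
    """
    return [core_journals * multiplier ** k for k in range(zones)]


def cumulative_journals(sizes: List[int]) -> List[int]:
    """Total journals scanned after each zone has been searched."""
    return list(accumulate(sizes))


sizes = zone_sizes(core_journals=5, multiplier=2, zones=4)
print(sizes)                       # [5, 10, 20, 40]
print(cumulative_journals(sizes))  # [5, 15, 35, 75]
```

Under these assumptions, collecting four dozen articles means scanning 75 journals in total, 40 of them for the last dozen alone, which is the diminishing return the example describes.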

Different researchers have different numbers of core journals, and different Bradford multipliers. But the pattern holds quite well across many subjects, and may well be a general pattern for human interactions in social systems. Like Zipf's law, to which it is related, we do not have a good explanation for why it works, but knowing that it does is very useful for librarians. What it means is that for each specialty, it is sufficient to identify the "core publications" for that field and only stock those; very rarely will researchers need to go outside that set.

However, its impact has been far greater than that. Armed with this idea and inspired by Vannevar Bush's famous article As We May Think, Eugene Garfield at the Institute for Scientific Information in the 1960s developed a comprehensive index of how scientific thinking propagates. His Science Citation Index (SCI) had the effect of making it easy to identify exactly which scientists did science that had an impact, and which journals that science appeared in. It also caused the discovery, which some did not expect, that a few journals, such as Nature and Science, were core for all of hard science. The same pattern does not happen with the humanities or the social sciences.

The result of this is pressure on scientists to publish in the best journals, and pressure on universities to ensure access to that core set of journals. On the other hand, the set of "core journals" may vary more or less strongly with the individual researchers, and even more strongly along schools-of-thought divides. There is also a danger of over-representing majority views if journals are selected in this fashion.

Scattering

Bradford's law is also known as Bradford's law of scattering or the Bradford distribution, as it describes how the articles on a particular subject are scattered throughout the mass of periodicals. [2] Another more general term that has come into use since 2006 is information scattering, an often observed phenomenon related to information collections where there are a few sources that have many items of relevant information about a topic, while most sources have only a few. [3] This law of distribution in bibliometrics can be applied to the World Wide Web as well. [4]

Hjørland and Nicolaisen identified three kinds of scattering: [5]

  1. Lexical scattering. The scattering of words in texts and in collections of texts.
  2. Semantic scattering. The scattering of concepts in texts and in collections of texts.
  3. Subject scattering. The scattering of items useful to a given task or problem.

They found that the literature of Bradford's law (including Bradford's own papers) is unclear in relation to which kind of scattering is actually being measured.

Law's interpretations

The interpretation of Bradford's law in terms of a geometric progression was suggested by V. Yatsko, [6] who introduced an additional constant and demonstrated that the Bradford distribution can be applied to a variety of objects, not only to the distribution of articles or citations across journals. V. Yatsko's interpretation (Y-interpretation) can be used to compute threshold values when it is necessary to distinguish subsets within a set of objects (successful/unsuccessful applicants, developed/underdeveloped regions, etc.).

Notes

  1. Black, Paul E. (2004-12-12). "Bradford's law, in Dictionary of Algorithms and Data Structures". U.S. National Institute of Standards and Technology. Retrieved 2007-10-24.
  2. Vickery, B. C. (1948-01-01). "Bradford's Law of Scattering". Journal of Documentation. 4 (3): 198–203. doi:10.1108/eb026133. ISSN 0022-0418.
  3. "Information Scattering". Encyclopedia of Library and Information Sciences, Third Edition. CRC Press. 2009-12-17. pp. 2564–2569. doi:10.1081/E-ELIS3-120043255. ISBN 978-0-203-75763-5.
  4. Turnbull, Don (1997). "Bibliometrics and the World Wide Web". University of Toronto Technical Report. Archived from the original on 2007-04-02. Retrieved 2007-07-05.
  5. Hjørland, Birger; Nicolaisen, Jeppe (2005). "Bradford's law of scattering: ambiguities in the concept of 'subject'". 5th International Conference on Conceptions of Library and Information Science. pp. 96–106.
  6. Yatsko, V. A. (2012). "The interpretation of Bradford's law in terms of geometric progression". Automatic Documentation and Mathematical Linguistics. 46 (2): 112–117. doi:10.3103/S0005105512020094. S2CID 255432905.
