The Open Content Alliance (OCA) was a consortium of organizations contributing to a permanent, publicly accessible archive of digitized texts. Its creation was announced in October 2005 by Yahoo!, the Internet Archive, the University of California, the University of Toronto and others. [1] Scanning for the Open Content Alliance was administered by the Internet Archive, which also provided permanent storage and access through its website.
The OCA was, in part, a response to Google Book Search, which was announced in October 2004. OCA's approach to seeking permission from copyright holders differed significantly from that of Google Book Search. OCA digitized copyrighted works only after asking and receiving permission from the copyright holder ("opt-in"). By contrast, Google Book Search digitized copyrighted works unless explicitly told not to do so ("opt-out"), and Google contended that digitizing for the purposes of indexing is fair use.
Microsoft had a special relationship with the Open Content Alliance until May 2008. Microsoft joined the Open Content Alliance in October 2005 as part of its Live Book Search project. [2] However, in May 2008 Microsoft announced it would end the Live Book Search project and stop funding the scanning of books through the Internet Archive. [3] Microsoft removed all contractual restrictions on the content it had scanned and relinquished the scanning equipment to its digitization partners and libraries so that they could continue their digitization programs. [3] Between about 2006 and 2008 Microsoft sponsored the scanning of over 750,000 books, 300,000 of which are now part of the Internet Archive's online collections.
Brewster Kahle, a founder of the Open Content Alliance, actively opposed the proposed Google Book Settlement until its defeat in March 2011.
The following are contributors to the OCA:
Biodiversity Heritage Library, a cooperative project of:
The Internet Archive is an American nonprofit organization founded in 1996 by Brewster Kahle that runs a digital library website, archive.org. It provides free access to collections of digitized materials including websites, software applications, music, audiovisual, and print materials. The Archive also advocates a free and open Internet. As of September 5, 2024, the Internet Archive held more than 42.1 million print materials, 13 million videos, 1.2 million software programs, 14 million audio files, 5 million images, 272,660 concerts, and over 866 billion web pages in its Wayback Machine. Its stated mission is to provide "universal access to all knowledge".
The Million Book Project was a book digitization project led by Carnegie Mellon University School of Computer Science and University Libraries from 2001 to 2008. Working with government and research partners in India and China, the project scanned books in many languages, used OCR to enable full-text searching, and provided free-to-read access to the books on the web. By 2007, the project had completed the scanning of 1 million books and made the entire catalog accessible online.
Digitization is the process of converting information into a digital format. The result is the representation of an object, image, sound, document, or signal obtained by generating a series of numbers that describe a discrete set of points or samples. The result is called digital representation or, more specifically, a digital image, for the object, and digital form, for the signal. In modern practice, the digitized data is in the form of binary numbers, which facilitates processing by digital computers and other operations, but digitizing simply means "the conversion of analog source material into a numerical format"; the decimal or any other number system can be used instead.
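The idea of describing a signal by a discrete set of numeric samples can be illustrated with a minimal sketch. The function name digitize, the 1 Hz sine used as a stand-in for an analog source, and the choice of 8 samples and 16 levels are all illustrative assumptions, not part of any particular digitization standard.

```python
import math

def digitize(signal, duration, sample_rate, levels):
    """Sample signal(t) on [0, duration) and quantize each sample to an integer code.

    Assumes signal(t) returns amplitudes in the range [-1.0, 1.0].
    """
    n_samples = int(duration * sample_rate)
    codes = []
    for i in range(n_samples):
        t = i / sample_rate                          # discrete sampling instant
        value = signal(t)                            # "analog" amplitude
        scaled = (value + 1.0) / 2.0 * (levels - 1)  # map [-1, 1] onto [0, levels - 1]
        codes.append(round(scaled))                  # snap to the nearest level
    return codes

def sine_1hz(t):
    # Hypothetical analog source: one cycle per second.
    return math.sin(2 * math.pi * t)

codes = digitize(sine_1hz, duration=1.0, sample_rate=8, levels=16)

# The same samples written as 4-bit binary numbers; any other number
# system (for example decimal) would represent them equally well.
binary = [format(c, "04b") for c in codes]

print(codes)
print(binary)
```

Raising sample_rate or levels in this sketch gives a finer description of the same signal at the cost of more numbers to store.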
Google Books is a service from Google that searches the full text of books and magazines that Google has scanned, converted to text using optical character recognition (OCR), and stored in its digital database. Books are provided either by publishers and authors through the Google Books Partner Program, or by Google's library partners through the Library Project. Additionally, Google has partnered with a number of magazine publishers to digitize their archives.
A universal library is a library with universal collections. This may be expressed in terms of it containing all existing information, useful information, all books, all works or even all possible works. This ideal, although unrealizable, has long influenced librarians and others and remains a goal to which they aspire. Universal libraries are often assumed to have a complete set of useful features.
Live Search Books was a search service for books launched in December 2006, part of Microsoft's Live Search range of services. Microsoft was working with a number of libraries, including the British Library, to digitize books and make them searchable, and in the case of out-of-copyright books, available across the web.
Google News Archive is an extension of Google News providing free access to scanned archives of newspapers and links to other newspaper archives on the web, both free and paid.
Book scanning or book digitization is the process of converting physical books and magazines into digital media such as images, electronic text, or electronic books (e-books) by using an image scanner. Large scale book scanning projects have made many books available online.
Sidney Verba was an American political scientist, librarian and library administrator. His academic interests were mainly American and comparative politics. He was the Carl H. Pforzheimer University Professor at Harvard University and also served Harvard as the director of the Harvard University Library from 1984 to 2007.
The Michigan Digitization Project is a project in partnership with Google Books to digitize the entire print collection of the University of Michigan Library. The digitized collection is available through the University of Michigan Library catalog, Mirlyn, the HathiTrust Digital Library, and Google Books. The full text of works that are out of copyright or in the public domain is available.
The Biodiversity Heritage Library (BHL) is the world’s largest open-access digital library for biodiversity literature and archives. BHL operates as a worldwide consortium of natural history, botanical, research, and national libraries working together to digitize the natural history literature held in their collections and make it freely available for open access as part of a global "biodiversity community". The BHL consortium works with the international taxonomic community, publishers, bioinformaticians, and information technology professionals to develop tools and services to facilitate greater access, interoperability, and reuse of content and data. BHL provides a range of services, data exports, and APIs to allow users to download content, harvest source data files, and reuse materials for research purposes. Through taxonomic intelligence tools developed by Global Names Architecture, BHL indexes the taxonomic names throughout the collection, allowing researchers to locate publications about specific taxa. In partnership with the Internet Archive and through local digitization efforts, BHL's portal provides free access to hundreds of thousands of volumes, comprising over 59 million pages, from the 15th–21st centuries.
The Hawkins Electrical Guide was a technical engineering book written by Nehemiah Hawkins, first published in 1914, intended to explain the highly complex principles of the new technology of electricity in a way that could be understood by the common man. The book is notable for the extremely high number of detailed illustrations it contains, and the small softbound size of the volumes.
HathiTrust Digital Library is a large-scale collaborative repository of digital content from research libraries including content digitized via Google Books and the Internet Archive digitization initiatives, as well as content digitized locally by libraries.
A digital library is an online database of digital objects that can include text, still images, audio, video, digital documents, or other digital media formats or a library accessible through the internet. Objects can consist of digitized content like print or photographs, as well as originally produced digital content like word processor files or social media posts. In addition to storing content, digital libraries provide means for organizing, searching, and retrieving the content contained in the collection. Digital libraries can vary immensely in size and scope, and can be maintained by individuals or organizations. The digital content may be stored locally, or accessed remotely via computer networks. These information retrieval systems are able to exchange information with each other through interoperability and sustainability.
Authors Guild v. Google, 804 F.3d 202, was a copyright case heard in the federal District Court for the Southern District of New York, and then in the Second Circuit Court of Appeals, between 2005 and 2015. It concerned fair use in copyright law and the transformation of printed copyrighted books into an online searchable database through scanning and digitization. It centered on the legality of the Google Book Search Library Partner project that had been launched in 2003.
The Digital Public Library of America (DPLA) is a US project aimed at providing public access to digital holdings in order to create a large-scale public digital library. It officially launched on April 18, 2013, after two-and-a-half years of development.
The Biodiversity Heritage Library for Europe (BHL-Europe) was a three-year (2009–2012) EU project aimed at coordinating the digitization of literature on biodiversity. It involved 28 major natural history museums, botanical gardens, libraries and other European institutions. BHL-Europe was founded in Berlin in May 2009 and regarded itself as a European partner project of the Biodiversity Heritage Library (BHL) project, which was founded in 2005 and initially formed by ten United States and British libraries.
Authors Guild v. HathiTrust, 755 F.3d 87, is a United States copyright decision finding search and accessibility uses of digitized books to be fair use.
American Libraries is a digital collection of ebooks and texts at the Internet Archive. This collection contains over 1,900,000 items sponsored by these partners:
The Boston Library Consortium (BLC) is a library consortium based in the Boston area with 26 member institutions across New England.