Open Library

Type of site: Digital library index
Available in: English, Czech, German, French, Croatian, Italian, Portuguese, Telugu, Chinese, Ukrainian
Founder(s): Aaron Swartz, Brewster Kahle, Alexis Rossi, Anand Chitipothu, and Rebecca Malamud
Revenue: Donation
Launched: 2006
Current status: Active
Content license

Open Library is an online project intended to create "one web page for every book ever published". Created by Aaron Swartz, [3] [4] Brewster Kahle, [5] Alexis Rossi, [6] Anand Chitipothu, [6] and Rebecca Malamud, [6] Open Library is a project of the Internet Archive, a nonprofit organization. It has been funded in part by grants from the California State Library and the Kahle/Austin Foundation. Open Library provides online digital copies in multiple formats, created from images of many public domain, out-of-print, and in-print books.


Book database and digital lending library

Its book information is collected from the Library of Congress and other libraries, as well as from user contributions through a wiki-like interface. [4] If a book is available in digital form, a button labeled "Read" appears next to its catalog listing. Digital copies of the contents of each scanned book are distributed as encrypted e-books (created from images of scanned pages), audiobooks and streaming audio (created from the page images using OCR and text-to-speech software), and unencrypted images of full pages, as well as through APIs for automated downloading of page images. [7] Links to where books can be purchased or borrowed are also provided.

There are different entities in the database:

Open Library claims to have over 20 million records in its database. [8] Copies of the contents of tens of thousands of modern books have been made available from 150 libraries and publishers for controlled digital lending as ebooks. [9] Other books, including in-print and in-copyright books, have been scanned from copies in library collections, library discards, and donations, and are also available for lending in digital form. [10] In total, the Open Library offers copies of over 1.4 million books for what it calls "digital lending", but critics have called the distribution of digital copies a violation of copyright law. [11]


Open Library began in 2006 with Aaron Swartz as the original engineer and leader of its technical team. [3] [4] George Oates led the project from April 2009 to December 2011 [12] and was responsible for a complete site redesign during her tenure. [13] In 2015, the project was continued by Giovanni Damiola, [6] and then in 2016 by Brenton Cheng [6] and Mek Karpeles. [6]

The site was redesigned and relaunched in May 2010. Its codebase is hosted on GitHub. [14] The site uses Infobase, its own database framework based on PostgreSQL, and Infogami, its own wiki engine written in Python. [15] The source code of the site is published under the GNU Affero General Public License. [16] [2]

Book sponsorship program

In the week of October 21, 2019, the Open Library website introduced a Book Sponsorship program, [17] which, according to Cory Doctorow, "lets you direct a cash donation to pay for the purchase and scanning of any books. In return, you are first in line to check that book out when it is available, and then anyone who holds an Open Library library card can check it out." [18] The feature was developed by Mek Karpeles, Tabish Shaikh, [6] and other members of the community. [19]

Books for the blind and dyslexic

In May 2010, the website was relaunched with ADA compliance, offering over one million modern and older books to print-disabled users [20] in the DAISY Digital Talking Book format. [21] Under certain provisions of United States copyright law, libraries are sometimes able to reproduce copyrighted works in formats accessible to users with disabilities. [22] [23]

The Open Library has justified its ability to offer the full contents of books in digital formats by appealing to the first-sale doctrine and fair use law. [24] [25] The Open Library owns a physical copy of each book that it has made available, and thus argues that lending out one digital scan of the book in a controlled manner falls within the first-sale doctrine. This practice is known as controlled digital lending and is used by multiple public and academic libraries. [25]
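The one-copy, one-loan constraint described above can be illustrated with a minimal sketch. It is not Open Library's actual code: the `Title` class, its fields, and its method names are hypothetical, and the waitlist behaviour is an assumption made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Title:
    """Hypothetical lending state for one book title."""
    owned_copies: int                               # physical copies the library owns
    on_loan: set = field(default_factory=set)       # patrons holding a digital loan
    waitlist: deque = field(default_factory=deque)  # patrons waiting, in order

    def borrow(self, patron: str) -> bool:
        """Lend a digital copy only while loans stay within owned copies."""
        if patron in self.on_loan:
            return True
        if len(self.on_loan) < self.owned_copies:
            self.on_loan.add(patron)
            return True
        # No free copy: queue the patron (assumed behaviour) and refuse the loan.
        if patron not in self.waitlist:
            self.waitlist.append(patron)
        return False

    def give_back(self, patron: str) -> None:
        """Return a loan and hand the freed copy to the next waiting patron."""
        self.on_loan.discard(patron)
        if self.waitlist and len(self.on_loan) < self.owned_copies:
            self.on_loan.add(self.waitlist.popleft())
```

With `Title(owned_copies=1)`, a first call to `borrow` succeeds, a second patron is placed on the waitlist, and `give_back` passes the copy on to that patron. The number of simultaneous loans never exceeds the number of owned copies, which is the invariant that the controlled digital lending argument relies on.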

Since its launch, the Open Library has been accused of mass copyright violation by numerous groups, [25] including the American Authors Guild, [26] the British Society of Authors, [27] the Australian Society of Authors, [28] the Science Fiction and Fantasy Writers of America, [29] the US National Writers Union, [30] and a coalition of 37 national and international organizations of "writers, translators, photographers, and graphic artists; unions, organizations, and federations representing the creators of works included in published books; book publishers; and reproduction rights and public lending rights organizations". [31] The UK Society of Authors threatened legal action in 2019 unless the Open Library agreed to cease distribution of copyrighted works. [32]

Hachette v. Internet Archive

The Open Library came under further criticism from several authors' and publishers' groups when it created the National Emergency Library in response to the COVID-19 pandemic in March 2020. The National Emergency Library removed the waitlists for all books in the Open Library collection and allowed any number of digital copies of a book to be downloaded as an encrypted file that would become unusable after two weeks. The Open Library asserted that this unlimited borrowing was a reasonable exception under the national emergency, allowing educational functions to continue while physical libraries and bookstores were forced to close. [25] The Authors Guild, the Association of American Publishers, the National Writers Union, and others argued that this allowed unlimited copyright infringement and denied revenues from distribution of authorized digital copies of books to authors who also needed relief during the COVID-19 national emergency. [25] Though the Open Library asserted that the copies of entire books in e-book format were still encrypted and that the unlimited borrowing was for educational purposes, the National Writers Union asserted that images of each page of each book could still be accessed on the Web without encryption or other controls. [7] [33]

Four major publishers (Hachette, Penguin Random House, John Wiley & Sons, and HarperCollins, all members of the Association of American Publishers) filed a lawsuit against the Internet Archive in the United States District Court for the Southern District of New York in June 2020, asserting that the Open Library project violated numerous copyrights. [34] In their suit, the publishers claimed "Without any license or any payment to authors or publishers, [the Internet Archive] scans print books, uploads these illegally scanned books to its servers, and distributes verbatim digital copies of the books in whole via public-facing websites. With just a few clicks, any Internet-connected user can download complete digital copies of in-copyright books from [the] defendant." [35] The publishers were represented by the law firms Davis Wright Tremaine and Oppenheim + Zebrak. [36] The Internet Archive ended the National Emergency Library on June 16, 2020, instead of the intended June 30 date, and asked the publishers to "call off their costly assault". [37] In July 2022, both parties filed requests for summary judgement. A first hearing was held on March 20, 2023. [38] A summary judgement was issued on March 24, 2023, in favor of the plaintiffs. In its ruling, the court determined that the Internet Archive committed copyright infringement by scanning and distributing copies of books online. The suit stemmed from the creation of the National Emergency Library (NEL) during the onset of the COVID-19 pandemic; the publishers alleged that the Open Library and the National Emergency Library facilitated copyright infringement.

Following the ruling of March 24, 2023, the Internet Archive said that it planned to appeal. [39]



  1. Bookfinch; Chitipothu, Anand; Oates, George; West, Jessamyn (2013-10-10). "Using Open Library Data § Who owns the Open Library catalog?". Archived from the original on 2022-04-11. Retrieved 2021-04-06.
  2. "openlibrary/LICENSE at master · internetarchive/openlibrary · GitHub". Archived from the original on 2017-01-22. Retrieved 2015-06-26.
  3. "A library bigger than any building". BBC News. 2007-07-31. Archived from the original on 2009-11-27. Retrieved 2010-07-06.
  4. Grossman, Wendy M (2009-01-22). "Why you can't find a library book in your search engine". The Guardian. London. Archived from the original on 2014-01-14. Retrieved 2010-07-06.
  5. "Aaron Swartz: howtoget". Archived from the original on 2015-05-23. Retrieved 2015-06-05.
  6. "The Open Library Team". Open Library. Archived from the original on 2018-07-17. Retrieved 2018-07-16.
  7. Hasbrouck, Edward (2020-04-16). "What is the Internet Archive doing with our books?". National Writers Union. Retrieved 2020-05-07.
  8. "About Us". Archived from the original on 2015-06-27. Retrieved 2015-06-26.
  9. "Internet Archive Forums: In-Library eBook Lending Program Launched". 2011-02-22. Archived from the original on 2015-07-17. Retrieved 2015-06-26.
  10. "FAQ on Controlled Digital Lending (CDL)". 2019-02-13. Archived from the original on 2020-03-30. Retrieved 2019-02-14.
  11. Lee, Timothy B. (2020-03-28). "Internet Archive offers 1.4 million copyrighted books for free online". Ars Technica. Archived from the original on 2020-03-28. Retrieved 2020-04-20.
  12. "George". Archived from the original on 2017-02-22. Retrieved 2015-06-26.
  13. Oates, George (2010-03-17). "Announcing the Open Library redesign « The Open Library Blog". Archived from the original on 2015-06-27. Retrieved 2015-06-26.
  14. "internetarchive/openlibrary · GitHub". Archived from the original on 2015-08-10. Retrieved 2015-06-26.
  15. "About the Technology". Archived from the original on 2015-06-27. Retrieved 2015-06-26.
  16. "Developers / Licensing". Archived from the original on 2015-06-27. Retrieved 2015-06-26.
  17. "The Internet Archive Book Drive | Open Library". Archived from the original on 2022-06-05. Retrieved 2022-06-05.
  18. Doctorow, Cory (2019-10-22). "The Internet Archive's Open Library will let you sponsor a book, paying for it to be scanned". BoingBoing. Archived from the original on 2019-10-23. Retrieved 2019-10-24.
  19. El-Sabrout, Omar Rafik (2019-10-23). "Scan On Demand: Building the World's Open Library, Together". The Open Library Blog. Archived from the original on 2019-10-24. Retrieved 2019-10-24.
  20. "Project puts 1M books online for blind, dyslexic". 2010-05-05. Archived from the original on 2011-12-17. Retrieved 2015-06-26.
  21. "Welcome to Daisy Books for the Print Disabled". Internet Archive. Archived from the original on 2013-01-04. Retrieved 2012-12-10.
  22. "NLS Factsheets: Copyright Law Amendment, 1996: PL 104-197". Library of Congress NLS Factsheets. Library of Congress. Archived from the original on 2017-05-21.
  23. Scheid, Maria. "Copyright and Accessibility". Copyright Corner. The Ohio State University Libraries. Archived from the original on 2016-06-30.
  24. Hansen, David R.; Courtney, Kyle K. (2018). A White Paper on Controlled Digital Lending of Library Books (Report). Controlled Digital Lendings by Libraries. Archived from the original on 2019-08-02. Retrieved 2020-04-02.
  25. Grady, Constance (2020-04-02). "Why authors are so angry about the Internet Archive's Emergency Library". Vox. Archived from the original on 2020-04-04. Retrieved 2020-04-02.
  26. The Authors Guild. "Open Letter to Internet Archive and Other Proponents of 'Controlled Digital Lending'". JotForm. Archived from the original on 2019-07-28. Retrieved 2019-04-04.
  27. The Society of Authors. "Open letter to Internet Archive about 'Controlled Digital Lending'". JotForm. Archived from the original on 2019-07-28. Retrieved 2019-04-04.
  28. "Open Library: copyright infringement". Australian Society of Authors. 2019-01-21. Archived from the original on 2019-08-20. Retrieved 2019-02-10.
  29. "Infringement Alert". Science Fiction and Fantasy Writers of America. 2018-01-08. Archived from the original on 2019-02-12. Retrieved 2019-02-10.
  30. Hasbrouck, Edward (2019-02-13). "NWU denounces 'Controlled Digital Lending'". National Writers Union.
  31. "Controlled Digital Lending (CDL): An appeal to readers and librarians from the victims of CDL". National Writers Union. 2019-02-13. Archived from the original on 2020-07-28. Retrieved 2019-02-14.
  32. Flood, Alison (2019-01-22). "Internet Archive's ebook loans face UK copyright challenge". The Guardian. London. Archived from the original on 2019-02-12. Retrieved 2019-02-10.
  33. Hasbrouck, Edward (2020-03-24). "Internet Archive removes controls on "lending" of bootleg e-books". National Writers Union. Retrieved 2020-05-07.
  34. Bustillos, Maria (2020-09-10). "Publishers Are Taking the Internet to Court". The Nation. Archived from the original on 2021-08-23. Retrieved 2020-10-19.
  35. Brandom, Russell (2020-06-01). "Publishers sue Internet Archive over Open Library ebook lending". The Verge. Archived from the original on 2020-06-01. Retrieved 2020-06-01.
  36. "Publishers File Suit Against Internet Archive for Systematic Mass Scanning and Distribution of Literary Works". AAP. 2020-06-01. Archived from the original on 2020-06-05. Retrieved 2020-06-05.
  37. Lee, Timothy (2020-06-11). "Internet Archive ends "emergency library" early to appease publishers". Ars Technica. Archived from the original on 2020-06-14. Retrieved 2020-06-14.
  38. Albanese, Andrew (2023-02-21). "Oral Argument Set in Internet Archive Copyright Case". Publishers Weekly. Archived from the original on 2023-03-18. Retrieved 2023-03-18.
  39. Peters, Jay; Hollister, Sean (2023-03-24). "The Internet Archive has lost its first fight to scan and lend e-books like a library". The Verge. Retrieved 2023-08-05.