HathiTrust

Type of site: Digital library
Owner: University consortium
Revenue: US$3,777,445 (2019 projections for proposal) [1]
URL: hathitrust.org
Commercial: Partially [2]
Launched: October 2008
Current status: Active
Content license: Public domain (with restrictions on Google scans), various [3]
Written in: Perl, Java [2]

HathiTrust Digital Library is a large-scale collaborative repository of digital content from research libraries, including content digitized via the Google Books and Internet Archive digitization initiatives as well as content digitized locally by libraries.

Etymology

Hathi (pronounced hah-tee), derived from the Sanskrit hastin, is the Hindi/Urdu word for 'elephant', an animal famed for its long-term memory. [4]

History

HathiTrust was founded in October 2008 by the twelve universities of the Committee on Institutional Cooperation and the eleven libraries of the University of California. [5] The partnership includes over 60 research libraries [6] across the United States, Canada, and Europe, and is based on a shared governance structure. Costs are shared by the participating libraries and library consortia. [7] The repository is administered by the University of Michigan. [8] The executive director of HathiTrust is Mike Furlough, [9] who succeeded founding director John Wilkin after Wilkin stepped down in 2013. [10] The HathiTrust Shared Print Program is a distributed collective collection whose participating libraries have committed to retaining almost 18 million monograph volumes for 25 years, representing three-quarters of HathiTrust digital book holdings. [11]

In September 2011, the Authors Guild sued HathiTrust (Authors Guild, Inc. v. HathiTrust), alleging massive copyright violation. [12] A federal court ruled against the Authors Guild in October 2012, finding that HathiTrust's use of books scanned by Google was fair use under US law. [13] The court's opinion relied on the transformativeness doctrine of federal copyright law, holding that the Trust had transformed the copyrighted works without infringing on the copyright holders' rights. The Second Circuit largely affirmed that decision on June 10, 2014, finding that full-text search and access for visually impaired readers qualified as fair use. It remanded to the lower court the question of whether the plaintiffs had standing to sue regarding HathiTrust's library preservation copies. [14]

As of October 2015, HathiTrust comprised over 13.7 million volumes, including 5.3 million in the public domain in the United States. HathiTrust provides a number of discovery and access services, notably full-text search across the entire repository. In 2016, over 6.17 million users in the United States and 236 other nations used HathiTrust in 10.92 million sessions. [15]

As of 2021, the copyright policy states that "many works in our collection are protected by copyright law, so we cannot ordinarily publicly display large portions of those protected works unless we have permission from the copyright holder", and thus "if we cannot determine the copyright or permission status of a work, we restrict access to that work until we can establish its status. Because of differences in international copyright laws, access is also restricted for users outside the United States to works published outside the United States after and including 1896." [16]

PageTurner

PageTurner is the web application on the HathiTrust website for viewing publications. [17] From PageTurner readers can navigate through a publication, download a PDF version of it, and view pages in different ways, such as one page at a time, scrolling, flipping, or thumbnail views. [17] [18]

Emergency Temporary Access Service

The Emergency Temporary Access Service (ETAS) [19] is a service provided by HathiTrust that allows users of member libraries, in certain special situations such as the closure of a library during a public health emergency, to obtain lawful access to copyrighted digital materials in place of the corresponding physical books held by the same library.

References

  1. "2018 Member Meeting" (PDF). HathiTrust. October 2018. p. 56. Archived (PDF) from the original on 2019-01-18. Retrieved 2018-12-31. Slides in PDF.
  2. "Technological Profile". HathiTrust. Archived from the original on 16 June 2016. Retrieved 12 May 2016.
  3. "Access and Use Policies". HathiTrust. Archived from the original on 16 June 2016. Retrieved 12 May 2016.
  4. "Launch of HathiTrust: Major Library Partners Launch HathiTrust Shared Digital Repository" (Press release). HathiTrust. October 13, 2008. Archived from the original on August 19, 2019. Retrieved 2019-06-21.
  5. Karels, Liene (November 2010). "HathiTrust adds new members, goes global". University of Michigan. Archived from the original on 2014-03-02. Retrieved 2019-06-21.
  6. "HathiTrust Partnership Community". HathiTrust. Archived from the original on 5 September 2015. Retrieved 28 August 2015.
  7. "Cost". HathiTrust. Archived from the original on 2019-01-18. Retrieved 2019-06-21.
  8. "Governance". HathiTrust. Archived from the original on 2019-06-21. Retrieved 2019-06-21.
  9. "HathiTrust Staff". HathiTrust. Archived from the original on 2019-06-21. Retrieved 2019-06-21.
  10. "Update on May 2013 Activities". HathiTrust. June 14, 2013. Archived from the original on 2023-03-16. Retrieved 2023-03-16.
  11. "Shared Print Program". HathiTrust Digital Library. Archived from the original on 2019-11-27. Retrieved 2019-12-18.
  12. Bosman, Julie (September 12, 2011). "Lawsuit Seeks the Removal of a Digital Book Collection". The New York Times. Archived from the original on December 19, 2012. Retrieved November 1, 2012.
  13. Albanese, Andrew (11 October 2012). "Google Scanning Is Fair Use Says Judge". Publishers Weekly. Archived from the original on 15 October 2012. Retrieved 11 October 2012.
  14. "Authors Guild v. HathiTrust" (PDF). Archived from the original (PDF) on August 8, 2014.
  15. Zaytsev, Angelina (February 2017). "14 Million Books & 6 Million Visitors: HathiTrust Growth and Usage in 2016" (PDF). HathiTrust. Archived (PDF) from the original on 2017-02-24. Retrieved 2019-06-21.
  16. "Trust copyright policy - restrictions on access". Archived from the original on July 26, 2011.
  17. Meltzer, Ellen (May 9, 2011). "Viewing HathiTrust books just got better". cdlib.org. California Digital Library. Archived from the original on June 21, 2019. Retrieved 2019-06-21.
  18. "HathiTrust User's Guide" (PDF). HathiTrust. May 2012. p. 8. Archived (PDF) from the original on 2017-01-09. Retrieved 2019-06-21.
  19. "Emergency Temporary Access Service". HathiTrust Digital Library.