Project Gutenberg (PG) is a volunteer effort to digitize and archive cultural works, and to "encourage the creation and distribution of eBooks." As of 20 May 2020, Project Gutenberg had reached 62,108 items in its collection of free eBooks. It was founded in 1971 by American writer Michael S. Hart and is the oldest digital library. Most of the items in its collection are the full texts of books in the public domain. The Project tries to make these as free as possible, in long-lasting, open formats that can be used on almost any computer.
The releases are available in plain text, but other formats, such as HTML, PDF, EPUB, MOBI, and Plucker are included wherever possible. Most releases are in the English language, but many non-English works are also available. There are multiple affiliated projects that provide additional content, including region- and language-specific works. Project Gutenberg is closely affiliated with Distributed Proofreaders, an Internet-based community for proofreading scanned texts.
Michael S. Hart began Project Gutenberg in 1971 with the digitization of the United States Declaration of Independence. Hart, a student at the University of Illinois, obtained access to a Xerox Sigma V mainframe computer in the university's Materials Research Lab. Through friendly operators, he received an account with a virtually unlimited amount of computer time; its value at that time has since been variously estimated at $100,000 or $100,000,000. Hart explained he wanted to "give back" this gift by doing something one could consider to be of great value. His initial goal was to make the 10,000 most consulted books available to the public at little or no charge by the end of the 20th century.
This particular computer was one of the 15 nodes on ARPANET, the computer network that would become the Internet. Hart believed that one day the general public would be able to access computers, and he decided to make works of literature available in electronic form for free. He used a copy of the United States Declaration of Independence in his backpack, and this became the first Project Gutenberg e-text. He named the project for Johannes Gutenberg, the fifteenth-century German printer who propelled the movable type printing press revolution.
By the mid-1990s, Hart was running Project Gutenberg from Illinois Benedictine College. More volunteers had joined the effort. He manually entered all of the text until 1989, when image scanners and optical character recognition software improved and became more available, making book scanning more feasible. Hart later came to an arrangement with Carnegie Mellon University, which agreed to administer Project Gutenberg's finances. As the volume of e-texts increased, volunteers began to take over the project's day-to-day operations that Hart had run.
Starting in 2004, an improved online catalog made Project Gutenberg content easier to browse, access and hyperlink. Project Gutenberg is now hosted by ibiblio at the University of North Carolina at Chapel Hill.
Italian volunteer Pietro Di Miceli developed and administered the first Project Gutenberg website and started the development of the Project online Catalog. In his ten years in this role (1994–2004), the Project web pages won a number of awards, often being featured in "best of the Web" listings, and contributing to the project's popularity.
Hart died on 6 September 2011 at his home in Urbana, Illinois at the age of 64.
In 2000, a non-profit corporation, the Project Gutenberg Literary Archive Foundation, Inc. was chartered in Mississippi, United States to handle the project's legal needs. Donations to it are tax-deductible. Long-time Project Gutenberg volunteer Gregory Newby became the foundation's first CEO.
Also in 2000, Charles Franks founded Distributed Proofreaders (DP), which allowed the proofreading of scanned texts to be distributed among many volunteers over the Internet. This effort increased the number and variety of texts being added to Project Gutenberg, as well as making it easier for new volunteers to start contributing. DP became officially affiliated with Project Gutenberg in 2002. As of 2018, the 36,000+ DP-contributed books comprised almost two-thirds of the nearly 60,000 books in Project Gutenberg.
In August 2003, Project Gutenberg created a CD containing approximately 600 of the "best" e-books from the collection. The CD is available for download as an ISO image. When users are unable to download the CD, they can request to have a copy sent to them, free of charge.
In December 2003, a DVD was created containing nearly 10,000 items. At the time, this represented almost the entire collection. In early 2004, the DVD also became available by mail.
In July 2007, a new edition of the DVD was released containing over 17,000 books, and in April 2010, a dual-layer DVD was released, containing nearly 30,000 items.
The majority of the DVDs, and all of the CDs mailed by the project, were recorded on recordable media by volunteers. However, the new dual-layer DVDs were manufactured, as it proved more economical than having volunteers burn them. As of October 2010, the project had mailed approximately 40,000 discs. As of 2017, the delivery of free CDs has been discontinued, though the ISO image is still available for download.
As of August 2015, Project Gutenberg claimed over 60,000 items in its collection, with an average of over 50 new e-books being added each week. These are primarily works of literature from the Western cultural tradition. In addition to literature such as novels, poetry, short stories and drama, Project Gutenberg also has cookbooks, reference works and issues of periodicals. The Project Gutenberg collection also has a few non-text items such as audio files and music-notation files.
Most releases are in English, but there are also significant numbers in many other languages. As of April 2016, the non-English languages most represented are French, German, Finnish, Dutch, Italian, and Portuguese.
Whenever possible, Gutenberg releases are available in plain text, mainly using US-ASCII character encoding but frequently extended to ISO-8859-1 (needed to represent accented characters in French and the sharp s, ß, in German, for example). Besides being copyright-free, the requirement for a Latin (character set) text version of the release was a criterion of Michael Hart's from the founding of Project Gutenberg, as he believed this was the format most likely to be readable in the extended future. Out of necessity, this criterion has had to be extended further for the sizable collection of texts in East Asian languages such as Chinese and Japanese now in the collection, where UTF-8 is used instead.
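The progression from US-ASCII to ISO-8859-1 to UTF-8 can be illustrated with a minimal Python sketch. The function name `narrowest_encoding` and the sample strings are hypothetical and chosen only for illustration; this is not Project Gutenberg's own tooling. It simply tries each encoding in order and reports the first one that can represent the whole text.

```python
def narrowest_encoding(text: str) -> str:
    """Return the first of ASCII, ISO-8859-1, UTF-8 that can encode text.

    Hypothetical helper for illustration only; the order mirrors the
    fallback described in the paragraph above.
    """
    for name in ("ascii", "iso-8859-1", "utf-8"):
        try:
            text.encode(name)  # raises UnicodeEncodeError if a character does not fit
            return name
        except UnicodeEncodeError:
            continue
    raise ValueError("text cannot be encoded in any of the candidate encodings")


if __name__ == "__main__":
    # Illustrative samples: plain English, accented Latin text, Japanese.
    for sample in ("Declaration of Independence", "Straße café", "吾輩は猫である"):
        print(sample, "->", narrowest_encoding(sample))
```

Plain English text fits in ASCII, text with accented letters or ß needs at least ISO-8859-1, and Chinese or Japanese text falls through to UTF-8.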
Other formats may be released as well when submitted by volunteers. The most common non-ASCII format is HTML, which allows markup and illustrations to be included. Some project members and users have requested more advanced formats, believing them to be easier to read. But some formats that are not easily editable, such as PDF, are generally not considered to fit with the goals of Project Gutenberg. Project Gutenberg also has two options for master formats that can be submitted (from which all other files are generated): customized versions of the Text Encoding Initiative standard (since 2005) and reStructuredText (since 2011).
In 2009, the Project Gutenberg catalog began offering auto-generated alternate file formats, including HTML (when not already provided), EPUB and Plucker.
Michael Hart said in 2004, "The mission of Project Gutenberg is simple: 'To encourage the creation and distribution of ebooks'". His goal was "to provide as many e-books in as many formats as possible for the entire world to read in as many languages as possible". Likewise, a project slogan is to "break down the bars of ignorance and illiteracy", because its volunteers aim to continue spreading public literacy and appreciation for the literary heritage just as public libraries began to do in the late 19th century.
Project Gutenberg is intentionally decentralized; there is no selection policy dictating what texts to add. Instead, individual volunteers work on what they are interested in, or have available. The Project Gutenberg collection is intended to preserve items for the long term, so they cannot be lost by any one localized accident. In an effort to ensure this, the entire collection is backed up regularly and mirrored on servers in many different locations.
Project Gutenberg is careful to verify the status of its ebooks according to United States copyright law. Material is added to the Project Gutenberg archive only after it has received a copyright clearance, and records of these clearances are saved for future reference. Project Gutenberg does not claim new copyright on titles it publishes. Instead, it encourages their free reproduction and distribution.
Most books in the Project Gutenberg collection are distributed as public domain under United States copyright law. There are also a few copyrighted texts, such as those of science fiction author Cory Doctorow, that Project Gutenberg distributes with permission. These are subject to further restrictions as specified by the copyright holder, although they generally tend to be licensed under Creative Commons.
"Project Gutenberg" is a trademark of the organization, and the mark cannot be used in commercial or modified redistributions of public domain texts from the project. There is no legal impediment to the reselling of works in the public domain if all references to Project Gutenberg are removed, but Gutenberg contributors have questioned the appropriateness of directly and commercially reusing content that has been formatted by volunteers. There have been instances of books being stripped of attribution to the project and sold for profit in the Kindle Store and other booksellers, one being the 1906 book Fox Trapping.
The website is not accessible within Germany, as a result of a court order obtained by S. Fischer Verlag regarding the works of Heinrich Mann, Thomas Mann and Alfred Döblin. Although they were in the public domain in the United States, the German court (Frankfurt am Main Regional Court) recognized the infringement of copyrights still active in Germany, and asserted that the Project Gutenberg website was under German jurisdiction because it hosts content in the German language and is accessible in Germany. This judgment was confirmed by the Frankfurt Court of Appeal on 30 April 2019 (11 U 27/18). The Frankfurt Court of Appeal has not given permission for a further appeal to the Federal Court of Justice (Bundesgerichtshof); however, an application for permission to appeal has been filed with the Federal Court of Justice. As of 4 October 2020, that application was still pending (Federal Court of Justice I ZR 97/19).
The text files use the format of plain text encoded in UTF-8 and wrapped at 65–70 characters, with paragraphs separated by a double line break. In recent decades, the resulting relatively bland appearance and the lack of a markup possibility have often been perceived as a drawback of this format. Project Gutenberg attempts to address this by making many texts available in HTML, ePub, and PDF versions as well, but, faithful to the mission of offering data that is easy to handle with computer code, plain ASCII text remains the most important format; the ePub versions still contain extra line breaks between paragraphs, and the autogenerated HTML versions are simply the ASCII text between <pre> tags. Another not-for-profit project, Standard Ebooks, aims to address these issues with its collection of public domain titles that are formatted and styled. It corrects issues related to design and typography.
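As a rough illustration of why this layout is easy to handle with code, the following Python sketch splits hard-wrapped text into paragraphs at blank lines and rejoins each paragraph's wrapped lines. The function name `split_paragraphs` and the sample text are assumptions made for this example; real files also carry header and footer material that this sketch does not handle.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split hard-wrapped plain text into one string per paragraph.

    Hypothetical helper: assumes paragraphs are separated by at least
    one blank line, as described in the paragraph above.
    """
    # Normalize Windows and old-Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # One or more blank (or whitespace-only) lines end a paragraph.
    blocks = re.split(r"\n\s*\n", text.strip())
    # Rejoin the wrapped lines of each paragraph with single spaces.
    return [
        " ".join(line.strip() for line in block.split("\n"))
        for block in blocks
        if block.strip()
    ]


if __name__ == "__main__":
    sample = (
        "Call me Ishmael. Some years ago,\n"
        "never mind how long precisely.\n"
        "\n"
        "It is a way I have of driving\n"
        "off the spleen.\n"
    )
    for paragraph in split_paragraphs(sample):
        print(paragraph)
```

Unwrapping in this way discards the original line breaks, so it suits prose better than verse, where line breaks carry meaning.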
In December 1994, Project Gutenberg was criticized by the Text Encoding Initiative for failing to include documentation or discussion of the decisions unavoidable in preparing a text, or in some cases, not documenting which of several (conflicting) versions of a text has been the one digitized.
The selection of works (and editions) available has been determined by popularity, ease of scanning, being out of copyright, and other factors; this would be difficult to avoid in any crowd-sourced project.
In March 2004, an initiative was begun by Michael Hart and John S. Guagliardo to provide low-cost intellectual properties. The initial name for this project was Project Gutenberg 2 (PG II), which created controversy among PG volunteers because of the re-use of the project's trademarked name for a commercial venture.
All affiliated projects are independent organizations that share the same ideals and have been given permission to use the Project Gutenberg trademark. They often have a particular national or linguistic focus.
Michael Stern Hart was an American author, best known as the inventor of the e-book and the founder of Project Gutenberg (PG), the first project to make e-books freely available via the Internet. He published e-books via the ARPANET years before the Internet existed, and later on BBS networks and Gopher servers.
Distributed Proofreaders is a web-based project that supports the development of e-texts for Project Gutenberg by allowing many people to work together in proofreading drafts of e-texts for errors. As of June 2020, the site had digitized 39,000 titles.
The Internet Archive is an American digital library with the stated mission of "universal access to all knowledge." It provides free public access to collections of digitized materials, including websites, software applications/games, music, movies/videos, moving images, and millions of books. In addition to its archiving function, the Archive is an activist organization, advocating a free and open Internet. The Internet Archive currently holds over 20 million books and texts, 3 million movies and videos, 400,000 software programs, 7 million audio files, and 463 billion web pages in the Wayback Machine.
Electronic publishing includes the digital publication of e-books, digital magazines, and the development of digital libraries and catalogues. It also includes an editorial aspect that consists of editing books, journals or magazines that are mostly destined to be read on a screen.
The World English Bible is a free updated revision of the American Standard Version (1901). It is one of the few public domain, present-day English translations of the entire Bible, and it is freely distributed to the public using electronic formats. The Bible was created by volunteers using the ASV as the base text as part of the ebible.org project through Rainbow Missions, Inc., a Colorado nonprofit corporation.
An online encyclopedia, also called a digital encyclopedia, is an encyclopedia accessible through the internet. The idea to build a free encyclopedia using the Internet can be traced at least to the 1994 Interpedia proposal; it was planned as an encyclopedia on the Internet to which everyone could contribute materials. The project never left the planning stage and was overtaken by a key branch of old printed encyclopedias.
Project Rastko — Internet Library of Serb Culture is a non-profit and non-governmental publishing, cultural and educational project dedicated to Serb and Serb-related arts and humanities. It is named after Rastko Nemanjić.
Wikisource is an online digital library of free-content textual sources on a wiki, operated by the Wikimedia Foundation. Wikisource is the name of the project as a whole and the name for each instance of that project; multiple Wikisources make up the overall project of Wikisource. The project's aim is to host all forms of free text, in many languages, and translations. Originally conceived as an archive to store useful or important historical texts, it has expanded to become a general-content library. The project officially began on November 24, 2003 under the name Project Sourceberg, a play on the famous Project Gutenberg. The name Wikisource was adopted later that year and it received its own domain name seven months later.
reStructuredText is a file format for textual data used primarily in the Python programming language community for technical documentation.
A Short Biographical Dictionary of English Literature is a collection of biographies of writers by John William Cousin (1849–1910), published in 1910. Most of the entries consist of only one paragraph but some entries, like William Shakespeare's, are quite lengthy.
Project Gutenberg Australia, abbreviated as PGA, is an Internet site which was founded in 2001 by Colin Choat. It is a sister site of Project Gutenberg, though there is no formal relationship between the two organizations. The site hosts free ebooks or e-texts which are in the public domain in Australia. Volunteers have prepared and submitted the ebooks.
Open Library is an online project intended to create "one web page for every book ever published". Created by Aaron Swartz, Brewster Kahle, Alexis Rossi, Anand Chitipothu, and Rebecca Malamud, Open Library is a project of the Internet Archive, a nonprofit organization. It has been funded in part by grants from the California State Library and the Kahle/Austin Foundation. Open Library provides online digital copies in multiple formats, created from images of many public domain, out-of-print, and in-print books.
Google Books is a service from Google Inc. that searches the full text of books and magazines that Google has scanned, converted to text using optical character recognition (OCR), and stored in its digital database. Books are provided either by publishers and authors through the Google Books Partner Program, or by Google's library partners through the Library Project. Additionally, Google has partnered with a number of magazine publishers to digitize their archives.
LibriVox is a group of worldwide volunteers who read and record public domain texts creating free public domain audiobooks for download from their website and other digital library hosting sites on the internet. It was founded in 2005 by Hugh McGuire to provide "Acoustical liberation of books in the public domain" and the LibriVox objective is "To make all books in the public domain available, for free, in audio format on the internet".
The Choral Public Domain Library (CPDL) is a sheet music archive which focuses on choral and vocal music in the public domain or otherwise freely available for printing and performing.
Project Gutenberg Canada, also known as Project Gutenberg of Canada, is a Canadian digital library founded July 1, 2007. Its website allows Canadian residents to create e-texts and download books that are otherwise not in the public domain in other countries.
Distributed Proofreaders Canada is a volunteer organization that converts books into digital format and releases them as public domain books in formats readable by electronic devices. It was launched in December 2007 and as of 2020 has published about 4,600 books. Books that are released are stored on a book archive called Faded Page. While its focus is on Canadian publications and preserving Canadiana, it also includes books from other countries. It is modelled after Distributed Proofreaders, and performs the same function as similar projects in other parts of the world such as Project Gutenberg in the United States and Project Gutenberg Australia.
An electronic book, also known as an e-book or eBook, is a book publication made available in digital form, consisting of text, images, or both, readable on the flat-panel display of computers or other electronic devices. Although sometimes defined as "an electronic version of a printed book", some e-books exist without a printed equivalent. E-books can be read on dedicated e-reader devices, but also on any computer device that features a controllable viewing screen, including desktop computers, laptops, tablets and smartphones.
Feedbooks is a digital library and cloud publishing service for both public domain and original books founded in June 2007 and based in Paris, France. The main focus of the web site is providing e-books with particularly high-quality typesetting in multiple formats, particularly EPUB, Kindle, and PDF formats. Custom PDF generation settings, like trim size dimensions and margins, are possible on the site. Feedbooks offers over 80,000 ebooks. As of 2011, Feedbooks distributed around 3 million ebooks every month.
Standard Ebooks is a not-for-profit platform that curates, refines, and republishes existing copies of freely available public domain e-books no longer protected by U.S. copyright law. Its code is open source and is available for contribution from volunteers. Standard Ebooks sources its titles from sources like Project Gutenberg, the Internet Archive, and Wikisource, among others.
You can view or edit ASCII text using just about every text editor or viewer in the world. [...] Unicode is steadily gaining ground, with at least some support in every major operating system, but we're nowhere near the point where everyone can just open a text based on Unicode and read and edit it.