Distributed Proofreaders

Type of site: Not-for-profit
Available in: 3 languages
Country of origin: United States of America
Owner: Distributed Proofreaders Foundation (DPF)
Founder: Charles Franks
General manager: Linda Hamilton
URL: www.pgdp.net
Commercial: No
Registration: Optional
Launched: 2000
Current status: Active
Content license: Public Domain
Written in: PHP [1]
OCLC number: 1087497129

Distributed Proofreaders (commonly abbreviated as DP or PGDP) is a web-based project that supports the development of e-texts for Project Gutenberg by allowing many people to work together in proofreading drafts of e-texts for errors. As of April 2025, the site had digitized 49,000 titles. [2]

History

Distributed Proofreaders was founded by Charles Franks in 2000 as an independent site to assist Project Gutenberg. [3] [4] Distributed Proofreaders became an official Project Gutenberg site in 2002. [4]

On 8 November 2002, Distributed Proofreaders was slashdotted, [5] [6] and more than 4,000 new members joined in one day. The resulting influx of proofreaders and software developers helped to increase the quantity and quality of e-text production.

In 2006, the Distributed Proofreaders Foundation was formed to provide Distributed Proofreaders with its own legal entity and not-for-profit status, separate from Project Gutenberg. [7] [8] The founding trustees were Charles Franks, Juliet Sutherland, and Gregory B. Newby. [7]

By 2009, DP-contributed e-texts comprised more than half of the works in Project Gutenberg. [4] In July 2015, the 30,000th e-text produced by Distributed Proofreaders was posted to Project Gutenberg.

Proofreading process

DP servers are located in the United States, and therefore works must be cleared by Project Gutenberg as being in the public domain according to United States copyright law before they can be proofread and eventually published. [9]

Public domain works, typically books with expired copyright, are scanned by volunteers or sourced from digitization projects, and the images are run through optical character recognition (OCR) software. [9] Since OCR software is far from perfect, the resulting text always includes errors. [10] To correct them, pages are made available to volunteers via the Internet; the original page image and the recognized text appear side by side. [11] Each page is presented to multiple volunteers, who enter corrections, and their work is combined into a dataset that minimizes errors. [12] This approach spreads the time-consuming work of error correction across many people, in a manner akin to distributed computing. [3]
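How corrections from several volunteers can be combined to reduce errors may be illustrated with a minimal sketch. The sources cited here do not describe the exact method used by the site's software, so the per-line majority vote below is an assumption chosen for illustration only, and the function name `merge_corrections` is hypothetical.

```python
from collections import Counter
from itertools import zip_longest


def merge_corrections(versions):
    """Combine several corrected copies of one page by per-line majority vote.

    Hypothetical illustration: this is not the algorithm used by
    Distributed Proofreaders, only one simple way to combine inputs.
    """
    merged = []
    # Walk the copies line by line; shorter copies are padded with "".
    for candidates in zip_longest(*(v.splitlines() for v in versions), fillvalue=""):
        # Keep the reading that most volunteers agree on for this line.
        winner, _count = Counter(candidates).most_common(1)[0]
        merged.append(winner)
    return "\n".join(merged)


# Three volunteers correct the same OCR output; each misses a different error.
volunteer_a = "The quick brown fox\njumps ovcr the lazy dog"
volunteer_b = "The quick brown fox\njumps over the lazy dog"
volunteer_c = "Tlie quick brown fox\njumps over the lazy dog"

print(merge_corrections([volunteer_a, volunteer_b, volunteer_c]))
```

In this example each remaining OCR error is outvoted by the two volunteers who fixed it, so the merged page contains neither. A real workflow would also need to handle lines that volunteers split or join differently, which this sketch does not attempt.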

A post-processor combines the pages and prepares the text for uploading to Project Gutenberg. [9]

Besides custom software created to support the project, DP also runs a forum and a wiki for project coordinators and participants.

DP Europe

In January 2004, Distributed Proofreaders Europe started, hosted by Project Rastko, Serbia. [13] This site had the ability to process text in Unicode UTF-8 encoding. The books proofread there centered on European culture, with a considerable proportion of non-English texts, including Hebrew, Arabic, Urdu, and many others. As of October 2013, DP Europe had produced 787 e-texts, the last of these in November 2011.

DP Canada

In December 2007, Distributed Proofreaders Canada launched to support the production of e-books for Project Gutenberg Canada and to take advantage of shorter Canadian copyright terms. Although it was established by members of the original Distributed Proofreaders site, it is a separate entity. [14] All its projects are posted to Faded Page, its book archive website. In addition, it supplies books to Project Gutenberg Canada and, where copyright laws are compatible, to the original Project Gutenberg.

Milestones

The source for many of these entries is the DP Timeline. [15]

Milestone | Date | E-text
First | 1 Oct 2000 | The Odyssey, Homer, Lang tr. (first pages for proofreading)
1,000th | 19 Feb 2003 | Tales of St. Austin's, P. G. Wodehouse
2,000th | 3 Sep 2003 | Hamlet - the 'Bad Quarto', William Shakespeare
3,000th | 14 Jan 2004 | The Anatomy of Melancholy, Robert Burton
4,000th | 6 Apr 2004 | Aventures du Capitaine Hatteras, Jules Verne
5,000th | 24 Aug 2004 | A Short Biographical Dictionary of English Literature, John William Cousin
10,000th | 9 Mar 2007 | (See 10,000th E-book below)
15,000th | 12 May 2009 | Philosophical Transactions of the Royal Society - Vol 1 - 1666, Various. Henry Oldenburg (editor)
20,000th | 10 April 2011 | (See 20,000th E-book below)
25,000th | 10 April 2013 | The Art and Practice of Silver Printing, H. P. Robinson and Capt. Abney [16]
30,000th | 7 July 2015 | Graded Literature Readers: Fourth Book [17]
35,000th | 26 Jan 2018 | Shores of the Polar Sea, a Narrative of the Arctic Expedition of 1875–1876 [18]
40,000th | 10 October 2020 | All four volumes of London Labour and the London Poor [19]
45,000th | 18 January 2023 | Elihu Stewart's Down the Mackenzie and Up the Yukon in 1906 [20]

10,000th E-book

On 9 March 2007, Distributed Proofreaders announced the completion of more than 10,000 titles. In celebration, a collection of fifteen titles was published.

20,000th E-book

On 10 April 2011, the 20,000th book milestone was celebrated with a group release of bilingual books. [21]

30,000th E-book

On 7 July 2015, the 30,000th book milestone was celebrated with a group of thirty texts. The one numbered 30,000 was Graded Literature Readers: Fourth Book. [22]

40,000th E-book

On 10 October 2020, the 40,000th book milestone was celebrated with the completion of a four-volume work, London Labour and the London Poor, by Henry Mayhew. [23]

References

  1. "Distributed Proofreaders". github.com. Retrieved 2022-02-01.
  2. Cantoni, Linda (2025-04-12). "Celebrating 49,000 Titles | Hot off the Press". Hot off the Press: Book Reviews and Notes from Distributed Proofreaders. Distributed Proofreaders. Retrieved 2025-10-24.
  3. Lessig, Lawrence (2009). Remix: Making Art and Commerce Thrive in the Hybrid Economy. Penguin. p. 167. ISBN 978-0-14-311613-4.
  4. "Project Gutenberg". Encyclopedia Britannica. 11 August 2025. Retrieved 20 October 2025.
  5. "Gutenberg:Volunteers' Voices". Project Gutenberg. Archived from the original on 2008-09-18. Retrieved 2008-07-12.
  6. "Distributed Proofreading's slashdotting". Boing Boing. 12 November 2002. Archived from the original on 2007-11-09. Retrieved 2008-07-12.
  7. "Distributed Proofreaders Foundation History". DPWiki. 2025-06-18. Retrieved 2025-10-20.
  8. Last, John (October 31, 2025). "Obituary: Project Gutenberg CEO Greg Newby helped put a trove of literature online". The Globe and Mail. Retrieved November 4, 2025.
  9. Newby, G. B.; Franks, C. (2003). "Distributed proofreading". 2003 Joint Conference on Digital Libraries, 2003. Proceedings. IEEE Comput. Soc: 361–363. doi:10.1109/JCDL.2003.1204888. ISBN 978-0-7695-1939-5.
  10. Piotrowski, Michael (2022-05-31). Natural Language Processing for Historical Texts. Springer Nature. p. 43. ISBN 978-3-031-02146-6.
  11. Gentry, Craig; Ramzan, Zulfikar; Stubblebine, Stuart (February 28 – March 3, 2005). "Secure Distributed Human Computation". In Andrew S. Patrick; Moti Yung (eds.). Financial cryptography and data security: 9th International Conference. Lecture Notes in Computer Science. Vol. 3570. Roseau, The Commonwealth of Dominica: Springer. p. 329. doi:10.1145/1064009.1064026. ISBN 3-540-26656-9.
  12. Christianson, Bruce; Crispo, Bruno; Malcolm, James A.; Roe, Michael (2009-10-15). Security Protocols: 14th International Workshop, Cambridge, UK, March 27-29, 2006, Revised Selected Papers. Springer Science & Business Media. p. 178. ISBN 978-3-642-04903-3.
  13. Lebert, Marie (November 4, 2010). "Distributed Proofreaders, producteur des livres du Projet Gutenberg, a 10 ans". Actualitté (in French). Archived from the original on October 5, 2011. Retrieved 2011-06-30.
  14. Lebert, Marie (November 5, 2010). "Distributed Proofreaders just celebrated its 10th anniversary". Teleread. Retrieved 12 November 2025.
  15. "DP Timeline". DPWiki. Retrieved 2020-08-13.
  16. "A Silver Anniversary—25,000 Titles posted to Project Gutenberg!". Pgdp.net. 10 April 2013. Retrieved 20 October 2025.
  17. "Celebrating 30,000 Titles". Pgdp.net. 7 July 2015. Retrieved 20 October 2025.
  18. "Celebrating 35,000 Titles". Pgdp.net. 26 January 2018. Retrieved 20 October 2025.
  19. "Celebrating 40,000 Titles". Pgdp.net. 10 October 2020.
  20. "Celebrating 45,000 Titles". Pgdp.net. 18 January 2023.
  21. "Distributed Proofreaders celebrates 20,000 books posted". Distributed Proofreaders. April 10, 2011. Archived from the original on 2011-06-19.
  22. "Distributed Proofreaders • View topic - 30,000 Unique Titles Preserved!". Pgdp.net. Retrieved 2016-09-15.
  23. "Celebrating 40,000 Titles". Pgdp.net. Retrieved 2025-09-01.