Pubget

Founded: Cambridge, MA, USA (2007)
Headquarters: Boston, MA, USA
Key people: Ramy Arnaout, Ian Connor, Ryan Jones
Parent: Copyright Clearance Center
Website: www.pubget.com

Pubget Corp was a wholly owned subsidiary of Copyright Clearance Center that developed cloud-based search and content access tools for scientists. It provided advertising services, enterprise search services, and a public search engine. [1] The company was founded in 2007 by Ramy Arnaout, a clinical pathologist at Beth Israel Hospital, out of his own need to find papers. [2] [3] [4] Pubget moved its headquarters from Cambridge, Massachusetts, to Boston's Innovation District in 2011. [4] [5]

Pubget.com was a free service for non-profit institutions and their libraries and researchers. The site provided direct access to full-text content from 450 libraries around the world. In January 2012, it was announced that Pubget had been acquired by Copyright Clearance Center. [6] The service was closed in 2017.

Products and Services

Search Engine
Pubget's search engine retrieved article citations and full-text PDFs from PubMed, arXiv, Karger, American Society for Microbiology, IEEE, RSS feeds, XML from publishers, and Open Archive sources. [7] The company's search engine contained over 28 million scientific documents and added 10,000 papers each day. Pubget created a link directly from the article citation to the paper itself via a continuously updated database of links, taking users from a citation straight to the full-text paper. [8]

Access to closed full-text PDFs was granted through the subscriptions of the user's institution. Pubget did not bypass copyright laws: if the end user did not have institutional access to a restricted paper, only its abstract was displayed.
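The access rule described above can be illustrated with a minimal Python sketch. It is not Pubget's implementation: the `Paper` record, the `resolve` function, the field names, and the example URL are all hypothetical, and the sketch assumes access is decided per journal from a set of subscribed titles.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """Hypothetical record pairing a citation with its full-text link."""
    citation_id: str
    abstract: str
    pdf_url: str
    open_access: bool
    journal: str


def resolve(paper, subscribed_journals):
    """Decide what a user sees for a paper, given their institution's subscriptions.

    Full text is shown for open papers or subscribed journals;
    otherwise only the abstract is returned.
    """
    if paper.open_access or paper.journal in subscribed_journals:
        return {"view": "full_text", "url": paper.pdf_url}
    return {"view": "abstract", "text": paper.abstract}


# Placeholder data for illustration only.
paper = Paper("c1", "An abstract.", "https://example.org/p.pdf", False, "Journal X")
with_access = resolve(paper, {"Journal X"})
without_access = resolve(paper, set())
```

Here `with_access` carries the PDF link, while `without_access` carries only the abstract text.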

PaperStats
Pubget PaperStats was a usage and spend analysis tool for libraries. PaperStats automatically harvested serials usage statistics directly from publishers and delivered consolidated usage, cost, and other reports. Content performance could be assessed through cost-per-view analysis. Upon introduction, PaperStats was beta tested with the USC Norris Medical Library and yielded positive results for Pubget, USC, and the library community. [7] [9]
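Cost-per-view analysis of this kind is usually a simple ratio of subscription cost to recorded views. The Python sketch below illustrates that arithmetic under stated assumptions: the function names and the `(name, annual_cost, views)` tuple layout are hypothetical and do not reflect how PaperStats itself computed or reported the figure.

```python
from typing import Optional


def cost_per_view(annual_cost: float, views: int) -> Optional[float]:
    """Subscription cost divided by recorded views; None when there were no views."""
    if views <= 0:
        return None
    return annual_cost / views


def rank_by_cost_per_view(titles):
    """Sort (name, annual_cost, views) tuples from most to least expensive per view.

    Titles with no recorded views come first, since their cost per view is undefined.
    """
    def key(row):
        cpv = cost_per_view(row[1], row[2])
        return (cpv is not None, -(cpv or 0.0))

    return sorted(titles, key=key)


# Made-up figures for illustration only.
titles = [("Journal A", 1200.0, 400), ("Journal B", 900.0, 0), ("Journal C", 500.0, 50)]
ranked = rank_by_cost_per_view(titles)
```

With these made-up figures, Journal A costs 3.00 per view and Journal C costs 10.00 per view, so a library reviewing renewals would look at Journal B (no views) and Journal C first.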

PaperStore
The Pubget PaperStore gave Pubget users the option of purchasing full-text papers from thousands of journals directly on the search engine results page. Content rights and delivery were provided by the document delivery vendor Reprints Desk. [7]

Advertising
Pubget provided several advertising solutions. Customers included Bio-Rad, Agilent, and other scientific brands. Ads were matched with paper content via contextual targeting. For example, a manufacturer of scientific equipment could pay to advertise alongside papers that mentioned using its product. [2] [10] Pubget, however, did not reveal data on individual users and their searches. [2]

Textmining
Pubget's textmining technology allowed research and development teams to uncover specific text strings across large groups of papers. [11]
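As a rough illustration of searching for specific text strings across a group of papers, the Python sketch below does a case-insensitive literal match over an in-memory dictionary of paper texts. The function name, the data layout, and the sample texts are assumptions made for the example; nothing here describes Pubget's actual technology.

```python
import re


def find_strings(papers, terms):
    """Map each term to the ids of papers whose text contains it (case-insensitive).

    `papers` is a dict of paper id -> full text; `terms` is a list of literal strings.
    """
    hits = {term: [] for term in terms}
    for paper_id, text in papers.items():
        for term in terms:
            # re.escape treats the term as a literal string, not a pattern.
            if re.search(re.escape(term), text, flags=re.IGNORECASE):
                hits[term].append(paper_id)
    return hits


# Made-up texts for illustration only.
papers = {
    "p1": "Samples were amplified by PCR before sequencing.",
    "p2": "Protein levels were measured by western blot.",
}
hits = find_strings(papers, ["pcr", "Western blot", "ELISA"])
```

A production system would more likely use a full-text index than a linear scan, but the input and output shapes would be similar.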

PaperStream
PaperStream was a web app that allowed lab teams to share, store, and find documents all in one place. [12] PaperStream organized companies’ subscriptions, purchased papers, and internal documents into an automated library database. [13] [14]

API
Pubget's API provided access to its search and linking technology from third-party websites. [15] [16]

References

  1. "Pubget Everywhere". Pubget. Archived from the original on 16 July 2011. Retrieved 17 June 2011.
  2. Davies, Kevin (10 June 2009). "Got PubMed? Pubget Searches and Delivers Scientific PDFs". Bio-IT World. Archived from the original on 1 June 2011. Retrieved 17 June 2011.
  3. "Founder's Friday: Pubget". Greenhorn Connect. 7 January 2011. Archived from the original on 3 June 2011. Retrieved 21 June 2011.
  4. Goodison, Donna (28 May 2011). "Southie Firm Speeds Up Access to Research Papers". Boston Herald. Archived from the original on 18 June 2011. Retrieved 21 June 2011.
  5. "Welcome home, Pubget". Innovation District. 13 May 2011. Archived from the original on 18 June 2011. Retrieved 16 June 2011.
  6. "Copyright Clearance Center Acquires Pubget". 9 January 2012. Archived from the original on 28 October 2018. Retrieved 11 May 2020.
  7. Featherstone, Robin; Hersey, Denise (4 October 2010). "The quest for full text: an in-depth examination of Pubget for medical searchers". Medical Reference Services Quarterly. 29 (4): 307–319. doi:10.1080/02763869.2010.518911. PMID 21058175. S2CID 36459379.
  8. Murray, P.E. (4 August 2009). "Analysis of Pubget – An Expedited Fulltext Service for Life Science Journal Articles". Disruptive Library Technology Jester. Archived from the original on 8 July 2011. Retrieved 21 June 2011.
  9. Curran, Megan (2 March 2011). "Debating Beta: Considerations for Libraries". Journal of Electronic Resources in Medical Libraries. 8 (2): 117–125. doi:10.1080/15424065.2011.576604. S2CID 62711345.
  10. "Media Kit: Pubget Ads" (PDF). Pubget, Inc. Archived (PDF) from the original on 26 March 2012. Retrieved 24 June 2011.
  11. "Textmining Fact Sheet" (PDF). Pubget, Inc. Archived (PDF) from the original on 26 March 2012. Retrieved 15 June 2011.
  12. "Pubget PaperStream". Pubget, Inc. Archived from the original on 2 October 2011. Retrieved 24 June 2011.
  13. "Pubget PaperStream For Companies". Pubget, Inc. Archived from the original on 24 June 2011. Retrieved 24 June 2011.
  14. "Pubget PaperStream For Researchers". Pubget, Inc. Archived from the original on 7 October 2011. Retrieved 24 June 2011.
  15. "PubgetCloud" (PDF). Pubget, Inc. Archived from the original (PDF) on 26 March 2012. Retrieved 16 June 2011.
  16. Munger, Dave (10 June 2009). "Pubget – Useful, Growing Resource for Anyone Interested in Research". Researchblogging News. Archived from the original on 14 March 2012. Retrieved 29 June 2011.