QuickCode

Available in: English
Funding: Sponsored by 4iP [1]
URL: quickcode.io
Current status: Inactive
Content license: GNU Affero General Public License [2]

QuickCode (formerly ScraperWiki) was a web-based platform for collaboratively building programs to extract and analyze public online data, in a wiki-like fashion. "Scraper" refers to screen scrapers, programs that extract data from websites; "wiki" refers to the fact that any user with programming experience could create or edit such programs, whether to extract new data or to analyze existing datasets. [1] The site's main use was as a place for programmers and journalists to collaborate on analyzing public data. [3] [4] [5] [6] [7] [8]
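To make the idea of a screen scraper concrete, the following is a minimal sketch in Python (the language most ScraperWiki scrapers were written in), using only the standard library. A real scraper would first fetch a page over HTTP; here a small inline HTML snippet stands in for the fetched page, and the table contents are invented for illustration.

```python
# Minimal screen-scraper sketch: pull the cells of an HTML table into
# structured rows. Real scrapers would fetch the page first (e.g. with
# urllib.request); a static snippet keeps this example self-contained.
from html.parser import HTMLParser

HTML = """
<table>
  <tr><td>Liverpool</td><td>2009</td></tr>
  <tr><td>London</td><td>2010</td></tr>
</table>
"""

class TableScraper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.rows = []        # completed rows of cell text
        self._row = []        # cells of the row being parsed
        self._in_td = False   # are we currently inside a <td>?

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)
            self._row = []

    def handle_data(self, data):
        if self._in_td and data.strip():
            self._row.append(data.strip())

scraper = TableScraper()
scraper.feed(HTML)
print(scraper.rows)  # [['Liverpool', '2009'], ['London', '2010']]
```

Once extracted into rows like these, the data can be loaded into a database or spreadsheet for the kind of analysis the platform was built around.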


The service was renamed circa 2016, as "it isn't a wiki or just for scraping any more". [9] At the same time, the eponymous parent company was renamed 'The Sensible Code Company'. [9]

History

ScraperWiki was founded in 2009 by Julian Todd and Aidan McGuire. It was initially funded by 4iP, the venture capital arm of the TV channel Channel 4, and later attracted an additional £1 million round of funding from Enterprise Ventures.

Aidan McGuire is the chief executive officer of The Sensible Code Company.

See also

  - Web scraping
  - Data scraping
  - Search engine scraping
  - Data journalism
  - Collaborative journalism
  - Beautiful Soup
  - OutWit Hub
  - DocumentCloud
  - Hackathon
  - Francis Irving
  - Jane Silber

References

  1. Jamie Arnold (2009-12-01). "4iP invests in ScraperWiki". 4iP.
  2. "GNU Affero General Public License v3.0 - sensiblecodeio". GitHub. Retrieved 30 December 2017.
  3. Cian Ginty (2010-11-19). "Hacks and hackers unite to get solid stories from difficult data". The Irish Times.
  4. Paul Bradshaw (2010-07-07). "An introduction to data scraping with Scraperwiki". Online Journalism Blog.
  5. Charles Arthur (2010-11-22). "Analysing data is the future for journalists, says Tim Berners-Lee". The Guardian.
  6. Deirdre McArdle (2010-11-19). "In The Papers 19 November". ENN.
  7. "Journalists and developers join forces for Lichfield 'hack day'". The Lichfield Blog. 2010-11-15. Archived from the original on 2010-11-24. Retrieved 2010-12-09.
  8. Alison Spillane (2010-11-17). "Online tool helps to create greater public data transparency". Politico.
  9. "ScraperWiki". Retrieved 7 February 2017.