In 2006, the Internet company AOL released a large excerpt from its web search query logs to the public. AOL did not identify users in the report, but personally identifiable information was present in many of the queries. This allowed some users to be identified by their search queries. Although AOL took down the file within a few days, it had already been widely copied and remains available.
On August 4, 2006, AOL Research, headed by Dr. Abdur Chowdhury, released a compressed text file on one of its websites containing twenty million search queries for over 650,000 users over a 3-month period; it was intended for research. AOL deleted the file from its site by August 7, but not before it had been copied and distributed on the Internet.
AOL did not identify users in the report; however, personally identifiable information was present in many of the queries. Because AOL attributed each query to a particular numerically identified user account, an individual could be identified and matched to their account and search history. [1] The New York Times was able to locate an individual from the released and anonymized search records by cross-referencing them with phonebook listings. [2] Consequently, the ethical implications of using this data for research are under debate. [3] [4]
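The linkage risk described above can be illustrated with a minimal sketch. It assumes a simplified log of (numeric ID, query) rows and a small public directory; the rows, the names, and the helper functions `group_by_user` and `candidate_identities` are all invented for illustration and are not taken from the AOL file or from the method used by The New York Times.

```python
from collections import defaultdict

# Hypothetical log rows: (numeric user ID, query text). Invented for illustration.
LOG = [
    (1001, "landscapers in springfield"),
    (1001, "homes sold in maple heights subdivision"),
    (1001, "jane doe"),
    (2002, "weather tomorrow"),
]

# Hypothetical public directory (phonebook-like): name -> town.
DIRECTORY = {"jane doe": "springfield", "john roe": "shelbyville"}


def group_by_user(rows):
    """Collect every query issued under the same numeric ID into one profile."""
    profiles = defaultdict(list)
    for user_id, query in rows:
        profiles[user_id].append(query)
    return dict(profiles)


def candidate_identities(queries, directory):
    """Return directory names that appear in one query and whose town
    appears in a different query of the same profile."""
    matches = []
    for name, town in directory.items():
        searched_name = any(name in q for q in queries)
        mentioned_town = any(town in q for q in queries if name not in q)
        if searched_name and mentioned_town:
            matches.append(name)
    return matches


profiles = group_by_user(LOG)
for user_id, queries in profiles.items():
    print(user_id, candidate_identities(queries, DIRECTORY))
```

No single row names a person, but because the numeric ID ties a user's queries together, the combined profile can be narrowed against outside records. Replacing names with numbers therefore does not by itself anonymize such a log.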
AOL acknowledged it was a mistake and removed the data; however, the removal was too late. The data was redistributed by others and can still be downloaded from mirror sites. [5] [6]
In January 2007, Business 2.0 Magazine on CNNMoney ranked the release of the search data as #57 of its "101 Dumbest Moments in Business" for 2007. [7]
In September 2006, a class action lawsuit was filed against AOL in the U.S. District Court for the Northern District of California. The lawsuit accused AOL of violating the Electronic Communications Privacy Act and of fraudulent and deceptive business practices, among other claims, and sought at least $5,000 for every person whose search data was exposed. [8] The case was settled in 2013. [9]
Although the searchers were identified only by a numeric ID, some people's search histories have become notable for various reasons.
Through clues revealed in the search queries, The New York Times successfully uncovered the identities of several searchers. With her permission, they exposed user #4417749 as Thelma Arnold, a 62-year-old widow from Lilburn, Georgia. [10] This privacy breach was widely reported, and led to the resignation of AOL's CTO, Maureen Govern, on August 21, 2006. The media quoted an insider as saying that two employees had been fired: the researcher who released the data, and his immediate supervisor, who reported to Govern. [11] [12]
One product of the AOL scandal was the proliferation of blog entries examining the exposed data. Certain users' search logs were identified as interesting, humorous, disturbing, or dangerous. [13] [14]
Consumer watchdog website The Consumerist posted a blog entry by editor Ben Popken identifying the anonymous user number 927 [15] as having an especially bizarre and macabre search history, ranging from butterfly orchids and the band Fall Out Boy, [16] to search terms relating to child pornography and zoophilia. [17] The blog posting has since been viewed nearly 4,000 times and referenced on a number of other high-profile sites. [18] In addition to sparking the interest of the Internet community, User 927 inspired a theatrical production, written by Katharine Clark Gray in Philadelphia. The play, also named User 927, has since been cited on several of the same blogs that originally discovered the real user's existence. [19]
I Love Alaska, a series of movies on the web site Minimovies, puts voice and imagery to the searches of User 711391. Its authors have labeled it "an episodic documentary". [20]
AOL is an American web portal and online service provider based in New York City. It is a brand marketed by the current incarnation of Yahoo! Inc.
Gmail is a free email service provided by Google. As of 2019, it had 1.5 billion active users worldwide making it the largest email service in the world. It also provides a webmail interface, accessible through a web browser, and is also accessible through the official mobile application. Google also supports the use of third-party email clients via the POP and IMAP protocols.
Madster was a peer-to-peer file sharing service. It was released in Napster's wake in August 2000 and was shut down in December 2002 as a result of a lawsuit by the Recording Industry Association of America.
Startpage is a Dutch search engine company that highlights privacy as its distinguishing feature. The website advertises that it allows users to obtain Google Search results while protecting users' privacy by not storing personal information or search data and removing all trackers. Startpage.com also includes an Anonymous View browsing feature that allows users the option to open search results via proxy for increased anonymity.
Google Maps is a web mapping platform and consumer application offered by Google. It offers satellite imagery, aerial photography, street maps, 360° interactive panoramic views of streets, real-time traffic conditions, and route planning for traveling by foot, car, bike, air and public transportation. As of 2020, Google Maps was being used by over one billion people every month around the world.
HTTP cookies are small blocks of data created by a web server while a user is browsing a website and placed on the user's computer or other device by the user's web browser. Cookies are placed on the device used to access a website, and more than one cookie may be placed on a user's device during a session.
RapLeaf was a US-based marketing data and software company, which was acquired by email data provider TowerData in 2013.
Facebook is an online social media and social networking service owned by American technology giant Meta Platforms. Created in 2004 by Mark Zuckerberg with four other Harvard College students and roommates, Eduardo Saverin, Andrew McCollum, Dustin Moskovitz, and Chris Hughes, its name derives from the face book directories often given to American university students. Membership was initially limited to Harvard students, gradually expanding to other North American universities. Since 2006, Facebook has allowed anyone aged 13 or older to register, except in a handful of nations where the minimum age is 14. As of December 2022, Facebook claimed 3 billion monthly active users, and ranked third worldwide among the most visited websites. It was the most downloaded mobile app of the 2010s.
Christopher Soghoian is a privacy researcher and activist. He is currently working for Senator Ron Wyden as the senator’s Senior Advisor for Privacy & Cybersecurity. From 2012 to 2016, he was the principal technologist at the American Civil Liberties Union.
Consumerist was a non-profit consumer affairs website owned by Consumer Media LLC, a subsidiary of Consumer Reports, with content created by a team of full-time reporters and editors. The site's focus was on consumerism and consumers' experiences and issues with companies and corporations, concentrating mostly on U.S. consumers. As an early proponent of crowdsourced journalism, the site based some of its content on reader-submitted tips and complaints. The majority of the site's articles consisted of original content and reporting by the site's staff. On October 30, 2017, Consumer Reports shut down Consumerist, stating that coverage of consumer issues would now be found on the main Consumer Reports website.
DuckDuckGo (DDG) is an internet privacy company. DuckDuckGo offers a number of products oriented towards helping people protect their privacy online, most notably, a private search engine, a tracker-blocking browser extension, email protection and app tracking protection.
The Dolphin Browser is a web browser for the Android and iOS operating systems developed by MoboTap Inc. It was one of the first alternative browsers for the Android platform that introduced support for multi-touch gestures. Dolphin Browser uses its native platform's default browser engine.
Google Drive is a file storage and synchronization service developed by Google. Launched on April 24, 2012, Google Drive allows users to store files in the cloud, synchronize files across devices, and share files. In addition to a web interface, Google Drive offers apps with offline capabilities for Windows and macOS computers, and Android and iOS smartphones and tablets. Google Drive encompasses Google Docs, Google Sheets, and Google Slides, which are a part of the Google Docs Editors office suite that permits collaborative editing of documents, spreadsheets, presentations, drawings, forms, and more. Files created and edited through the Google Docs suite are saved in Google Drive.
iOS 6 is the sixth major release of the iOS mobile operating system developed by Apple Inc, being the successor to iOS 5. It was announced at the company's Worldwide Developers Conference on June 11, 2012, and was released on September 19, 2012. It was succeeded by iOS 7 on September 18, 2013.
Google Flu Trends (GFT) was a web service operated by Google. It provided estimates of influenza activity for more than 25 countries. By aggregating Google Search queries, it attempted to make accurate predictions about flu activity. This project was first launched in 2008 by Google.org to help predict outbreaks of flu.
Facebook Graph Search was a semantic search engine that Facebook introduced in March 2013. It was designed to answer users' natural-language queries directly rather than return a list of links. The name refers to the social graph nature of Facebook, which maps the relationships among users. The Graph Search feature combined the big data acquired from its over one billion users and external data into a search engine providing user-specific search results. In a presentation headed by Facebook CEO Mark Zuckerberg, it was announced that the Graph Search algorithm finds information from within a user's network of friends. Microsoft's Bing search engine provided additional results. In July it was made available to all users using the U.S. English version of Facebook. After being made less publicly visible starting December 2014, the original Graph Search was almost entirely deprecated in June 2019.
Brave is a free and open-source web browser developed by Brave Software, Inc. based on the Chromium web browser. Brave is a privacy-focused browser, which automatically blocks most advertisements and website trackers in its default settings. Users can turn on optional ads that reward them for their attention in the form of Basic Attention Tokens (BAT), which can be used as a cryptocurrency or to make payments to registered websites and content creators.
I Love Alaska is a 2009 documentary chronicling the AOL search history of "user 711391," whose searches are narrated by a monotone female voiceover. The film was produced by Submarine Channel, and released episodically in 2009 before being uploaded to stream for free on Minimovies.org.
Search engine privacy is a subset of internet privacy that deals with user data being collected by search engines. Both types of privacy fall under the umbrella of information privacy. Privacy concerns regarding search engines can take many forms, such as the ability of search engines to log users' individual search queries, browsing history, IP addresses, and cookies, and to conduct user profiling in general. The collection of personally identifiable information (PII) of users by search engines is referred to as tracking.