AOL search log release

In 2006, the Internet company AOL released a large excerpt from its web search query logs to the public. AOL did not identify users in the released data, but personally identifiable information was present in many of the queries, which allowed some users to be identified. Although AOL took down the file within a few days, it had already been widely copied and remains available.

Overview

On August 4, 2006, AOL Research, headed by Dr. Abdur Chowdhury, released a compressed text file on one of its websites containing twenty million search queries from over 650,000 users over a three-month period; the file was intended for research. AOL deleted the file from its site by August 7, but not before it had been copied and distributed on the Internet.

AOL did not identify users in the released data; however, personally identifiable information was present in many of the queries. Because AOL attributed the queries to particular numerically identified user accounts, an individual could be identified and matched to their account and search history. [1] The New York Times was able to locate an individual from the released, anonymized search records by cross-referencing them with phonebook listings. [2] Consequently, the ethical implications of using this data for research are under debate. [3] [4]
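The linkage described above can be illustrated with a minimal sketch. It assumes a simplified log of (numeric ID, query) pairs; the IDs, queries, and the function name `group_by_id` are hypothetical and are not taken from the released file. It shows only that a stable numeric ID gathers all of one account's queries into a single profile, which is what makes cross-referencing with outside records possible.

```python
from collections import defaultdict

# Hypothetical rows in the shape (pseudonymous_id, query). The IDs and
# queries are invented for illustration and are not from the AOL file.
rows = [
    (1001, "landscapers in springfield"),
    (2002, "cheap flights"),
    (1001, "homes sold in maple street subdivision"),
    (1001, "jane doe springfield"),
    (2002, "weather tomorrow"),
]


def group_by_id(log_rows):
    """Collect every query issued under the same pseudonymous ID."""
    profiles = defaultdict(list)
    for user_id, query in log_rows:
        profiles[user_id].append(query)
    return dict(profiles)


profiles = group_by_id(rows)
for user_id, queries in sorted(profiles.items()):
    print(user_id, queries)
```

Even without a name attached, the queries grouped under one ID may jointly mention a town, a street, and a surname, which is the kind of combination that can be checked against public listings.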

AOL acknowledged that the release was a mistake and removed the data, but the removal came too late: the data had been redistributed by others and can still be downloaded from mirror sites. [5] [6]

In January 2007, Business 2.0 Magazine on CNNMoney ranked the release of the search data as #57 of its "101 Dumbest Moments in Business" for 2007. [7]

Lawsuits

In September 2006, a class action lawsuit was filed against AOL in the U.S. District Court for the Northern District of California. The lawsuit accused AOL of violating the Electronic Communications Privacy Act and of fraudulent and deceptive business practices, among other claims, and sought at least $5,000 for every person whose search data was exposed. [8] The case was settled in 2013. [9]

Notable users

Although the searchers were identified only by a numeric ID, some people's search histories have become notable for various reasons.

Thelma Arnold

Through clues revealed in the search queries, The New York Times successfully uncovered the identities of several searchers. With her permission, the newspaper identified user #4417749 as Thelma Arnold, a 62-year-old widow from Lilburn, Georgia. [10] This privacy breach was widely reported and led to the resignation of AOL's CTO, Maureen Govern, on August 21, 2006. The media quoted an insider as saying that two employees had been fired: the researcher who released the data, and his immediate supervisor, who reported to Govern. [11] [12]

User 927

One product of the AOL scandal was the proliferation of blog entries examining the exposed data. Certain users' search logs were identified as interesting, humorous, disturbing, or dangerous. [13] [14]

Consumer watchdog website The Consumerist posted a blog entry by editor Ben Popken identifying the anonymous user number 927 [15] as having an especially bizarre and macabre search history, ranging from butterfly orchids and the band Fall Out Boy, [16] to search terms relating to child pornography and zoophilia. [17] The blog posting has since been viewed nearly 4,000 times and referenced on a number of other high-profile sites. [18] In addition to sparking the interest of the Internet community, User 927 inspired a theatrical production, written by Katharine Clark Gray in Philadelphia. The play, also named User 927, has since been cited on several of the same blogs that originally discovered the real user's existence. [19]

User 711391

I Love Alaska, a series of short films on the website Minimovies, puts voice and imagery to the search history of User 711391; its authors have labeled it "an episodic documentary". [20]

References

  1. Michael Arrington (August 6, 2006). "AOL proudly releases massive amounts of user search data". TechCrunch. Archived from the original on August 12, 2006. Retrieved August 7, 2006.
  2. Barbaro, Michael; Zeller Jr, Tom (August 9, 2006). "A Face Is Exposed for AOL Searcher No. 4417749". The New York Times. Archived from the original on March 12, 2018. Retrieved April 6, 2018.
  3. Katie Hafner (August 23, 2006). "Tempting Data, Privacy Concerns; Researchers Yearn To Use AOL Logs, But They Hesitate". The New York Times. Archived from the original on July 8, 2011. Retrieved September 13, 2006.
  4. Nate Anderson (August 23, 2006). "The ethics of using AOL search data". Ars Technica. Archived from the original on September 2, 2006. Retrieved September 13, 2006.
  5. Dawn Kawamoto; Elinor Mills (August 7, 2006). "AOL apologizes for release of user search data". CNET. Archived from the original on September 22, 2021. Retrieved August 12, 2013.
  6. "AOL search data mirrors". Archived from the original on October 2, 2012. Retrieved August 12, 2013.
  7. "101 Dumbest Moments in Business: Full list". CNN. Archived from the original on July 7, 2018. Retrieved August 2, 2020.
  8. Elinor Mills (September 25, 2006). "AOL sued over Web search data release". CNET. Archived from the original on February 14, 2013. Retrieved August 12, 2013.
  9. "AOL Settles Data Valdez Lawsuit For $5 Million". www.mediapost.com. Archived from the original on April 27, 2021. Retrieved April 27, 2021.
  10. "A Face Is Exposed for AOL Searcher No. 4417749". Archived from the original on September 1, 2013. Retrieved August 12, 2013.
  11. Li, Kenneth (August 21, 2006). "AOL chief technology officer resigns: sources". Reuters. Archived from the original on June 1, 2007.
  12. "AOL executive quits after posting of search data". International Herald Tribune. Archived from the original on November 26, 2006.
  13. Frind, Markus (July 7, 2006). "AOL Search Data Shows Users Planning to commit Murder". The Paradigm Shift. WordPress.com. Archived from the original (blog) on June 5, 2008. Retrieved June 7, 2008.
  14. Johnny, Titanium (August 13, 2006). "AOL Search Log Special, Part 1" (blog). SomethingAwful.com. Archived from the original on July 21, 2013. Retrieved January 10, 2013.
  15. "AOL user #927: yoko ono - anatai dayo". searchids.com. Archived from the original on February 28, 2021. Retrieved September 22, 2021.
  16. Carlson, Nicholas. "AOL User 927's Entire Sordid Search Log". Business Insider. Archived from the original on August 16, 2021. Retrieved April 9, 2021.
  17. Popken, Ben (July 7, 2006). "AOL User 927 Illuminated" (blog). The Consumerist. Archived from the original on December 14, 2012. Retrieved December 9, 2012.
  18. "Leaked AOL search logs take stage in new play" (blog). CNet News Blog. CNET. Archived from the original on October 25, 2012. Retrieved January 28, 2010.
  19. Popken, Ben (April 29, 2008). "AOL User 927, The Theatrical Production". The Consumerist. Archived from the original (blog) on April 1, 2018. Retrieved December 9, 2012.
  20. "I Love Alaska - Lernert Engelberts & Sander Plug". minimovies.org. Submarinechannel. January 2009. Archived from the original on January 14, 2019. Retrieved January 14, 2019.