Contact scraping

In online advertising, contact scraping is the practice of obtaining access to a customer's e-mail account in order to retrieve contact information that is then used for marketing purposes.

The New York Times refers to the practices of Tagged, MyLife and desktopdating.net as "contact scraping". [1]

Several commercial packages are available that implement contact scraping for their customers, including ViralInviter, TrafficXplode, and TheTsunamiEffect. [2]

Contact scraping is one application of web scraping. Examples of email scraping tools include Uipath, Import.io, and Screen Scraper. Alternative web scraping tools include UzunExt, R functions, and Python Beautiful Soup. The legal issues surrounding contact scraping fall under the legality of web scraping.

Web scraping tools

The following web scraping tools can be used as alternatives for contact scraping:

  1. UzunExt is a data scraping approach that applies string methods and a crawling process to extract information without using a DOM tree. [3]
  2. The R functions data.rm() and data.rm.a() can be used as a web scraping strategy. [4]
  3. The Python Beautiful Soup library can be used to scrape data and convert it into CSV files. [5]
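
The idea behind the first and third items, extracting data with plain string methods rather than a DOM tree and then saving it as CSV, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not the UzunExt algorithm and does not use Beautiful Soup. The helper names (extract_between, rows_from_html, to_csv) and the sample timetable page are hypothetical, and the input is an inline string rather than a downloaded page.

```python
import csv
import io


def extract_between(html, open_tag, close_tag):
    """Return the text between each open_tag/close_tag pair, using only str.find."""
    results = []
    pos = 0
    while True:
        start = html.find(open_tag, pos)
        if start == -1:
            break
        start += len(open_tag)
        end = html.find(close_tag, start)
        if end == -1:
            break
        results.append(html[start:end].strip())
        pos = end + len(close_tag)
    return results


# Hypothetical sample page; a real scraper would fetch this text over HTTP.
SAMPLE_HTML = """
<table>
<tr><td>Stop A</td><td>08:15</td></tr>
<tr><td>Stop B</td><td>08:27</td></tr>
</table>
"""


def rows_from_html(html):
    """Split the page into table rows, then each row into its cells."""
    rows = []
    for row_html in extract_between(html, "<tr>", "</tr>"):
        rows.append(extract_between(row_html, "<td>", "</td>"))
    return rows


def to_csv(rows, header):
    """Serialise the extracted rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


rows = rows_from_html(SAMPLE_HTML)
csv_text = to_csv(rows, ["stop", "departure"])
print(csv_text)
```

String searching of this kind avoids the cost of building a parse tree, but it only works when the tags appear exactly as expected (no attributes, no nesting of the same tag), which is why DOM-based libraries remain the more robust choice for irregular pages.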

United States

In the United States, three legal claims are most commonly raised in relation to web scraping: compilation copyright infringement, violation of the Computer Fraud and Abuse Act (CFAA), and electronic trespass to chattels. For example, users of "scraping tools" may be liable for electronic trespass to chattels. [6] One well-known case is Intel Corp. v. Hamidi, in which the California Supreme Court held that unwanted e-mails sent to Intel's e-mail system did not constitute a common-law trespass to chattels. [7] [8] However, all three claims have shifted doctrinally, and it is uncertain whether they will remain available in the future. [6] [9] For instance, the applicability of the CFAA has been narrowed because of the technical similarities between web scraping and web browsing. [10] In EF Cultural Travel BV v. Zefer Corp., the court declined to apply the CFAA because EF failed to meet the standard for "damage". [11]

European Union

Under Article 14 of the EU’s General Data Protection Regulation (GDPR), data controllers are obligated to inform individuals before processing personal data. [12] In Bisnode vs. Polish Supervisory Authority, Bisnode obtained personal data from the government's public register of business activity and used the data for business purposes. However, Bisnode held email addresses for only some of the people concerned, so it sent mail notifications only to those individuals. Instead of informing the remaining people directly, Bisnode simply posted a notice on its website, and thus failed to comply with its obligations under Article 14 of the GDPR. [13] [14]

Australia

In Australia, the Spam Act 2003 prohibits supplying, acquiring, or using address‑harvesting software and harvested‑address lists. The Spam Act also requires that marketing emails be sent only with the consent of the recipients and that every such email include an opt-out facility. [15] The company behind the GraysOnline shopping websites was fined after sending emails that breached the Spam Act. GraysOnline sent messages that gave recipients no option to opt out of further emails, and it sent emails to people who had previously withdrawn their consent to receive Grays' emails. [16] [17]
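
The two conditions described above (recipient consent and an opt-out facility) can be expressed as a simple pre-send check. The following is a minimal sketch under stated assumptions, not a statement of what the Act requires in detail: the Recipient class, the may_send function, and the use of the word "unsubscribe" as a marker for an opt-out facility are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Recipient:
    address: str
    consented: bool                 # recipient agreed to receive marketing email
    withdrew_consent: bool = False  # recipient later opted out


def may_send(recipient, message_body, opt_out_marker="unsubscribe"):
    """Allow sending only with live consent and an opt-out facility in the message.

    The marker check is a crude stand-in (an assumption) for verifying that
    the message really contains a working opt-out mechanism.
    """
    has_opt_out = opt_out_marker in message_body.lower()
    return recipient.consented and not recipient.withdrew_consent and has_opt_out


body = "Spring sale now on. To stop these emails, click Unsubscribe."
current = Recipient("a@example.com", consented=True)
opted_out = Recipient("b@example.com", consented=True, withdrew_consent=True)

print(may_send(current, body))                  # consent given, opt-out present
print(may_send(opted_out, body))                # consent withdrawn
print(may_send(current, "Spring sale now on.")) # no opt-out facility
```

Both failures attributed to GraysOnline map onto this check: a message without an opt-out facility fails the marker test, and a recipient who has withdrawn consent fails the consent test.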

China

Under the Cybersecurity Law of the People's Republic of China, web crawling of publicly available information is regarded as legal, but obtaining nonpublic, sensitive personal information without consent is illegal. [18] On November 24, 2017, three people were convicted of illegally scraping information system data stored on the server of Beijing ByteDance Networking Technology Co., Ltd. [19]

References

  1. Typing In an E-Mail Address, and Giving Up Your Friends’ as Well
  2. 'Viral inviters' want your e-mail contact list
  3. Uzun, E. (2020). "A Novel Web Scraping Approach Using the Additional Information Obtained From Web Pages". IEEE Access. 8: 61726–61740. doi:10.1109/ACCESS.2020.2984503. ISSN 2169-3536. S2CID 215740364.
  4. Vallone, A., Coro, C. and Beatriz, S. (2020). "Strategies to access web-enabled urban spatial data for socioeconomic research using R functions". Journal of Geographical Systems: Spatial Theory, Models, Methods, and Data. 22 (2): 217–34. Bibcode:2020JGS....22..217V. doi:10.1007/s10109-019-00309-y. S2CID 202181499.
  5. Vela, Belen; Cavero, Jose Maria; Caceres, Paloma; Cuesta, Carlos E. (2019). "A Semi-Automatic Data–Scraping Method for the Public Transport Domain". IEEE Access. 7: 105627–105637. doi:10.1109/access.2019.2932197. ISSN 2169-3536. S2CID 201068464.
  6. Hirschey, Jeffrey (2014). "Symbiotic Relationships: Pragmatic Acceptance of Data Scraping". SSRN Electronic Journal. doi:10.2139/ssrn.2419167. ISSN 1556-5068.
  7. "Internet Law, Ch. 06: Trespass to Chattels". www.tomwbell.com. Retrieved 2020-11-12.
  8. Beckham, J. Brian (2003). "Intel v. Hamidi: Spam as a Trespass to Chattels - Deconstruction of a Private Right of Action in California". The John Marshall Journal of Information Technology & Privacy Law. 22: 205–228.
  9. "FAQ about linking – Are website terms of use binding contracts?". www.chillingeffects.org. 2007-08-20. Archived from the original on 2002-03-08. Retrieved 2007-08-20.
  10. Christensen, J. (2020). "The Demise of the Cfaa in Data Scraping Cases". Notre Dame Journal of Law, Ethics & Public Policy. 34 (2): 529–47.
  11. "Controversy Surrounds 'Screen Scrapers': Software Helps Users Access Web Sites But Activity by Competitors Comes Under Scrutiny". Findlaw. Retrieved 2020-11-12.
  12. Philip H. Liu, Mark Edward Davis (2015–16). "Web Scraping - Limits on Free Samples". Landslide. 8.
  13. Tomáš Pikulíka, Peter Štarchoň (2020). "Public registers with personal data under scrutiny of DPA regulators". Procedia Computer Science. 170: 1174–1179. doi:10.1016/j.procs.2020.03.033.
  14. Oxford Analytica (2019). "Europe's national regulators hold key to GDPR success". Expert Briefings.
  15. Infrastructure. "Spam Act 2003". www.legislation.gov.au. Retrieved 2020-12-01.
  16. Torresan, Danielle (2013). "Keeping Good Companies". Informit. 65: 668–669.
  17. "Unauthorised photographs on the internet — back on the Attorney-General's agenda". Internet Law Bulletin. 8. 2005.
  18. Lee, Jyh-An (2018). "Hacking into China's Cybersecurity Law" (PDF). Wake Forest Law Review. 53: 57–104.
  19. Li Qian, Jiang Tao (2020). "Rethinking Criminal Sanctions on Data Scraping in China Based on a Case Study of Illegally Obtaining Specific Data by Crawlers". China Legal Science. 8: 136.