Content ID (algorithm)

Content ID is a digital fingerprinting system developed by Google which is used to easily identify and manage copyrighted content on YouTube. Videos uploaded to YouTube are compared against audio and video files registered with Content ID by content owners, looking for any matches. Content owners have the choice to have matching content taken down or to monetize it. The system began to be implemented around 2007. By 2016, it had cost $60 million to develop and led to around $2 billion in payments to copyright holders. [1] By 2018, Google had invested at least $100 million into the system. [2]

Overview

Content ID [3] creates an ID File for copyrighted audio and video material, and stores it in a database. When a video is uploaded, it is checked against the database and flagged as a copyright violation if a match is found. [4] When this occurs, the content owner has the choice of blocking the video to make it unviewable, tracking the viewing statistics of the video, or adding advertisements to the "infringing" video with proceeds automatically going to the content owner.
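
The workflow described above (register a reference, check each upload against the database, then apply the owner's chosen policy) can be illustrated with a toy sketch. This is not YouTube's implementation, whose fingerprinting method is not described in this article; the windowed hashing, the `ReferenceDatabase` class, the `threshold` parameter, and all other names here are assumptions made purely for illustration.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    """The three owner choices described in the text."""
    BLOCK = "block"
    TRACK = "track"
    MONETIZE = "monetize"


@dataclass
class Reference:
    owner: str
    policy: Policy
    hashes: set


def id_file(samples, window=4):
    """Toy stand-in for an ID File: hashes of every overlapping window."""
    out = set()
    for i in range(len(samples) - window + 1):
        chunk = ",".join(str(s) for s in samples[i:i + window])
        out.add(hashlib.sha256(chunk.encode()).hexdigest()[:16])
    return out


class ReferenceDatabase:
    """Hypothetical store of reference ID Files registered by owners."""

    def __init__(self):
        self.references = []

    def register(self, owner, samples, policy):
        self.references.append(Reference(owner, policy, id_file(samples)))

    def check_upload(self, samples, threshold=0.5):
        """Return (owner, policy) of the first matching reference, else None."""
        upload = id_file(samples)
        for ref in self.references:
            if not ref.hashes:
                continue
            # Fraction of the reference's windows that also occur in the upload.
            overlap = len(upload & ref.hashes) / len(ref.hashes)
            if overlap >= threshold:
                return ref.owner, ref.policy
        return None


db = ReferenceDatabase()
song = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]   # made-up "audio" samples
db.register("Example Records", song, Policy.MONETIZE)

clip = [7, 7] + song[2:11] + [0, 0]           # upload containing most of the song
original = [2, 7, 1, 8, 2, 8, 1, 8, 2, 8]     # unrelated upload

print(db.check_upload(clip))
print(db.check_upload(original))
```

In this sketch the clip shares enough windows with the registered reference to trigger the owner's monetize policy, while the unrelated upload matches nothing. Exact hashes only match identical data, so a real system would presumably need fingerprints that tolerate re-encoding and other alterations.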

Only uploaders who meet specific criteria can use Content ID. [5] [6] These criteria make the use of Content ID without the aid of a major backer difficult, limiting its usage to big corporations in practice. [7]

Context

Between 2007 and 2009, organizations including Viacom, Mediaset, and the English Premier League filed lawsuits against YouTube, claiming that it had done too little to prevent the uploading of copyrighted material. [8] [9] [10] Viacom, demanding $1 billion in damages, said that it had found more than 150,000 unauthorized clips of its material on YouTube that had been viewed "an astounding 1.5 billion times".

During its lawsuit, Viacom won a court ruling requiring YouTube to hand over 12 terabytes of data detailing the viewing habits of every user who had watched videos on the site. On March 18, 2014, the lawsuit was settled after seven years with an undisclosed agreement. [11]

History

In June 2007, YouTube began trials of a system for automatic detection of uploaded videos that infringe copyright. Google CEO Eric Schmidt regarded this system as necessary for resolving lawsuits such as the one from Viacom, which alleged that YouTube profited from content that it did not have the right to distribute. [12] The system was initially called "Video Identification" [13] [14] and later became known as Content ID. [3] By 2010, YouTube had "already invested tens of millions of dollars in this technology". [14] In 2011, YouTube described Content ID as "very accurate in finding uploads that look similar to reference files that are of sufficient length and quality to generate an effective ID File". [4]

By 2012, Content ID accounted for over a third of the monetized views on YouTube. [15]

In 2016, Google stated that Content ID had paid out around $2 billion to copyright holders (compared to around $1 billion by 2014), and had cost $60 million to develop. [1]

Since mid-2018, Google has been beta testing a new tool called Copyright Match, a simplified version of Content ID with more limited options, which would be available to uploaders with more than 100,000 views. [7] [16] Unlike Content ID, which sends copyright notices automatically, Copyright Match takes no action until the creator chooses to act.

Trademark lawsuit

In 2006, YouTube and Audible Magic signed an agreement to license the use of Audible Magic's own "Content ID" fingerprinting technology. When Google bought YouTube in November of the same year, the license was transferred to Google. [17] The agreement was terminated in 2009, but in 2014 Google obtained a trademark for its own "Content ID" implementation. [18] Audible Magic sued Google the same year on the basis that it owned the "Content ID" trademark and that Google's registration of the mark for its own implementation was therefore fraudulent.

Criticisms

An independent test in 2009 uploaded multiple versions of the same song to YouTube, and concluded that while the system was "surprisingly resilient" in finding copyright violations in the audio tracks of videos, it was not infallible. [19] The use of Content ID to remove material automatically has led to controversy in some cases, as the videos have not been checked by a human for fair use. [20]

A YouTube user who disagrees with a decision by Content ID can fill in a form disputing the decision. [21] Prior to 2016, videos were not monetized until the dispute was resolved.

In December 2013, Google changed the way the system worked (seemingly to cover YouTube in case of lawsuits), leading to numerous copyright notices being sent to YouTube content creators who published gameplay videos. Those notices led to ad revenues being automatically diverted to third parties, which sometimes had no connection to the games. [22] [23]

Since April 2016, videos continue to be monetized while the dispute is in progress, and the money goes to whoever wins the dispute. [24] Should the uploader want to monetize the video again, they may remove the disputed audio in the "Video Manager". [25] YouTube has cited the effectiveness of Content ID as one of the reasons why the site's rules were modified in December 2010 to allow some users to upload videos of unlimited length. [26]

The music industry has criticized Content ID as inefficient, with Universal Music Publishing Group (UMPG) estimating in a 2015 filing to the US Copyright Office "that Content ID fails to identify upwards of 40 percent of the use of UMPG’s compositions on YouTube". [1] [27] Google has countered these assertions by stating that (as of 2016) Content ID detected over 98% of known copyright infringement on YouTube, with humans filing removal notices accounting for only 2%. [1]

In January 2018, a YouTube uploader who created a white noise generator received copyright notices for a video he had uploaded that was created using this tool and therefore contained only white noise. [28]

In September 2018, a German university professor uploaded videos of several classical music performances whose copyright had expired: the composers had died long ago, and the performances themselves were no longer covered by copyright. After he received several copyright violation notices from YouTube, he was able to have the majority of them lifted, but Deutsche Grammophon refused to lift two of them even though the copyright had expired. [29] [30] [31] In other cases, copyright violation notices were even sent to uploaders who had recorded themselves playing public domain classical music, with Sony Music asserting copyright over more than 1,100 compositions by Johann Sebastian Bach via Content ID. [32] Commentators noted that this was also the case on other platforms such as Facebook. [33]

In December 2018, TheFatRat complained that Content ID gave preference to an obvious scammer who used the automated system to claim ownership of his content and thereby steal his revenue. [34]

In April 2019, WatchMojo, one of the largest YouTube channels with over 20 million subscribers and 15 billion views and an extensive library of videos that rely on fair use, released a video that drew on its 10 years of experience managing claims and strikes via Content ID to highlight instances of alleged abuse. [35] In a follow-up video, the channel estimated that rights holders had unlawfully claimed over $2 billion between 2014 and 2019. [36] [37]

Related Research Articles

Flickr Image and video hosting website

Flickr is an image hosting service and video hosting service. It was created by Ludicorp in 2004. It has changed ownership several times and has been owned by SmugMug since April 20, 2018.

Google Video A video search engine from Google.

Google Video was a free video hosting service from Google, similar to YouTube, that allowed video clips to be hosted on Google servers and embedded on to other websites. This allowed websites to host lots of video remotely without running into bandwidth or storage-capacity issues.

Spamigation is mass litigation conducted to intimidate large numbers of people. The term was coined by Brad Templeton of the Electronic Frontier Foundation to explain the tactics of the Recording Industry Association of America (RIAA), which files large numbers of lawsuits against individuals for file sharing, and DirecTV, which once filed large numbers of lawsuits against users of smart cards.

Dailymotion Video streaming site

Dailymotion is a European video-sharing technology platform primarily owned by Vivendi. North American launch partners included BBC News, VICE, Bloomberg, and Hearst Digital Media. Dailymotion is available worldwide in 25 languages and 43 localised versions featuring local home pages and local content. It has more than 300 million unique monthly users.

RapidShare was an online file hosting service that opened in 2002. In 2009, it was among the Internet's 20 most visited websites and claimed to have 10 petabytes of files uploaded by users with the ability to handle up to three million users simultaneously. Following the takedown of similar service Megaupload in 2012, RapidShare changed its business model to deter the use of its services for distribution of files to large numbers of anonymous users and to focus on personal subscription-only cloud-based file storage. Its popularity fell sharply as a result and, by the end of March 2015, RapidShare ceased to operate.

Digital Millennium Copyright Act copyright law in the United States of America

The Digital Millennium Copyright Act (DMCA) is a 1998 United States copyright law that implements two 1996 treaties of the World Intellectual Property Organization (WIPO). It criminalizes production and dissemination of technology, devices, or services intended to circumvent measures that control access to copyrighted works. It also criminalizes the act of circumventing an access control, whether or not there is actual infringement of copyright itself. In addition, the DMCA heightens the penalties for copyright infringement on the Internet. Passed on October 12, 1998, by a unanimous vote in the United States Senate and signed into law by President Bill Clinton on October 28, 1998, the DMCA amended Title 17 of the United States Code to extend the reach of copyright, while limiting the liability of the providers of online services for copyright infringement by their users.

Notice and take down is a process operated by online hosts in response to court orders or allegations that content is illegal. Content is removed by the host following notice. Notice and take down is widely operated in relation to copyright infringement, as well as for libel and other illegal content. In United States and European Union law, notice and takedown is mandated as part of limited liability, or safe harbour, provisions for online hosts. As a condition for limited liability online hosts must expeditiously remove or disable access to content they host when they are notified of the alleged illegality.

IO Group, Inc. v. Veoh Networks, Inc.

IO Group, Inc. v. Veoh Networks, Inc., 586 F. Supp. 2d 1132, is an American legal case involving an internet television network named Veoh that allowed users of its site to view streaming media of several of adult entertainment producer IO Group's films. The United States District Court for the Northern District of California ruled that Veoh qualified for the safe harbors provided by the Digital Millennium Copyright Act (DMCA), 17 U.S.C. § 512 (2006). According to commentators, this case could foreshadow the resolution of Viacom v. YouTube.

Viacom International Inc. v. YouTube, Inc. U.S. District Court case

Viacom International, Inc. v. YouTube, Inc., No. 07 Civ. 2103, is a U.S. District Court for the Southern District of New York case in which Viacom sued YouTube, a video-sharing site owned by Google, alleging that YouTube had engaged in "brazen" and "massive" copyright infringement by allowing users to upload and view hundreds of thousands of videos owned by Viacom without permission. A motion for summary judgment seeking dismissal was filed by Google and was granted in 2010 on the grounds that the Digital Millennium Copyright Act's "safe harbor" provisions shielded Google from Viacom's copyright infringement claims. In 2012, on appeal to the United States Court of Appeals for the Second Circuit, it was overturned in part. On April 18, 2013, District Judge Stanton again granted summary judgment in favor of defendant YouTube. An appeal was begun, but the parties settled in March 2014.

IBM Cloud Video, formerly Ustream, is an American live video streaming and video hosting company. It is based in San Francisco and has more than 180 employees in their San Francisco, Los Angeles, and Budapest offices. Company partners include Panasonic, Samsung, Logitech, CBS News, PBS NewsHour, Viacom, and IMG Media. It received $11.1 million in Series A funding for new product development from DCM and investors Labrador Ventures and Band of Angels.

Let's Play Walkthrough of a video game

A Let's Play (LP) is a video documenting the playthrough of a video game, usually including commentary and/or camera view of the face by the gamer. A Let's Play differs from a video game walkthrough or strategy guide by focusing on an individual's subjective experience with the game, often with humorous, irreverent, or critical commentary from the gamer, rather than being an objective source of information on how to progress through the game. While Let's Plays and live streaming of game playthroughs are related, Let's Plays tend to be curated experiences that include editing and scripted narration, while streaming is an unedited experience performed on the fly.

Rumblefish Inc. is a music licensing company specializing in all forms of synchronization licensing with a focus on 'micro-licensing' and online network monetization such as with YouTube's Content ID. It covers over 1.8 million pieces of music and it licenses over 20,000 soundtracks on more than nine million social videos.

A multi-channel network (MCN) is an organization that works with video platforms, to offer assistance to a channel owner in areas such as "product, programming, funding, cross-promotion, partner management, digital rights management, monetization/sales, and/or audience development" in exchange for a percentage of the ad revenue from the channel.

Warner/Chappell Music Inc. v. Fullscreen Inc.

Warner/Chappell Music Inc. et al. v. Fullscreen Inc. et al. (13-cv-05472) was a case against multi-channel network Fullscreen, filed by the National Music Publishers Association on behalf of Warner/Chappell Music and 15 other music publishers, which alleged that Fullscreen illegally reaped the profits of unlicensed cover videos on YouTube without paying any royalties to the rightful publishers and songwriters.

Google has been involved in multiple lawsuits over issues such as privacy, advertising, intellectual property and various Google services such as Google Books and YouTube. The company's legal department expanded from one to nearly 100 lawyers in the first five years of business, and by 2014 had grown to around 400 lawyers. Google's Chief Legal Officer is Senior Vice President of Corporate Development David Drummond.

Automatic content recognition (ACR) is an identification technology to recognize content played on a media device or present in a media file. Devices containing ACR support enable users to quickly obtain additional information about the content they see without any user-based input or search efforts. For example, developers of the application can then provide personalized complementary content to viewers.

YouTube has various copyright protection methods, such as copyright strikes, Content ID and the Copyright Verification Program. However, over the years these have been criticized for favoring corporations and for enabling unfair claims on videos.

YouTube copyright strike

A YouTube copyright strike is a copyright policing practice used by YouTube for the purpose of managing copyright infringement and complying with the Digital Millennium Copyright Act. The Digital Millennium Copyright Act (DMCA) is the basis for the design of the YouTube copyright strike system. For YouTube to retain DMCA safe harbor protection, it must respond to copyright infringement claims with a notice and take down process. YouTube's own practice is to issue a "YouTube copyright strike" on the user accused of copyright infringement. When a YouTube user has three copyright strikes, YouTube terminates that user's YouTube channel, removes all of their videos from that user's YouTube channel, and prohibits that user from creating another YouTube channel.

References

  1. Popper, Ben (2016-07-13). "YouTube to the music industry: here's the money". The Verge. Retrieved 2018-09-20.
  2. Manara, Cedric (2018-11-07). "Protecting what we love about the internet: our efforts to stop online piracy". Google. Retrieved 2018-12-02.
  3. "YouTube Content ID". YouTube. September 28, 2010. Retrieved May 25, 2015.
  4. "More about Content ID". YouTube. Retrieved December 4, 2011.
  5. "Qualifying for Content ID". Google. Retrieved 2018-09-09.
  6. "Content eligible for Content ID". Google. Retrieved 2018-09-09.
  7. "YouTube Beta Testing Content ID for Everyone". plagiarismtoday.com. 2018-05-02. Retrieved 2018-09-09.
  8. "Viacom will sue YouTube for $1bn". BBC News. March 13, 2007. Retrieved May 26, 2008.
  9. "Mediaset Files EUR500 Million Suit Vs Google's YouTube". CNNMoney.com. July 30, 2008. Retrieved August 19, 2009.
  10. "Premier League to take action against YouTube". The Daily Telegraph. Telegraph Media Group. May 5, 2007. Retrieved March 26, 2017.
  11. "Google and Viacom settle seven-year YouTube row". BBC News. March 18, 2014. Retrieved March 18, 2014.
  12. Delaney, Kevin J. (June 12, 2007). "YouTube to Test Software To Ease Licensing Fights". Wall Street Journal. Retrieved December 4, 2011.
  13. YouTube Advertisers (February 4, 2008), Video Identification, retrieved August 29, 2018
  14. King, David (December 2, 2010). "Content ID turns three". Official YouTube Blog. Retrieved August 29, 2018.
  15. "Press Statistics". YouTube. Retrieved March 13, 2012.
  16. "YouTube to Launch Tool to Detect Re-Uploaded Videos Automatically". Variety. 2018-07-11. Retrieved 2018-09-09.
  17. "Audible Magic Accuses YouTube of Fraud Over Content ID Trademark". torrentfreak.com. 2017-01-11. Retrieved 2018-09-09.
  18. "Audible Magic Accuses YouTube of Fraud Over Content ID Trademark". digitalmusicnews.com. 2017-01-12. Retrieved 2018-09-09. However, in 2013, Google signed a declaration stating that it knew of no other company entitled to use the Content ID brand
  19. Von Lohmann, Fred (April 23, 2009). "Testing YouTube's Audio Content ID System". Retrieved December 4, 2011.
  20. Von Lohmann, Fred (February 3, 2009). "YouTube's January Fair Use Massacre". Retrieved December 4, 2011.
  21. "Content ID disputes". YouTube. Retrieved December 4, 2011.
  22. "YouTube video game shows hit with copyright blitz". Polygon. 2013-12-10. Retrieved 2018-09-09.
  23. "YouTube Responds To Content ID Crackdown, Plot Thickens". Forbes. 2013-12-17. Retrieved 2018-09-09.
  24. Hernandez, Patricia. "YouTube's Content ID System Gets One Much-Needed Fix". Kotaku. Retrieved September 16, 2017.
  25. "Remove Content ID claimed songs from my videos – YouTube Help". support.google.com. Retrieved September 17, 2017.
  26. Siegel, Joshua; Mayle, Doug (December 9, 2010). "Up, Up and Away – Long videos for more users". Official YouTube Blog. Google. Retrieved March 25, 2017.
  27. "Comments of Universal Music Group". Scribd. 2015. Retrieved 2018-09-20.
  28. "YouTube's problematic Content ID says white noise is copyrighted". The Next Web. 2018-01-05. Retrieved 2018-09-09.
  29. Kaiser, Ulrich (2018-09-03). "Google: Sorry professor, old Beethoven recordings on YouTube are copyrighted". Ars Technica. Retrieved 2018-09-09.
  30. "YouTube's Content-ID Flags Music Prof's Public Domain Beethoven and Wagner Uploads". torrentfreak.com. 2018-09-03. Retrieved 2018-09-09.
  31. "How The EU May Be About To Kill The Public Domain: Copyright Filters Takedown Beethoven". Techdirt. 2018-08-28. Retrieved 2018-09-09.
  32. "The Empire Strikes Bach". freebeacon.com. 2018-09-08. Retrieved 2018-09-09.
  33. "The future is here today: you can't play Bach on Facebook because Sony says they own his compositions". Boing Boing. 2018-09-05. Retrieved 2018-09-09.
  34. Beschizza, Rob (26 December 2018). "YouTube let a contentID scammer steal a popular video". Boing Boing.
  35. WatchMojo.com (2019-05-02), Exposing Worst ContentID Abusers! #WTFU, retrieved 2019-07-02
  36. "'YouTube Content-ID Abusers Could Face Millions of Dollars in Damages'". TorrentFreak. 2019-05-10. Retrieved 2019-07-02.
  37. WatchMojo.com (2019-05-09), Are Rights Holders Unlawfully Claiming Billions in AdSense Revenue?, retrieved 2019-07-02