Video fingerprinting, or video hashing, is a class of dimension reduction techniques [1] in which a system identifies, extracts, and then summarizes characteristic components of a video as a single perceptual hash or a set of perceptual hashes (fingerprints), enabling that video to be uniquely identified. The technology has proven effective for searching and comparing video files. [2] [3]
Video fingerprinting was first developed into practical use by Philips in 2002. [4]
Different methods exist for video fingerprinting. Van Oostveen relied on changes in patterns of image intensity over successive video frames. [4] This makes the fingerprint robust against limited changes in color, or against conversion of the original video to grayscale. Others have tried to reduce the size of the fingerprint by looking only around shot changes. [5]
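The idea of deriving a fingerprint from intensity changes between successive frames can be illustrated with a minimal sketch. It is a simplified illustration, not the published Philips algorithm: the function names, the 2×2 block grid, and the representation of frames as lists of grayscale pixel rows are all assumptions made for this example. Each bit records whether the mean intensity of a block rose from one frame to the next.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities (0-255)


def block_means(frame: Frame, rows: int = 2, cols: int = 2) -> List[float]:
    """Mean intensity of each block in a rows x cols grid over the frame."""
    height, width = len(frame), len(frame[0])
    means = []
    for r in range(rows):
        for c in range(cols):
            ys = range(r * height // rows, (r + 1) * height // rows)
            xs = range(c * width // cols, (c + 1) * width // cols)
            pixels = [frame[y][x] for y in ys for x in xs]
            means.append(sum(pixels) / len(pixels))
    return means


def temporal_fingerprint(frames: List[Frame], rows: int = 2, cols: int = 2) -> List[int]:
    """One bit per block per frame transition: 1 if the block got brighter."""
    bits: List[int] = []
    previous = block_means(frames[0], rows, cols)
    for frame in frames[1:]:
        current = block_means(frame, rows, cols)
        bits.extend(1 if cur > prev else 0 for prev, cur in zip(previous, current))
        previous = current
    return bits


# Hypothetical two-frame clip, 4x4 pixels.
frame0 = [[10] * 4 for _ in range(4)]
frame1 = [[50, 50, 5, 5], [50, 50, 5, 5], [5, 5, 60, 60], [5, 5, 60, 60]]
clip = [frame0, frame1]

# The same clip with every pixel brightened by a constant offset.
brighter = [[[p + 20 for p in row] for row in frame] for frame in clip]
```

Because only the sign of the change in each block is kept, a uniform brightness shift (with no clipping) leaves the bits unchanged in this sketch: `temporal_fingerprint(clip)` and `temporal_fingerprint(brighter)` produce the same bit list.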
Video fingerprinting does not rely on any addition to the video stream. A video fingerprint cannot be removed, because it is not added. In addition, a reference video fingerprint can be created at any point, from any copy of the video. [6] [7]
Video fingerprinting should not be confused with digital watermarking, which relies on inserting identifying features into the content and therefore changes the content. Some watermarks can be inserted so that they are imperceptible to a viewer. A robust watermark can be difficult to detect and remove, but the fact that invisible watermarks can be removed is a significant weakness.
Since watermarks must be inserted into the video, they only identify copies of the particular video made after that point in time. For example, if a watermark is inserted at broadcast it cannot be used to identify copies of the video made before the broadcast.
Watermarks offer some advantages over fingerprinting. A unique watermark can be added to the content at any stage in the distribution process, and multiple independent watermarks can be inserted into the same video content. This can be particularly useful in tracing the history of a copy of a video. Detecting watermarks in a video can indicate the source of an unauthorized copy.
While video fingerprinting systems must search a potentially large database of reference fingerprints, a watermark detection system only has to perform the computation needed to detect the watermark. This computation can be significant, and when multiple watermark keys must be tested, watermarking can fail to scale to the volumes required by commercial applications such as user-generated video services.
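The database search that a fingerprinting system performs can be sketched as a nearest-neighbor lookup under Hamming distance. This is a minimal illustration under stated assumptions: fingerprints are taken to be fixed-length bit strings stored as integers, and the names `best_match`, `references`, and `max_distance` are hypothetical rather than drawn from any particular product.

```python
from typing import Dict, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def best_match(
    query: int, references: Dict[str, int], max_distance: int = 4
) -> Optional[Tuple[str, int]]:
    """Return (title, distance) of the closest reference within max_distance, else None."""
    best: Optional[Tuple[str, int]] = None
    for title, fingerprint in references.items():
        distance = hamming(query, fingerprint)
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (title, distance)
    return best


# Hypothetical 16-bit reference fingerprints.
references = {
    "clip_a": 0b1011001110001111,
    "clip_b": 0b0100110001110000,
}

near_copy = 0b1011001110001101   # clip_a with one bit flipped
unrelated = 0b0000000011111111   # far from both references
```

Here `best_match(near_copy, references)` returns `("clip_a", 1)`, while `best_match(unrelated, references)` returns `None`. The linear scan shown costs time proportional to the number of references, which is why the size of the database matters; a production system would presumably use some form of indexing instead.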
Video fingerprinting is of interest in the digital rights management (DRM) area, particularly regarding the distribution of unauthorized content on the Internet. Video fingerprinting systems enable content providers (e.g., film studios) or publishers (e.g., user-generated content [UGC] sites) to determine whether any of the publisher's files contain content registered with the fingerprint service. If registered content is detected, the publisher can take the appropriate action: remove it from the site, monetize it, add correct attribution, etc.
Video fingerprinting may be used for broadcast monitoring (e.g., advertisement monitoring, news monitoring) and general media monitoring. Broadcast monitoring solutions can provide content providers and content owners with playlists showing when and where their video content has been used. A typical application is described in "Video Fingerprinting Use Case for Television Productions and Broadcasters". [8]
From a content provider's point of view, both video and audio fingerprinting need to be used in most applications. [9] Consider the online publication of "mash-ups." Mash-ups can consist of content from several sources that are compiled together and set to a unique audio track. Since the audio track is different from the original version, the copyrighted material in these mash-ups would go undetected using only audio fingerprinting techniques. In other cases, mash-ups consist of the soundtrack from a commercial video source, like a movie, used with a different video stream. In this case, a video fingerprint would not match, but an audio fingerprint would. When the audio and video streams are not from the same masterwork, the question of fair use may arise.
This discrepancy has real applications in the global online community in terms of film distribution. Films shown in countries other than their country of origin are often dubbed into other languages. This change in audio renders the films virtually unrecognizable by audio fingerprinting technologies unless copies of all known versions have been fingerprinted. Employing video fingerprinting, however, enables the content owner to fingerprint just once and have each subsequent version remain recognizable. If the customer wishes to know which language soundtrack is present on a particular video, then an audio fingerprint must be used.
Another use is for companies to track the leak of confidential recordings or videos, or for celebrities to track the presence on the Internet of unauthorized videos (for instance, videos of themselves taken by amateurs using a camcorder or a mobile phone).
Video fingerprinting applied to smart TV is enabling an emerging category of interactive television applications. Television devices integrated with real-time fingerprinting software can automatically recognize the video content on-screen in order to enable interactive features and applications on top of the programming. Entrepreneur Mark Cuban has made investments to leverage this technology to create interactive features for his cable networks HDNet and its successor AXS. [10]
Video fingerprints can also be used to create content-aware video advertising. In one implementation, if a video service provider distributes content that contains a nationally broadcast TV commercial, a localized overlay of text or graphics may be applied to the national commercial, so that it carries local information specific to that commercial. For example, if the national commercial is a 15-second spot for a Ford Explorer SUV, fingerprint technology allows local operators to overlay local dealership information (phone number, promotion, etc.) on the 15-second commercial, creating a localized commercial for the SUV that appears to be targeted only at the local audience.
Video fingerprinting is also used by authorities to track the distribution of illegal content such as happy slapping videos, terrorist videos, and videos related to child sexual abuse.
In 2008 the Dutch company Ziuz, together with the Dutch Police, TNO, and the University of Amsterdam, developed video fingerprinting for detecting videos related to child sexual abuse. [11] [12]
In April 2014 the British company Friend MTS Ltd. donated its video fingerprinting technology (known as F1) to the International Centre for Missing & Exploited Children (ICMEC) to help increase the efficiency of child pornography investigations and to halt the continued sharing of similar files over the internet. [6] [13] ICMEC distributes the technology to law enforcement agencies, software providers and online service providers to hinder the spread of such material. [14] [15]
A digital video recorder (DVR), also referred to as a personal video recorder (PVR) particularly in Canadian and British English, is an electronic device that records video in a digital format to a disk drive, USB flash drive, SD memory card, SSD or other local or networked mass storage device. The term includes set-top boxes (STB) with direct to disk recording, portable media players and TV gateways with recording capability, and digital camcorders. Personal computers are often connected to video capture devices and used as DVRs; in such cases the application software used to record video is an integral part of the DVR. Many DVRs are classified as consumer electronic devices. Similar small devices with built-in displays and SSD support may be used for professional film or video production, as these recorders often do not have the limitations that built-in recorders in cameras have, offering wider codec support, the removal of recording time limitations and higher bitrates.
A digital on-screen graphic, or digitally originated graphic, is a watermark-like station logo that most television broadcasters overlay over a portion of the screen area of their programs to identify the channel. It is thus a form of permanent visual station identification, increasing brand recognition and asserting ownership of the video signal.
An overlay network is a computer network that is layered on top of another network. The concept of overlay networking is distinct from the traditional model of OSI layered networks, and almost always assumes that the underlay network is an IP network of some kind.
A digital watermark is a kind of marker covertly embedded in a noise-tolerant signal such as audio, video or image data. It is typically used to identify ownership of the copyright of such a signal. Digital watermarking is the process of hiding digital information in a carrier signal; the hidden information should, but does not need to, contain a relation to the carrier signal. Digital watermarks may be used to verify the authenticity or integrity of the carrier signal or to show the identity of its owners. It is prominently used for tracing copyright infringements and for banknote authentication.
MediaMax, sometimes referred to as MediaMax CD-3, is a software package created by SunnComm which was sold as a form of copy protection for compact discs. It was used by the record label RCA Records/BMG, and targets both Microsoft Windows and Mac OS X. Elected officials and computer security experts regard the software as a form of malware since its purpose is to intercept and inhibit normal computer operation without the user's authorization. MediaMax received media attention in late 2005 in fallout from the Sony XCP copy protection scandal.
P2PTV refers to peer-to-peer (P2P) software applications designed to redistribute video streams in real time on a P2P network; the distributed video streams are typically TV channels from all over the world but may also come from other sources. The draw to these applications is significant because they have the potential to make any TV channel globally available by any individual feeding the stream into the network where each peer joining to watch the video is a relay to other peer viewers, allowing a scalable distribution among a large audience with no incremental cost for the source.
The High-Definition Audio-Video Network Alliance (HANA) was a cross-industry collaboration of members addressing the end-to-end needs of connected, HD, home entertainment products and services. Leading companies formed the organization from the four industries most affected by the HD revolution: content providers, consumer electronics, service providers, and information technology. HANA created design guidelines for secure high-definition audio-video networks that would speed the creation of new, high-quality, easy-to-use HD products. HANA membership was open to all companies involved in the digital entertainment industry. HANA was dissolved in September 2009, and the 1394 Trade Association assumed control of all HANA-generated intellectual property.
Self Protecting Digital Content (SPDC) is a copy protection architecture designed by Cryptography Research, Inc. for Blu-ray discs.
A tag editor is an app that can add, edit, or remove embedded metadata on multimedia file formats. Content creators, such as musicians, photographers, podcasters, and video producers, may need to properly label and manage their creations, adding such details as title, creator, date of creation, and copyright notice.
The International Centre for Missing & Exploited Children (ICMEC), headquartered in Alexandria, Virginia, USA, with a regional presence in the United Kingdom, Europe, Turkey, Africa, Canada, Latin America, Caribbean, Southeast Asia, India, Japan, South Korea, Taiwan and Australasia, is a private 501(c)(3) non-governmental, nonprofit global organization. It combats child sexual exploitation, child pornography, child trafficking and child abduction.
In computer science, a fingerprinting algorithm is a procedure that maps an arbitrarily large data item to a much shorter bit string, its fingerprint, that uniquely identifies the original data for all practical purposes just as human fingerprints uniquely identify people for practical purposes. This fingerprint may be used for data deduplication purposes. This is also referred to as file fingerprinting, data fingerprinting, or structured data fingerprinting.
An audio watermark is a unique electronic identifier embedded in an audio signal, typically used to identify ownership of copyright. It is similar to a watermark on a photograph.
A Digital Cinema Package (DCP) is a collection of digital files used to store and convey digital cinema (DC) audio, image, and data streams.
Cinavia, originally called Verance Copy Management System for Audiovisual Content (VCMS/AV), is an analog watermarking and steganography system under development by Verance since 1999, and released in 2010. In conjunction with the existing Advanced Access Content System (AACS) digital rights management (DRM), inclusion of Cinavia watermarking detection support became mandatory for all consumer Blu-ray Disc players from 2012.
Video copy detection is the process of detecting illegally copied videos by analyzing them and comparing them to original content.
An acoustic fingerprint is a condensed digital summary, a digital fingerprint, deterministically generated from an audio signal, that can be used to identify an audio sample or quickly locate similar items in a music database.
PhotoDNA is a proprietary image-identification and content filtering technology widely used by online service providers.
Perceptual hashing is the use of a fingerprinting algorithm that produces a snippet, hash, or fingerprint of various forms of multimedia. A perceptual hash is a type of locality-sensitive hash: its outputs are similar when the features of the multimedia are similar. This is in contrast to cryptographic hashing, which relies on the avalanche effect, in which a small change in the input value creates a drastic change in the output value. Perceptual hash functions are widely used in finding cases of online copyright infringement as well as in digital forensics, because the correlation between hashes allows similar data to be found.
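The contrast between the two kinds of hash can be shown with a small sketch. It uses a simple "average hash" (one bit per pixel, set when the pixel is brighter than the mean) as a stand-in for a perceptual hash, and SHA-256 from Python's standard `hashlib` module as the cryptographic hash. The average hash is chosen here only for illustration; the eight-pixel "image" and the helper names are assumptions for this example.

```python
import hashlib
from typing import List


def average_hash(pixels: List[int]) -> List[int]:
    """Toy perceptual hash: 1 where a pixel is brighter than the image mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def sha256_hex(pixels: List[int]) -> str:
    """Cryptographic hash of the raw pixel bytes (values must be 0-255)."""
    return hashlib.sha256(bytes(pixels)).hexdigest()


# Hypothetical 8-pixel grayscale image and a copy with one pixel slightly changed.
original = [10, 200, 30, 220, 15, 210, 25, 230]
altered = [12, 200, 30, 220, 15, 210, 25, 230]

perceptual_distance = hamming(average_hash(original), average_hash(altered))
digests_differ = sha256_hex(original) != sha256_hex(altered)
```

For this input the two average hashes are identical (`perceptual_distance` is 0), whereas the SHA-256 digests differ, so only the perceptual hash indicates that the two inputs are near-duplicates.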
Automatic content recognition (ACR) is a technology used to identify content played on a media device or presented within a media file. Devices with ACR can allow for the collection of content consumption information automatically at the screen or speaker level itself, without any user-based input or search efforts. This information may be collected for purposes such as personalized advertising, content recommendations, or sale to companies that aggregate customer data.
Audible Magic Corporation is a Los Gatos, California-based company that provides content identification services to social networks, record labels, music publishers, television studios, and movie studios. The company also provides digital platform music management services for Internet radio, subscription music services, on-demand streaming, and fitness and gaming applications. The services help companies identify and protect copyrighted content, manage rights and monetize media.