Content Authenticity Initiative

The Content Authenticity Initiative (CAI) is an association founded in November 2019 by Adobe, the New York Times and Twitter. [1] [2] [3] The CAI promotes an industry standard for provenance metadata defined by the C2PA. The CAI cites curbing disinformation as one motivation for its activities. [4] [5] [6] [7]

Cooperation with the C2PA

Together with Arm, BBC, Intel, Microsoft and Truepic, Adobe co-founded the "Coalition for Content Provenance and Authenticity" (C2PA) in February 2021. The C2PA is tasked with formulating an open, royalty-free technical standard that serves as a basis for the C2PA members' efforts against disinformation. While the C2PA's work covers the technical aspects of implementing a provenance metadata standard, the CAI sees its task as disseminating and promoting the standard. [8]

Provenance of information

The structure of C2PA metadata in a file with multiple Manifests generated when the picture was recorded, edited and published

The procedures proposed by CAI and C2PA address the widespread occurrence of disinformation [9] [10] with a set of additional data (metadata) containing details about the provenance of information displayed on a digital device. Such information can be, for example, a photo, video, sound or text file. The C2PA metadata for this information can include, among other things, the publisher of the information, the device used to record it, the location and time of the recording, and the editing steps that altered it. To ensure that the C2PA metadata cannot be changed unnoticed, it is secured with hash codes and certified digital signatures. The same applies to the main content of the information, such as a picture or a text: a hash code of that data is stored in the C2PA metadata section and then, as part of that metadata, secured with the digital signature.
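
The binding described above can be illustrated with a minimal sketch. It is not the C2PA format or any CAI tool: the field names, the key, and the function `build_signed_manifest` are hypothetical, and an HMAC from the Python standard library stands in for the certified public-key signature that the standard actually relies on. The sketch only shows the order of operations: hash the main content, store that hash in the metadata, then sign the metadata as a whole.

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration only. Real C2PA signing uses a private
# key backed by a certificate; HMAC is merely a standard-library stand-in.
SIGNING_KEY = b"demo-key-not-for-real-use"


def build_signed_manifest(content: bytes, claims: dict) -> dict:
    """Bind provenance claims to content by hashing, then sign the result."""
    manifest = dict(claims)
    # Store a hash of the main content inside the metadata.
    manifest["content_sha256"] = hashlib.sha256(content).hexdigest()
    # Serialize deterministically so equal manifests sign identical bytes.
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"manifest": manifest, "signature": signature}


photo = b"raw image bytes"
claims = {"publisher": "Example TV", "device": "Example Camera"}
signed = build_signed_manifest(photo, claims)
```

Because the content hash is part of the signed metadata, a single signature covers both the provenance claims and the main content.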

Securing metadata and the main content with certified signatures enables users to reliably identify the provenance of a file they are currently viewing. If the C2PA metadata names, for example, a certain TV station as the publisher of a file, it is very unlikely that the file originated from another source.

Files with C2PA-compliant metadata that are copied from a publisher's website and then published unaltered on social media (or elsewhere) still retain the full set of tamper-evident provenance information. Users seeing that content on social media can examine such a file with an online tool offered by the CAI [11] or, if present, with C2PA-compliant inspection tools offered by the social media site. Standard-compliant tools will detect whether the file or the metadata has been modified without authorization. If there were no such modifications, the user can trust the metadata as well as the main content to be exactly as they were published.
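
The kind of check such an inspection tool performs can be sketched as follows. This is an illustration under stated assumptions, not the CAI's Verify tool or a C2PA implementation: the names `sign` and `verify`, the key, and the manifest fields are hypothetical, and an HMAC again stands in for a certified digital signature. Verification has two steps: confirm that the signature matches the metadata, then confirm that the content matches the hash recorded in that metadata.

```python
import hashlib
import hmac
import json

# Hypothetical key; a stand-in for a certified public/private key pair.
SIGNING_KEY = b"demo-key-not-for-real-use"


def sign(manifest: dict) -> str:
    """Sign a deterministic serialization of the manifest (HMAC stand-in)."""
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def verify(content: bytes, manifest: dict, signature: str) -> bool:
    """Return True only if neither the metadata nor the content was altered."""
    # Step 1: the signature must match the metadata as received.
    if not hmac.compare_digest(sign(manifest), signature):
        return False
    # Step 2: the content must match the hash stored in the signed metadata.
    actual = hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(actual, manifest.get("content_sha256", ""))


original = b"raw image bytes"
manifest = {
    "publisher": "Example TV",
    "content_sha256": hashlib.sha256(original).hexdigest(),
}
signature = sign(manifest)

unaltered_ok = verify(original, manifest, signature)
edited_content_ok = verify(b"edited image bytes", manifest, signature)
forged_manifest_ok = verify(original, {**manifest, "publisher": "Someone Else"}, signature)
```

In this sketch only the unaltered copy passes. Changing the content breaks the hash comparison, and changing the metadata breaks the signature comparison.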

The methods proposed by CAI and C2PA cannot establish whether a piece of content is "true", i.e., contains authentic information that faithfully reflects reality. Instead, C2PA-compliant metadata only offers reliable information about the origin of a piece of information. Whether users want to trust this information depends solely on their trust in its sources.

Open Source Tools offered

The C2PAtool

Members of the CAI

As of August 2022, the list [12] of CAI members includes more than 200 entries. In addition to the three founding members Adobe, the New York Times and Twitter, these include Arm, BBC, Microsoft, Nikon, Qualcomm and The Washington Post.

References

  1. Robertson, Adi (2019-11-04). "Adobe and Twitter are designing a system for permanently attaching artists' names to pictures". The Verge. Retrieved 2022-06-30.
  2. Cade, DL (2019-11-06). "Adobe Wants to Help 'Authenticate' Your Photos: What Should Photographers Think?". PetaPixel. Retrieved 2022-06-30.
  3. Adobe Communications Team. "Introducing the Content Authenticity Initiative". Adobe Blog. Retrieved 2022-06-29.
  4. "Reuters joins the Content Authenticity Initiative to help combat misinformation and disinformation". Reuters News Agency. Retrieved 2022-06-30.
  5. "Using Secure Sourcing to Combat Misinformation". rd.nytimes.com. Retrieved 2022-06-30.
  6. Pratap, Aayushi. "Deepfake Epidemic Is Looming—And Adobe Is Preparing For The Worst". Forbes. Retrieved 2022-06-30.
  7. "Content Authenticity Initiative". Content Authenticity Initiative. Retrieved 2022-06-29.
  8. "FAQ". Content Authenticity Initiative. Retrieved 2022-06-29.
  9. "Measuring the reach of "fake news" and online disinformation in Europe". Reuters Institute for the Study of Journalism. Retrieved 2022-08-16.
  10. "Four key ways disinformation is spread online". World Economic Forum. Retrieved 2022-08-16.
  11. "Verify". verify.contentauthenticity.org. Retrieved 2022-08-16.
  12. "Members". Content Authenticity Initiative. Retrieved 2022-06-29.