Fawkes (image cloaking software)

Last updated
Facial recognition works by pinpointing unique dimensions of facial features, which are then rendered as a vector graphic image of the face. Facial Recognition22.jpg
Facial recognition works by pinpointing unique dimensions of facial features, which are then rendered as a vector graphic image of the face.

Fawkes is a facial image cloaking software created by the SAND (Security, Algorithms, Networking and Data) Laboratory of the University of Chicago. [1] It is a free tool that is available as a standalone executable. [2] The software creates small alterations in images using artificial intelligence to protect the images from being recognized and matched by facial recognition software. [3] The goal of the Fawkes program is to enable individuals to protect their own privacy from large data collection. As of May 2022, Fawkes v1.0 has surpassed 840,000 downloads. [4] Eventually, the SAND Laboratory hopes to implement the software on a larger scale to combat unwarranted facial recognition software. [5]

Contents

History

The Fawkes program was named after the fictional protagonist from the movie and comic V for Vendetta, who drew inspiration from historical figure Guy Fawkes. [6] The Fawkes proposal was initially presented at a USENIX Security conference in August 2020 where it received approval and was launched shortly after. The most recent version available for download, Fawkes v1.0, was released in April 2021, and is still being updated in 2022. [4] The founding team is led by Emily Wenger and Shawn Shan, PhD students at the University of Chicago. Additional support from Jiayun Zhang and Huiying Li, with faculty advisors Ben Zhao and Heather Zheng, contributed to the creation of the software. [7] The team cites nonconsensual data collection, specifically done by such companies as Clearwater AI, as being the prime inspiration behind the creation of Fawkes. [8]

Techniques

The methods that Fawkes uses can be identified as similar to adversarial machine learning. This method trains a facial recognition software using already altered images. This results in the software not being able to match the altered image with the actual image, as it does not recognize them as the same image. Fawkes also uses data poisoning attacks, which change the data set used to train certain deep learning models. Fawkes utilizes two types of data poisoning techniques: clean label attacks and model corruption attacks. The creators of Fawkes identify, that using sybil images can increase the effectiveness of their software against recognition software products. Sybil images are images that do not match the person they are attributed to. This confuses the facial recognition software and leads to misidientification which also helps the efficacy of image cloaking. Privacy preserving machine learning uses techniques similar to the Fawkes software but opts for differentially private model training, which helps to keep information in the data set private. [3]

Applications

Fawkes image cloaking can be used on images and apps that are used every day. However, the efficacy of the software wanes if there are cloaked and uncloaked images that the facial recognition software can utilize. The image cloaking software has been tested on high-powered facial recognition software with varied results. [3] A similar facial cloaking software to Fawkes is called LowKey. LowKey also alters images on a visual level, but these alterations are much more noticeable compared to the Fawkes software. [2]

See also

Adversarial machine learning

Facial recognition system

USENIX

Related Research Articles

A CAPTCHA is a type of challenge–response test used in computing to determine whether the user is human in order to deter bot attacks and spam.

<span class="mw-page-title-main">Machine learning</span> Study of algorithms that improve automatically through experience

Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can effectively generalize and thus perform tasks without explicit instructions. Recently, generative artificial neural networks have been able to surpass many previous approaches in performance. Machine learning approaches have been applied to large language models, computer vision, speech recognition, email filtering, agriculture and medicine, where it is too costly to develop algorithms to perform the needed tasks.

<span class="mw-page-title-main">Facial recognition system</span> Technology capable of matching a face from an image against a database of faces

A facial recognition system is a technology potentially capable of matching a human face from a digital image or a video frame against a database of faces. Such a system is typically employed to authenticate users through ID verification services, and works by pinpointing and measuring facial features from a given image.

Human image synthesis is technology that can be applied to make believable and even photorealistic renditions of human-likenesses, moving or still. It has effectively existed since the early 2000s. Many films using computer generated imagery have featured synthetic images of human-like characters digitally composited onto the real or other simulated film material. Towards the end of the 2010s deep learning artificial intelligence has been applied to synthesize images and video that look like humans, without need for human assistance, once the training phase has been completed, whereas the old school 7D-route required massive amounts of human work .

<span class="mw-page-title-main">Ensemble learning</span> Statistics and machine learning technique

In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Unlike a statistical ensemble in statistical mechanics, which is usually infinite, a machine learning ensemble consists of only a concrete finite set of alternative models, but typically allows for much more flexible structure to exist among those alternatives.

Synthetic data is information that's artificially generated rather than produced by real-world events. Typically created using algorithms, synthetic data can be deployed to validate mathematical models and to train machine learning models.

<span class="mw-page-title-main">Adversarial machine learning</span> Research field that lies at the intersection of machine learning and computer security

Adversarial machine learning is the study of the attacks on machine learning algorithms, and of the defenses against such attacks. A survey from May 2020 exposes the fact that practitioners report a dire need for better protecting machine learning systems in industrial applications.

DeepFace is a deep learning facial recognition system created by a research group at Facebook. It identifies human faces in digital images. The program employs a nine-layer neural network with over 120 million connection weights and was trained on four million images uploaded by Facebook users. The Facebook Research team has stated that the DeepFace method reaches an accuracy of 97.35% ± 0.25% on Labeled Faces in the Wild (LFW) data set where human beings have 97.53%. This means that DeepFace is sometimes more successful than human beings. As a result of growing societal concerns Meta announced that it plans to shut down Facebook facial recognition system, deleting the face scan data of more than one billion users. This change will represent one of the largest shifts in facial recognition usage in the technology’s history. Facebook planned to delete by December 2021 more than one billion facial recognition templates, which are digital scans of facial features. However, it did not plan to eliminate DeepFace which is the software that powers the facial recognition system. The company has also not ruled out incorporating facial recognition technology into future products, according to Meta spokesperson.

<span class="mw-page-title-main">Generative adversarial network</span> Deep learning method

A generative adversarial network (GAN) is a class of machine learning framework and a prominent framework for approaching generative AI. The concept was initially developed by Ian Goodfellow and his colleagues in June 2014. In a GAN, two neural networks contest with each other in the form of a zero-sum game, where one agent's gain is another agent's loss.

<span class="mw-page-title-main">Artificial intelligence in healthcare</span> Overview of the use of artificial intelligence in healthcare

Artificial intelligence in healthcare is a term used to describe the use of machine-learning algorithms and software, or artificial intelligence (AI), to copy human cognition in the analysis, presentation, and understanding of complex medical and health care data, or to exceed human capabilities by providing new ways to diagnose, treat, or prevent disease. Specifically, AI is the ability of computer algorithms to approximate conclusions based solely on input data.

<span class="mw-page-title-main">Algorithmic bias</span> Technological phenomenon with social implications

Algorithmic bias describes systematic and repeatable errors in a computer system that create "unfair" outcomes, such as "privileging" one category over another in ways different from the intended function of the algorithm.

Local differential privacy (LDP) is a model of differential privacy with the added requirement that if an adversary has access to the personal responses of an individual in the database, that adversary will still be unable to learn much of the user's personal data. This is contrasted with global differential privacy, a model of differential privacy that incorporates a central aggregator with access to the raw data.

Since the advent of differential privacy, a number of systems supporting differentially private data analyses have been implemented and deployed. This article tracks real-world deployments, production software packages, and research prototypes.

Amazon Rekognition is a cloud-based software as a service (SaaS) computer vision platform that was launched in 2016. It has been sold to, and used by, a number of United States government agencies, including U.S. Immigration and Customs Enforcement (ICE) and Orlando, Florida police, as well as private entities.

Clearview AI is an American facial recognition company, providing software to law enforcement and government agencies and other organizations. The company's algorithm matches faces to a database of more than 20 billion images collected from the Internet, including social media applications. Founded by Hoan Ton-That and Richard Schwartz, the company maintained a low profile until late 2019, when its usage by law enforcement was reported. U.S. by police have used the software to apprehend suspected criminals. Clearview's practices have lead to fines by EU nations for violating privacy laws and investigations in the U.S. and other countries as well.

DataWorks Plus LLC is a privately held biometrics systems integrator based in Greenville, South Carolina. The company started in 2000 and originally focused on mugshot management, adding facial recognition beginning in 2005. Brad Bylenga is the CEO, and Todd Pastorini is the EVP and GM. Usage of the technology by police departments has resulted in wrongful arrests.

An audio deepfake is a type of artificial intelligence used to create convincing speech sentences that sound like specific people saying things they did not say. This technology was initially developed for various applications to improve human life. For example, it can be used to produce audiobooks, and also to help people who have lost their voices to get them back. Commercially, it has opened the door to several opportunities. This technology can also create more personalized digital assistants and natural-sounding text-to-speech as well as speech translation services.

Identity replacement technology is any technology that is used to cover up all or parts of a person's identity, either in real life or virtually. This can include face masks, face authentication technology, and deepfakes on the Internet that spread fake editing of videos and images. Face replacement and identity masking are used by either criminals or law-abiding citizens. Identity replacement tech, when operated on by criminals, leads to heists or robbery activities. Law-abiding citizens utilize identity replacement technology to prevent government or various entities from tracking private information such as locations, social connections, and daily behaviors.

References

  1. James Vinvent (4 August 2020). "Cloak your photos with this AI privacy tool to fool facial recognition". The Verge. Retrieved 18 May 2021.
  2. 1 2 Ledford, B 2021, An Assessment of Image-Cloaking Techniques Against Automated Face Recognition for Biometric Privacy, Masters Thesis, Florida Institute of Technology, Melbourne Florida, viewed 27 July 27 2022, https://repository.lib.fit.edu/handle/11141/3478.
  3. 1 2 3 Shan, Shawn; Wenger, Emily; Zhang, Jiayun; Li, Huiying; Zheng, Haitao; Zhao, Ben Y. (2020-06-22). "Fawkes: Protecting Privacy against Unauthorized Deep Learning Models". arXiv: 2002.08327 [cs.CR].
  4. 1 2 "Fawkes". sandlab.cs.uchicago.edu. Retrieved 2022-07-28.
  5. Hill, Kashmir (2020-08-03). "This Tool Could Protect Your Photos From Facial Recognition". The New York Times. ISSN   0362-4331 . Retrieved 2022-07-28.
  6. Grad, Peter; Xplore, Tech. "Image cloaking tool thwarts facial recognition programs". techxplore.com. Retrieved 2022-07-28.
  7. "UChicago CS Researchers Create New Protection Against Facial Recognition". Department of Computer Science. Retrieved 2022-07-28.
  8. Shan, Shawn; Wenger, Emily; Zhang, Jiayun; Li, Huiying; Zheng, Haitao; Zhao, Ben Y. (2020-06-22). "Fawkes: Protecting Privacy against Unauthorized Deep Learning Models". arXiv: 2002.08327 [cs.CR].