Reverse image search

Reverse image search using Google Images.

Reverse image search is a content-based image retrieval (CBIR) query technique in which the user provides the CBIR system with a sample image on which to base its search; in information retrieval terms, the sample image serves as the search query. In particular, reverse image search is characterized by a lack of search terms. This removes the need for a user to guess at keywords or terms that may or may not return a correct result. Reverse image search also allows users to discover content related to a specific sample image [1] or the popularity of an image, and to discover manipulated versions and derivative works. [2]


A visual search engine is a search engine designed to search for information on the World Wide Web through a reverse image search. Information may consist of web pages, locations, other images and other types of documents. This type of search engine is mostly used on the mobile Internet, with a photograph of an unknown object serving as the search query; a building in a foreign city is a typical example. These search engines often use content-based image retrieval techniques.

A visual search engine searches images for patterns that its algorithm can recognize and returns related information based on pattern matching.

Uses

Reverse image search may be used to: [3]

Algorithms

Commonly used reverse image search algorithms include: [4]

Visual information searchers

Screenshot of results shown by the image searcher through example GOS

An image search engine is a search engine that is designed to find an image. The search can be based on keywords, a picture, or a web link to a picture. The results depend on the search criterion, such as metadata, distribution of color, shape, etc., and the search technique which the browser uses.

Diagram of a search realized through example based on detectable regions from an image

Image search techniques

Two techniques currently used in image search:

Search by metadata: Image search is based on comparison of metadata associated with the image as keywords, text, etc. and it is obtained by employing a set of images sorted by relevance. The metadata associated with each image can reference the title of the image, format, color, etc. and can be generated manually or automatically. This metadata generation process is called audiovisual indexing.

Search by example: In this technique, also called reverse image search, the search results are obtained through the comparison between images using content-based image retrieval computer vision techniques. During the search the content of the image is examined, such as color, shape, texture or any visual information that can be extracted from the image. This system requires a higher computational complexity, but is more efficient and reliable than search by metadata.

There are image searchers that combine both search techniques. For example, the first search is done by entering a text. The images obtained are then used to refine the search.
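The two techniques, and their combination, can be illustrated with a minimal sketch. It assumes a toy in-memory index in which each image is a list of RGB pixels with hand-written tags; the function names, the histogram-intersection measure and the sample data are illustrative assumptions, not the method of any particular search engine.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram over coarsely quantized RGB colors."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(v, h2.get(color, 0.0)) for color, v in h1.items())

def search_by_metadata(index, keyword):
    """Return the names of images whose tags contain the keyword."""
    keyword = keyword.lower()
    return [name for name, entry in index.items()
            if keyword in [tag.lower() for tag in entry["tags"]]]

def search_by_example(index, query_pixels, candidates=None):
    """Rank images (optionally a pre-filtered subset) by color similarity."""
    query_hist = color_histogram(query_pixels)
    names = candidates if candidates is not None else list(index)
    scored = [(histogram_intersection(query_hist,
                                      color_histogram(index[n]["pixels"])), n)
              for n in names]
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical toy data: three tiny "images" of eight pixels each.
RED, GREEN, BLUE = (250, 10, 10), (10, 250, 10), (10, 10, 250)
index = {
    "sunset.jpg": {"tags": ["beach", "sunset"], "pixels": [RED] * 6 + [BLUE] * 2},
    "ocean.jpg": {"tags": ["beach", "sea"], "pixels": [BLUE] * 7 + [GREEN]},
    "forest.jpg": {"tags": ["trees"], "pixels": [GREEN] * 8},
}

# Combined search: a text query first, then refinement with a sample image.
candidates = search_by_metadata(index, "beach")
ranked = search_by_example(index, [BLUE] * 8, candidates)
print(ranked)
```

A real system would use far richer features (shape, texture, learned embeddings) than a color histogram, but the two-stage structure is the same.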

A video search engine is a search engine designed to search for video on the web. Some video search engines search content hosted elsewhere on the Internet, while others host the videos that they search. Some also allow the format or the length of the video to be used as search parameters. The results usually come with a thumbnail of the video.

Video search techniques

Currently, almost all video searchers are based on keywords (search by metadata) to perform searches. These keywords can be found in the title of the video, text accompanying the video or can be defined by the author. An example of this type of search is YouTube.

3D Models searcher

A 3D model search engine aims to find the file of a 3D model in a database or on a network. At first glance implementing this type of search engine may seem unnecessary, but as the volume of documents on the Internet keeps growing, indexing this information becomes more necessary every day.

3D Models search techniques

3D models search techniques

These search engines have traditionally relied on text (keywords or tags) contributed by the authors of the indexed material or by Internet users. Because this is not always effective, recent research has investigated search engines that combine text search with search by comparison against 2D drawings, 3D drawings and 3D models.

Princeton University has developed a search engine that combines all these parameters to perform the search, thus increasing the efficiency of search. [6] The 3DfindIT.com portal also provides a 3D model search engine based on sketches, drawings, text, etc. (https://www.3dfindit.com/).

A mobile image search engine is a type of search engine designed exclusively for mobile phones, through which a user can find information on the Internet using an image taken with the phone itself or using keywords. Mobile Visual Search (MVS) solutions allow image recognition capabilities to be integrated into branded mobile applications, bridging the gap between online and offline media by linking customers to digital content.

Introduction

Mobile phones have evolved into powerful image and video processing devices equipped with high-resolution cameras, color displays, and hardware-accelerated graphics. They are also increasingly equipped with a global positioning system and connected to broadband wireless networks. All this enables a new class of applications that use the camera phone to initiate search queries about objects in visual proximity to the user (Figure 1). Such applications can be used, e.g., for identifying products, comparison shopping, finding information about movies, compact disks (CDs), real estate, print media, or artworks.

Process

Typically, this type of search engine uses query by example (image query by example) techniques, which compare the content, shape, texture and color of the image against a database and then deliver approximate results for the query.

The process used in these searches in the mobile phones is as follows:

First, the image is sent to the server application. On the server, the image is analyzed by several analysis units, each specialized in a different kind of content that an image may contain. Each unit then decides whether or not the submitted image contains content from its speciality.

Once this procedure is complete, a central computer analyzes the data and creates a results page sorted by the effectiveness of each unit, which is finally sent to the mobile phone.
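The server-side process described above can be sketched as a set of specialized analyzers followed by an aggregation step. Everything in this sketch is hypothetical: the analyzer names, the fixed confidence values, and the use of a dictionary of pre-extracted features in place of real image analysis are assumptions made only to show the control flow.

```python
def text_analyzer(image):
    """Hypothetical analyzer: reports printed text found in the image."""
    if image.get("text"):
        return {"analyzer": "text", "match": image["text"], "confidence": 0.90}
    return None  # the image contains nothing in this analyzer's speciality

def barcode_analyzer(image):
    """Hypothetical analyzer: reports a decoded barcode, if present."""
    if image.get("barcode"):
        return {"analyzer": "barcode", "match": image["barcode"], "confidence": 0.99}
    return None

def landmark_analyzer(image):
    """Hypothetical analyzer: reports a recognized building or landmark."""
    if image.get("landmark"):
        return {"analyzer": "landmark", "match": image["landmark"], "confidence": 0.60}
    return None

ANALYZERS = [text_analyzer, barcode_analyzer, landmark_analyzer]

def handle_query(image):
    """Run every analyzer, keep those that matched, sort by confidence."""
    results = [r for r in (analyze(image) for analyze in ANALYZERS) if r is not None]
    return sorted(results, key=lambda r: r["confidence"], reverse=True)

# A photo of a product label: some text and a barcode, but no landmark.
page = handle_query({"text": "Espresso", "barcode": "4006381333931"})
for result in page:
    print(result["analyzer"], result["match"])
```

In a deployed service the analyzers would typically run in parallel, and the sorted list would be rendered as the results page returned to the phone.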

Yandex

Yandex Images offers a global reverse image and photo search. The site uses standard content-based image retrieval (CBIR) technology used by many other sites, but additionally uses artificial intelligence-based technology to locate further results based on the query. [7] Users can drag and drop images onto the toolbar for the site to search the Internet for similar-looking images. Yandex Images searches some obscure social media sites in addition to more common ones, offering content owners a means of tracking plagiarism of their images or photos.

Google Images

Google's Search by Image is a feature that uses reverse image search and allows users to search for related images by uploading an image or copying the image URL. Google accomplishes this by analyzing the submitted picture and constructing a mathematical model of it. It is then compared with other images in Google's databases before returning matching and similar results. When available, Google also uses metadata about the image such as description. In 2022 the feature was replaced by Google Lens as the default visual search method on Google, and the old Search by Image function remains available within Google Lens. [8]

TinEye

TinEye is a search engine specialized for reverse image search. Upon submitting an image, TinEye creates a "unique and compact digital signature or fingerprint" of the image and matches it with other indexed images. [9] This procedure is able to match even heavily edited versions of the submitted image, but will not usually return similar images in the results. [10]
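The idea of a compact fingerprint that survives edits can be illustrated with a simple perceptual hash. The sketch below uses an average hash, which is a generic textbook technique and not TinEye's algorithm; it assumes the image is already a grayscale grid (a list of rows of values from 0 to 255) whose dimensions are multiples of the hash size.

```python
def average_hash(gray, hash_size=8):
    """Return a hash_size*hash_size-bit fingerprint of a grayscale image."""
    block_h = len(gray) // hash_size
    block_w = len(gray[0]) // hash_size
    cells = []
    for by in range(hash_size):
        for bx in range(hash_size):
            block = [gray[y][x]
                     for y in range(by * block_h, (by + 1) * block_h)
                     for x in range(bx * block_w, (bx + 1) * block_w)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for cell in cells:
        # One bit per cell: is this region brighter than the image average?
        bits = (bits << 1) | (1 if cell > mean else 0)
    return bits

def hamming_distance(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

# A 16x16 horizontal gradient, a brightened copy, and an inverted copy.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[min(255, v + 20) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

print(hamming_distance(average_hash(original), average_hash(brighter)))
print(hamming_distance(average_hash(original), average_hash(inverted)))
```

Fingerprints of an image and its edited copy differ in few bits, so matching reduces to finding indexed fingerprints within a small Hamming distance of the query.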

Pixsy

Pixsy reverse image search technology detects image matches [11] on the public internet for images uploaded to the Pixsy platform. [12] New matches are automatically detected and alerts are sent to the user. For unauthorized use, Pixsy offers a compensation recovery service [13] [14] for commercial use of the image owner's work. Pixsy partners with over 25 law firms and attorneys around the world to bring resolution for copyright infringement. Pixsy is the strategic image monitoring service for the Flickr platform and users. [15]

eBay

eBay ShopBot uses reverse image search to find products from a user-uploaded photo. eBay uses a ResNet-50 network for category recognition; image hashes are stored in Google Bigtable; Apache Spark jobs run on Google Cloud Dataproc for image hash extraction; and the image ranking service is deployed with Kubernetes. [16]

SK Planet

SK Planet uses reverse image search to find related fashion items on its e-commerce website. It developed its vision encoder network based on the TensorFlow Inception-v3, with speed of convergence and generalization for production usage. A recurrent neural network is used for multi-class classification, and fashion-product region-of-interest detection is based on Faster R-CNN. SK Planet's reverse image search system was built in less than 100 man-months. [17]

Alibaba

Alibaba released the Pailitao application in 2014. Pailitao (Chinese: 拍立淘, literally "shopping through a camera") allows users to search for items on Alibaba's e-commerce platform by taking a photo of the query object. The Pailitao application uses a deep CNN model with branches for joint detection and feature learning to discover the detection mask and exact discriminative feature without background disturbance. GoogLeNet V1 is employed as the base model for category prediction and feature learning. [18] [19]

Pinterest

Pinterest acquired startup company VisualGraph in 2014 and introduced visual search on its platform. [20] In 2015, Pinterest published a paper at the ACM Conference on Knowledge Discovery and Data Mining and disclosed the architecture of the system. The pipeline uses Apache Hadoop, the open-source Caffe convolutional neural network framework, Cascading for batch processing, PinLater for messaging, and Apache HBase for storage. Image characteristics, including local features, deep features, salient color signatures and salient pixels, are extracted from user uploads. The system runs on Amazon EC2 and requires only a cluster of 5 GPU instances to handle daily image uploads onto Pinterest. By using reverse image search, Pinterest is able to extract visual features from fashion objects (e.g. shoes, dress, glasses, bag, watch, pants, shorts, bikini, earrings) and offer product recommendations that look similar. [21] [22]

JD.com

JD.com disclosed the design and implementation of its real time visual search system at the Middleware '18 conference. The peer reviewed paper focuses on the algorithms used by JD's distributed hierarchical image feature extraction, indexing and retrieval system, which has 300 million daily active users. The system was able to sustain 80 million updates to its database per hour when it was deployed in production in 2018. [23]

Bing

Microsoft Bing published the architecture of its reverse image search system at the KDD'18 conference. The paper states that a variety of features from a query image submitted by a user are used to describe its content, including deep neural network encoders, category recognition features, face recognition features, color features and duplicate detection features. [24]

Amazon

Amazon.com disclosed the architecture of a visual search engine for fashion and home products named Amazon Shop the Look in a paper published at the KDD'22 conference. The paper describes the lessons learned by Amazon when the system was deployed in a production environment, including image synthesis-based data augmentation for retrieval performance optimization and accuracy improvement. [25]

Research systems

Microsoft Research Asia's Beijing Lab published a paper in the Proceedings of the IEEE on the Arista-SS (Similar Search) and the Arista-DS (Duplicate Search) systems. Arista-DS only performs duplicate search algorithms such as principal component analysis on global image features to lower computational and memory costs. Arista-DS is able to perform duplicate search on 2 billion images with 10 servers but with the trade-off of not detecting near duplicates. [26]

Open-source implementations

In 2007, the Puzzle library was released under the ISC license. Puzzle is designed to find visually similar images, even after the images have been resized, re-compressed, recolored and/or slightly modified. [27]

The image-match open-source project was released in 2016. The project, licensed under the Apache License, implements a reverse image search engine written in Python. [28]

Both the Puzzle library and the image-match projects use algorithms published at an IEEE ICIP conference. [29]
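Robustness to resizing and recoloring can be obtained by describing an image through the relative brightness of neighboring regions rather than through absolute pixel values. The sketch below is a heavily simplified illustration of that general idea; the grid size, the two-neighbor comparison, the three-level quantization and the tolerance value are assumptions chosen for brevity and do not reproduce the published algorithm or either library's implementation.

```python
import math

def grid_means(gray, grid=4):
    """Average brightness of each cell in a grid x grid partition."""
    cell_h, cell_w = len(gray) // grid, len(gray[0]) // grid
    means = []
    for gy in range(grid):
        row = []
        for gx in range(grid):
            block = [gray[y][x]
                     for y in range(gy * cell_h, (gy + 1) * cell_h)
                     for x in range(gx * cell_w, (gx + 1) * cell_w)]
            row.append(sum(block) / len(block))
        means.append(row)
    return means

def signature(gray, grid=4, tolerance=2.0):
    """Quantized brightness differences to the right and lower neighbors."""
    means = grid_means(gray, grid)
    sig = []
    for y in range(grid):
        for x in range(grid):
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < grid and nx < grid:
                    diff = means[ny][nx] - means[y][x]
                    # -1: darker, 0: about the same, +1: brighter
                    sig.append(0 if abs(diff) <= tolerance else (1 if diff > 0 else -1))
    return sig

def normalized_distance(a, b):
    """Distance in [0, 1]; 0.0 means identical signatures."""
    diff = math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    norm = math.sqrt(sum(p * p for p in a)) + math.sqrt(sum(q * q for q in b))
    return diff / norm if norm else 0.0

def upscale(gray, factor=2):
    """Resize by pixel replication."""
    return [[v for v in row for _ in range(factor)]
            for row in gray for _ in range(factor)]

horizontal = [[x * 30 for x in range(8)] for _ in range(8)]
vertical = [[y * 30 for _ in range(8)] for y in range(8)]
recolored = [[v + 10 for v in row] for row in horizontal]

print(normalized_distance(signature(horizontal), signature(upscale(horizontal))))
print(normalized_distance(signature(horizontal), signature(recolored)))
print(normalized_distance(signature(horizontal), signature(vertical)))
```

Because only the sign of each neighbor difference is kept, a uniform brightness shift or a change of resolution leaves the signature unchanged, while a structurally different image produces a large distance.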

In 2019, a book published by O'Reilly documents how a simple reverse image search system can be built in a few hours. The book covers image feature extraction and similarity search, together with more advanced topics including scalability using GPUs and search accuracy improvement tuning. [30] The code for the system was made available freely on GitHub. [31]
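The similarity-search half of such a system can be sketched in a few lines. This is not the book's code: it assumes that feature extraction has already produced one embedding vector per image (the three-dimensional vectors and file names below are made up), and it uses a brute-force cosine-similarity scan where a production system would use an approximate nearest-neighbor index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest_images(query_vector, index, k=2):
    """Return the names of the k indexed images most similar to the query."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine_similarity(query_vector, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical embeddings; a real system would obtain them from a neural network.
index = {
    "cat_1.jpg": [0.9, 0.1, 0.0],
    "cat_2.jpg": [0.8, 0.2, 0.1],
    "car_1.jpg": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]
print(nearest_images(query, index))
```

The brute-force scan costs one comparison per indexed image, which is why scalability and GPU acceleration become separate topics once the index grows.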

The processing demands for performing reverse video search would be astoundingly high. There is no simple tool to just upload the video to find the matching results. At present there is no technology that can successfully perform a reverse video search. [32] [33]

Production reverse image search systems


References

  1. "How to search by image" . Retrieved 2 November 2013.
  2. "Video searching with Frompo". Frompo.com. Retrieved 2 November 2013.
  3. "FAQ - TinEye - Why use TinEye?". TinEye .
  4. Bundling Features for Large Scale Partial-Duplicate Web Image Search. Microsoft.
  5. A New Web Image Searching Engine by Using SIFT Algorithm computer.org
  6. Funkhouser, Thomas; Min, Patrick; Kazhdan, Michael; Chen, Joyce; Halderman, Alex; Dobkin, David; Jacobs, David (2002). "A Search Engine for 3D Models" (PDF). ACM Transactions on Graphics. 22 (1): 83–105. doi:10.1145/588272.588279. S2CID   1178691.
  7. Raj, Abhishek, ed. (February 27, 2022). "How Does Yandex Reverse Image Search Work? Detailed Guide". www.buddinggeek.com. Budding Geek. Retrieved May 5, 2022.
  8. Li, Abner (10 August 2022). "Google Images on the web now uses Google Lens". 9to5Google . Retrieved 2 December 2022.
  9. "FAQ - TinEye - How does TinEye work?". TinEye.
  10. "FAQ - TinEye - Can TinEye find similar images??". TinEye.
  11. "Find stolen images - Pixsy". Pixsy. Retrieved 2017-10-20.
  12. "Pixsy.com review: Find & Fight Image Theft - Online Marketing for Artists -". Online Marketing for Artists. 2015-07-02. Retrieved 2017-10-20.
  13. "Pixsy: Find and Get Paid for Image Theft". artlawjournal.com. 2014-10-18. Retrieved 2017-10-20.
  14. "Resolve image theft - Pixsy". Pixsy. Retrieved 2017-10-20.
  15. "Flickr Teams Up with Pixsy to Get You Paid When Photos Are Stolen". petapixel.com. 9 April 2019. Retrieved 2019-12-12.
  16. Yang, Fan; Kale, Ajinkya; Bubnov, Yury; Stein, Leon; Wang, Qiaosong; Kiapour, Hadi; Piramuthu, Robinson (2017). "Visual Search at eBay". Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 2101–2110. arXiv:1706.03154. doi:10.1145/3097983.3098162. ISBN 9781450348874. S2CID 22367645.
  17. Visual Fashion-Product Search at SK Planet
  18. Zhang, Yanhao; Pan, Pan; Zheng, Yun; Zhao, Kang; Zhang, Yingya; Ren, Xiaofeng; Jin, Rong (2018). "Visual Search at Alibaba". Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. pp. 993–1001. arXiv:2102.04674. doi:10.1145/3219819.3219820. ISBN 9781450355520. S2CID 50776405.
  19. "Shopping With Your Camera: Visual Image Search Meets E-Commerce at Alibaba". Alibaba Tech. September 2020.
  20. Josh Constine (6 January 2014). "Pinterest Acquires Image Recognition And Visual Search Startup VisualGraph". TechCrunch. AOL.
  21. Jing, Yushi; Liu, David; Kislyuk, Dmitry; Zhai, Andrew; Xu, Jiajing; Donahue, Jeff; Tavel, Sarah (2015). "Visual Search at Pinterest". Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 1889–1898. doi:10.1145/2783258.2788621. ISBN 9781450336642. S2CID 1153609.
  22. "Building a scalable machine vision pipeline". Pinterest Engineering. Archived from the original on 2015-09-06.
  23. Li, Jie; Liu, Haifeng; Gui, Chuanghua; Chen, Jianyu; Ni, Zhenyuan; Wang, Ning; Chen, Yuan (2018). "The Design and Implementation of a Real Time Visual Search System on JD E-commerce Platform". Proceedings of the 19th International Middleware Conference Industry. pp. 9–16. arXiv:1908.07389. doi:10.1145/3284028.3284030. ISBN 9781450360166. S2CID 53713854.
  24. Hu, Houdong; Wang, Yan; Yang, Linjun; Komlev, Pavel; Huang, Li; Chen, Xi (Stephen); Huang, Jiapei; Wu, Ye; Merchant, Meenaz; Sacheti, Arun (2018). "Web-Scale Responsive Visual Search at Bing". Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. pp. 359–367. doi:10.1145/3219819.3219843. ISBN 9781450355520. S2CID 3427399.
  25. Amazon Shop the Look: A Visual Search System for Fashion and Home
  26. Duplicate-Search-Based Image Annotation Using Web-Scale Data Microsoft.
  27. The Puzzle library
  28. ProvenanceLabs / image-match
  29. An image signature for any kind of image
  30. Koul, Anirudh (October 2019). "Chapter 4. Building a Reverse Image Search Engine: Understanding Embeddings". Practical Deep Learning for Cloud, Mobile, and Edge. O'Reilly Media. ISBN   9781492034865.
  31. Practical-Deep-Learning-Book source repository
  32. How to Use Reverse Video Search (& Why It's Useful). September 2022.
  33. "How to Find Source of a Video with Reverse Image Search?". Alibaba DigitBin. October 2020.
  34. How to Do a Reverse Image Search From Your Phone