Multimedia search

Last updated June 22, 2024

Multimedia search enables information search using queries in multiple data types, including text and other multimedia formats. Multimedia search can be implemented through multimodal search interfaces, i.e., interfaces that allow users to submit search queries not only as textual requests but also through other media. Two methodologies can be distinguished in multimedia search:

Metadata search

Search is performed on the metadata layers that describe the content of a multimedia file. Metadata search is easier and faster, and still effective, because it operates on text rather than on complex material such as audio, video or images.

This method involves three processes:

Summarization of media content (feature extraction). The result of feature extraction is a description.
Filtering of media descriptions (for example, elimination of redundancy).
Categorization of media descriptions into classes.
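
The three processes above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any particular system: it assumes the media item already comes with a free-text caption, and the stop-word list, the class names and the keyword sets are hypothetical.

```python
# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "at", "in", "of", "on", "the"}

# Hypothetical classes, each defined by a few indicative keywords.
CLASS_KEYWORDS = {
    "nature": {"beach", "forest", "mountain", "sunset"},
    "music": {"band", "concert", "guitar", "stage"},
}


def summarize(caption):
    """Summarization: reduce a free-text caption to a list of keywords."""
    words = (w.strip(".,!?").lower() for w in caption.split())
    return [w for w in words if w and w not in STOP_WORDS]


def filter_description(keywords):
    """Filtering: drop redundant keywords, keeping first-seen order."""
    return list(dict.fromkeys(keywords))


def categorize(description):
    """Categorization: pick the class sharing the most keywords, or None."""
    found = set(description)
    scores = {name: len(kws & found) for name, kws in CLASS_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def describe(caption):
    """Run the three processes and return (description, class)."""
    description = filter_description(summarize(caption))
    return description, categorize(description)
```

Calling `describe("Sunset at the beach, sunset colours")` yields a description with the repeated keyword removed, together with the class whose keyword set overlaps it most. A real system would extract features from the media itself rather than from a caption.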

Query by example

In query by example, the element used to search is a piece of multimedia content (image, audio or video). In other words, the query is itself a media item. Audiovisual indexing is often used, and the criteria for creating the metadata must be chosen in advance. The search process can be divided into three parts:

Generate descriptors for the media item used as the query and for the media in the database.
Compare the descriptors of the query with those of the media in the database.
List the media sorted by degree of match, best match first.
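
A minimal sketch of these three parts follows. It assumes, purely for illustration, that each media item is a greyscale image given as a flat list of 0-255 pixel values, that the descriptor is a normalised four-bin histogram, and that similarity is measured by histogram intersection. The toy `DATABASE` and all function names are hypothetical; real systems use far richer descriptors.

```python
def descriptor(pixels, bins=4):
    """Part 1: normalised grey-level histogram of 0-255 pixel values."""
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    total = sum(hist) or 1  # avoid division by zero for empty input
    return [count / total for count in hist]


def similarity(a, b):
    """Part 2: histogram intersection (1.0 identical, 0.0 disjoint)."""
    return sum(min(x, y) for x, y in zip(a, b))


def query_by_example(query_pixels, database):
    """Part 3: return (name, score) pairs, best match first."""
    q = descriptor(query_pixels)
    scored = [(name, similarity(q, descriptor(px))) for name, px in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy database of made-up "images", for illustration only.
DATABASE = {
    "dark": [10, 20, 30, 40],
    "bright": [200, 220, 240, 250],
    "mixed": [10, 20, 240, 250],
}

results = query_by_example([5, 15, 25, 35], DATABASE)
```

With a dark query image, the dark database entry ranks first and the bright one last. In practice the database descriptors would be computed once and stored, so that only the query descriptor is generated at search time.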

Multimedia search engine

Multimedia search engines fall into two broad families, depending on the type of content:

Visual search engine

Within this family, two topics can be distinguished: image search and video search.

Image search: Although simple metadata search is usually used, indexing methods are increasingly used to make the results of user queries more accurate through query by example. QR codes are one example.
Video search: Videos can be searched by simple metadata or by complex metadata generated by indexing. The audio contained in the videos is usually scanned by audio search engines.

Audio search engine

There are different methods of audio searching:

Voice search engine: Allows the user to search using speech instead of text. It uses speech recognition algorithms. An example of this technology is Google Voice Search.^[1]
Music search engine: Most applications that search for music work on simple metadata (artist, track name, album…), but there are also music recognition programs, for example Shazam ^[2] or SoundHound.
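
The simple-metadata approach to music search can be sketched as follows. This is a hypothetical illustration only: the `Track` record, the `search_tracks` function and the catalogue entries are invented for the example, and the sketch does not cover music recognition of the kind performed by Shazam or SoundHound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    artist: str
    title: str
    album: str


def search_tracks(tracks, query):
    """Return tracks whose artist, title or album contains every query term."""
    terms = query.lower().split()
    results = []
    for track in tracks:
        haystack = f"{track.artist} {track.title} {track.album}".lower()
        if all(term in haystack for term in terms):
            results.append(track)
    return results


# Made-up catalogue entries, for illustration only.
CATALOGUE = [
    Track("Example Artist", "First Song", "Debut Album"),
    Track("Example Artist", "Second Song", "Debut Album"),
    Track("Another Band", "First Light", "Live Recordings"),
]
```

Here `search_tracks(CATALOGUE, "example first")` matches only the first entry, because both terms must appear somewhere in the track's metadata. Matching is by case-insensitive substring, and an empty query returns every track.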

Related Research Articles

Information retrieval (IR) in computing and information science is the task of identifying and retrieving information system resources that are relevant to an information need. The information need can be specified in the form of a search query. In the case of document retrieval, queries can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds.

In computing, a search engine is an information retrieval software system designed to help find information stored on one or more computer systems. Search engines discover, crawl, transform, and store information for retrieval and presentation in response to user queries. The search results are usually presented in a list and are commonly called hits. The most widely used type of search engine is a web search engine, which searches for information on the World Wide Web.

MPEG-7 is a multimedia content description standard. It was standardized in ISO/IEC 15938. This description will be associated with the content itself, to allow fast and efficient searching for material that is of interest to the user. MPEG-7 is formally called Multimedia Content Description Interface. Thus, it is not a standard which deals with the actual encoding of moving pictures and audio, like MPEG-1, MPEG-2 and MPEG-4. It uses XML to store metadata, and can be attached to timecode in order to tag particular events, or synchronise lyrics to a song, for example.

Music information retrieval (MIR) is the interdisciplinary science of retrieving information from music. Those involved in MIR may have a background in academic musicology, psychoacoustics, psychology, signal processing, informatics, machine learning, optical music recognition, computational intelligence or some combination of these.

An image retrieval system is a computer system used for browsing, searching and retrieving images from a large database of digital images. Most traditional and common methods of image retrieval utilize some method of adding metadata such as captioning, keywords, title or descriptions to the images so that retrieval can be performed over the annotation words. Manual image annotation is time-consuming, laborious and expensive; to address this, there has been a large amount of research done on automatic image annotation. Additionally, the increase in social web applications and the semantic web have inspired the development of several web-based image annotation tools.

Content-based image retrieval, also known as query by image content and content-based visual information retrieval (CBVIR), is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. Content-based image retrieval is opposed to traditional concept-based approaches.

Singingfish

Singingfish was an audio/video search engine that powered audio and video search for Windows Media Player, WindowsMedia.com, RealOne/RealPlayer, Real Guide, AOL Search, Dogpile, Metacrawler and Singingfish.com, among others. Launched in 2000, it was one of the earliest and longest-lived search engines dedicated to multimedia content. Acquired by AOL in 2003, it was slowly folded into the AOL search offerings; all web hits from RMC TV to Singingfish were redirected to AOL Video, and as of February 2007 Singingfish had ceased to exist as a separate service.

Microformats (μF) are a set of defined HTML classes created to serve as consistent and descriptive metadata about an element, designating it as representing a certain type of data. They allow software to process the information reliably by having set classes refer to a specific type of data rather than being arbitrary.

A video search engine is a web-based search engine which crawls the web for video content. Some video search engines parse externally hosted content while others allow content to be uploaded and hosted on their own servers. Some engines also allow users to search by video format type and by length of the clip. The video search results are usually accompanied by a thumbnail view of the video.

Digital Item is the basic unit of transaction in the MPEG-21 framework. It is a structured digital object, including a standard representation, identification and metadata.

In computer vision, visual descriptors or image descriptors are descriptions of the visual features of the contents in images, videos, or algorithms or applications that produce such descriptions. They describe elementary characteristics such as the shape, the color, the texture or the motion, among others.

AXMEDIS is a set of European Union digital content standards, initially created as a research project running from 2004 to 2008 partially supported by the European Commission under the Information Society Technologies programme of the Sixth Framework Programme (FP6). It stands for "Automating Production of Cross Media Content for Multi-channel Distribution". Now it is distributed as a framework, and is still being maintained and improved. A large part of the framework is under open source licensing. The AXMEDIS framework includes a set of tools, models, test cases, documents, etc. supporting the production and distribution of cross media content.

An audio search engine is a web-based search engine which crawls the web for audio content. The information can consist of web pages, images, audio files, or another type of document. Various techniques exist for research on these engines.

aceMedia is a multimedia content management software package that was funded by the European Union, and developed from 2004 to 2007. The final review was presented in February 2008, and AceMedia is now available in two editions, personal and commercial.

A concept search is an automated information retrieval method that is used to search electronically stored unstructured text for information that is conceptually similar to the information provided in a search query. In other words, the ideas expressed in the information retrieved in response to a concept search query are relevant to the ideas contained in the text of the query.

Metadata: data about data

Metadata is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata.

Reverse image search: content-based image retrieval

Reverse image search is a content-based image retrieval (CBIR) query technique that involves providing the CBIR system with a sample image that it will then base its search upon; in terms of information retrieval, the sample image is very useful. In particular, reverse image search is characterized by a lack of search terms. This effectively removes the need for a user to guess at keywords or terms that may or may not return a correct result. Reverse image search also allows users to discover content that is related to a specific sample image or the popularity of an image, and to discover manipulated versions and derivative works.

Multimedia information retrieval is a research discipline of computer science that aims at extracting semantic information from multimedia data sources. Data sources include directly perceivable media such as audio, image and video, indirectly perceivable sources such as text, semantic descriptions, biosignals as well as not perceivable sources such as bioinformation, stock prices, etc. The methodology of MMIR can be organized in three groups:

Methods for the summarization of media content. The result of feature extraction is a description.
Methods for the filtering of media descriptions
Methods for the categorization of media descriptions into classes.

Multimodal search is a type of search that uses different methods to get relevant results. It can combine any kind of search: search by keyword, search by concept, search by example, etc.

A 3D Content Retrieval system is a computer system for browsing, searching and retrieving three dimensional digital contents from a large database of digital images. The most original way of doing 3D content retrieval uses methods to add description text to 3D content files such as the content file name, link text, and the web page title so that related 3D content can be found through text retrieval. Because of the inefficiency of manually annotating 3D files, researchers have investigated ways to automate the annotation process and provide a unified standard to create text descriptions for 3D contents. Moreover, the increase in 3D content has demanded and inspired more advanced ways to retrieve 3D information. Thus, shape matching methods for 3D content retrieval have become popular. Shape matching retrieval is based on techniques that compare and contrast similarities between 3D models.

References

[1] "Google Voice Search can now handle multiple languages with ease". Engadget. Retrieved 2024-04-17.
[2] "Shazam Launches Resonate TV Sales Platform". Billboard. Retrieved 2024-04-17.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
