Video search engine

Last updated

A video search engine is a web-based search engine which crawls the web for video content. Some video search engines parse externally hosted content while others allow content to be uploaded and hosted on their own servers. Some engines also allow users to search by video format type and by length of the clip. The video search results are usually accompanied by a thumbnail view of the video.

Contents

Video search engines are computer programs designed to find videos stored on digital devices, either through Internet servers or in storage units from the same computer. These searches can be made through audiovisual indexing, which can extract information from audiovisual material and record it as metadata, which will be tracked by search engines.

Utility

The main use of these search engines is the increasing creation of audiovisual content and the need to manage it properly. The digitization of audiovisual archives and the establishment of the Internet, has led to large quantities of video files stored in big databases, whose recovery can be very difficult because of the huge volumes of data and the existence of a semantic gap.

Search criterion

The search criterion used by each search engine depends on its nature and purpose of the searches.

Metadata

Metadata is information about facts. It could be information about who is the author of the video, creation date, duration, and all the information that could be extracted and included in the same files. Internet is often used in a language called XML to encode metadata, which works very well through the web and is readable by people. Thus, through this information contained in these files is the easiest way to find data of interest to us.

In the videos there are two types of metadata, that we can integrate in the video code itself and external metadata from the page where the video is. In both cases we optimize them to make them ideal when indexed.

Internal metadata

All video formats incorporate their own metadata. The title, description, coding quality or transcription of the content are possible. To review these data exist programs like FLV MetaData Injector, Sorenson Squeeze or Castfire. Each one has some utilities and special specifications.

Converting from one format to another can lose much of this data, so check that the new format information is correct. It is therefore advisable to have the video in multiple formats, so all search robots will be able to find and index it.

External metadata

In most cases the same mechanisms must be applied as in the positioning of an image or text content.

Title and description

They are the most important factors when positioning a video, because they contain most of the necessary information. The titles have to be clearly descriptive and should remove every word or phrase that is not useful.

Filename

It should be descriptive, including keywords that describe the video with no need to see their title or description. Ideally, separate the words by dashes "-".

Tags

On the page where the video is, it should be a list of keywords linked to the microformat "rel-tag". These words will be used by search engines as a basis for organizing information.

Transcription and subtitles

Although not completely standard, there are two formats that store information in a temporal component that is specified, one for subtitles and another for transcripts, which can also be used for subtitles. The formats are SRT or SUB for subtitles and TTXT for transcripts.

Speech recognition

Speech recognition consists of a transcript of the speech of the audio track of the videos, creating a text file. In this way and with the help of a phrase extractor can easily search if the video content is of interest. Some search engines apart from using speech recognition to search for videos, also use it to find the specific point of a multimedia file in which a specific word or phrase is located and so go directly to this point. Gaudi (Google Audio Indexing), a project developed by Google Labs, uses voice recognition technology to locate the exact moment that one or more words have been spoken within an audio, allowing the user to go directly to exact moment that the words were spoken. If the search query matches some videos from YouTube, the positions are indicated by yellow markers, and must pass the mouse over to read the transcribed text.

Speaker recognition

In addition to transcription, analysis can detect different speakers and sometime attribute the speech to an identified name for the speaker.

Text recognition

The text recognition can be very useful to recognize characters in the videos through "chyrons". As with speech recognizers, there are search engines that allow (through character recognition) to play a video from a particular point.

TalkMiner, an example of search of specific fragments from videos by text recognition, analyzes each video once per second looking for identifier signs of a slide, such as its shape and static nature, captures the image of the slide and uses Optical Character Recognition (OCR) to detect the words on the slides. Then, these words are indexed in the search engine of TalkMiner, which currently offers to users more than 20,000 videos from institutions such as Stanford University, the University of California at Berkeley, and TED.

Frame analysis

Through the visual descriptors we can analyze the frames of a video and extract information that can be scored as metadata. Descriptions are generated automatically and can describe different aspects of the frames, such as color, texture, shape, motion, and the situation.

Chaptering

The video analysis can lead to automatic chaptering, using technics such as change of camera angle, identification of audio jingles. By knowing the typical structure of a video document, it is possible to identify starting and ending credits, content parts and beginning and ending of advertising breaks.

Ranking criterion

The usefulness of a search engine depends on the relevance of the result set returned. While there may be millions of videos that include a particular word or phrase, some videos may be more relevant, popular or have more authority than others. This arrangement has a lot to do with search engine optimization.

Most search engines use different methods to classify the results and provide the best video in the first results. However, most programs allow sorting the results by several criteria.

Order by relevance

This criterion is more ambiguous and less objective, but sometimes it is the closest to what we want; depends entirely on the searcher and the algorithm that the owner has chosen. That's why it has always been discussed and now that search results are so ingrained into our society it has been discussed even more. This type of management often depends on the number of times that the searched word comes out, the number of viewings of this, the number of pages that link to this content and ratings given by users who have seen it. [1]

Order by date of upload

This is a criterion based totally on timeline. Results can be sorted according to their seniority in the repository.

Order by number of views

It can give us an idea of the popularity of each video.

Order by length

This is the length of the video and can give a taste of which video it is.

Order by user rating

It is common practice in repositories let the users rate the videos, so that a content of quality and relevance will have a high rank on the list of results gaining visibility. This practice is closely related to virtual communities.

Interfaces

We can distinguish two basic types of interfaces, some are web pages hosted on servers which are accessed by Internet and searched through the network, and the others are computer programs that search within a private network.

Internet

Within Internet interfaces we can find repositories that host video files which incorporate a search engine that searches only their own databases, and video searchers without repository that search in sources of external software.

Repositories with video searcher

Provides accommodation in video files stored on its servers and usually has an integrated search engine that searches through videos uploaded by its users. One of the first web repositories, or at least the most famous are the portals Vimeo, Dailymotion and YouTube.

Their searches are often based on reading the metadata tags, titles and descriptions that users assign to their videos. The disposal and order criterion of the results of these searches are usually selectable between the file upload date, the number of viewings or what they call the relevance. Still, sorting criteria are nowadays the main weapon of these websites, because the positioning of videos is important in terms of promotion.[ citation needed ]

Video searchers repositories

They are websites specialized in searching videos across the network or certain pre-selected repositories. They work by web spiders that inspect the network in an automated way to create copies of the visited websites, which will then be indexed by search engines, so they can provide faster searches.

Private network

Functioning scheme Esquema angles.jpg
Functioning scheme

Sometimes a search engine only searches in audiovisual files stored within a computer or, as it happens in televisions, on a private server where users access through a local area network. These searchers are usually software or rich Internet applications with a very specific search options for maximum speed and efficiency when presenting the results. They are typically used for large databases and are therefore highly focused to satisfy the needs of television companies. An example of this type of software would be the Digition Suite, which apart from being a benchmark in this kind of interfaces is very close to us as for the storage and retrieval files system from the Corporació Catalana de Mitjans Audiovisuals. [2]

This particular suite and perhaps in its strongest point is that it integrates the entire process of creating, indexing, storing, searching, editing, and a recovery. Once we have a digitized audiovisual content is indexed with different techniques of different level depending on the importance of content and it's stored. The user, when he wants to retrieve a particular file, has to fill a search fields such as program title, issue date, characters who act or the name of the producer, and the robot starts the search. Once the results appear and they arranged according to preferences, the user can play the low quality videos to work as quickly as possible. When he finds the desired content, it is downloaded with good definition, it's edited and reproduced. [3]

Design and algorithms

Video search has evolved slowly through several basic search formats which exist today and all use keywords. The keywords for each search can be found in the title of the media, any text attached to the media and content linked web pages, also defined by authors and users of video hosted resources.

Some video search is performed using human powered search, others create technological systems that work automatically to detect what is in the video and match the searchers needs. Many efforts to improve video search including both human powered search as well as writing algorithm that recognize what's inside the video have meant complete redevelopment of search efforts.

It is generally acknowledged that speech to text is possible, though recently Thomas Wilde, the new CEO of Everyzing, acknowledged that Everyzing works 70% of the time when there is music, ambient noise or more than one person speaking. If newscast style speaking (one person, speaking clearly, no ambient noise) is available, that can rise to 93%. (From the Web Video Summit, San Jose, CA, June 27, 2007).

Around 40 phonemes exist in every language with about 400 in all spoken languages. Rather than applying a text search algorithm after speech-to-text processing is completed, some engines use a phonetic search algorithm to find results within the spoken word. Others work by literally listening to the entire podcast and creating a text transcription using a sophisticated speech-to-text process. Once the text file is created, the file can be searched for any number of search words and phrases.

It is generally acknowledged that visual search into video does not work well and that no company is using it publicly. Researchers at UC San Diego and Carnegie Mellon University have been working on the visual search problem for more than 15 years, and admitted at a "Future of Search" conference at UC Berkeley in spring 2007 that it was years away from being viable even in simple search.

Video search engines

Search that is not affected by the hosting of video, where results are agnostic no matter where the video is located:

Search results are modified, or suspect, due to the large hosted video being given preferential treatment in search results:

See also

Related Research Articles

Meta elements are tags used in HTML and XHTML documents to provide structured metadata about a Web page. They are part of a web page's head section. Multiple Meta elements with different attributes can be used on the same page. Meta elements can be used to specify page description, keywords and any other metadata not provided through the other head elements and attributes.

An image retrieval system is a computer system used for browsing, searching and retrieving images from a large database of digital images. Most traditional and common methods of image retrieval utilize some method of adding metadata such as captioning, keywords, title or descriptions to the images so that retrieval can be performed over the annotation words. Manual image annotation is time-consuming, laborious and expensive; to address this, there has been a large amount of research done on automatic image annotation. Additionally, the increase in social web applications and the semantic web have inspired the development of several web-based image annotation tools.

<span class="mw-page-title-main">Spotlight (Apple)</span> macOS search feature

Spotlight is a system-wide desktop search feature of Apple's macOS and iOS operating systems. Spotlight is a selection-based search system, which creates an index of all items and files on the system. It is designed to allow the user to quickly locate a wide variety of items on the computer, including documents, pictures, music, applications, and System Settings. In addition, specific words in documents and in web pages in a web browser's history or bookmarks can be searched. It also allows the user to narrow down searches with creation dates, modification dates, sizes, types and other attributes. Spotlight also offers quick access to definitions from the built-in New Oxford American Dictionary and to calculator functionality. There are also command-line tools to perform functions such as Spotlight searches.

<span class="mw-page-title-main">Desktop search</span>

Desktop search tools search within a user's own computer files as opposed to searching the Internet. These tools are designed to find information on the user's PC, including web browser history, e-mail archives, text documents, sound files, images, and video. A variety of desktop search programs are now available; see this list for examples. Most desktop search programs are standalone applications. Desktop search products are software alternatives to the search software included in the operating system, helping users sift through desktop files, emails, attachments, and more.

In text retrieval, full-text search refers to techniques for searching a single computer-stored document or a collection in a full-text database. Full-text search is distinguished from searches based on metadata or on parts of the original texts represented in databases.

Enterprise content management (ECM) extends the concept of content management by adding a timeline for each content item and, possibly, enforcing processes for its creation, approval, and distribution. Systems using ECM generally provide a secure repository for managed items, analog or digital. They also include one methods for importing content to bring manage new items, and several presentation methods to make items available for use. Although ECM content may be protected by digital rights management (DRM), it is not required. ECM is distinguished from general content management by its cognizance of the processes and procedures of the enterprise for which it is created.

An IFilter is a plugin that allows Microsoft's search engines to index various file formats so that they become searchable. Without an appropriate IFilter, contents of a file cannot be parsed and indexed by the search engine.

<span class="mw-page-title-main">Search engine</span> Software system that is designed to search for information on the World Wide Web

A search engine is a software system that finds web pages that match a web search. They search the World Wide Web in a systematic way for particular information specified in a textual web search query. The search results are generally presented in a line of results, often referred to as search engine results pages (SERPs). The information may be a mix of hyperlinks to web pages, images, videos, infographics, articles, and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories and social bookmarking sites, which are maintained by human editors, search engines also maintain real-time information by running an algorithm on a web crawler. Any internet-based content that cannot be indexed and searched by a web search engine falls under the category of deep web. Modern search engines are based on techniques and methods developed in the field of Information retrieval.

Multimedia search enables information search using queries in multiple data types including text and other multimedia formats. Multimedia search can be implemented through multimodal search interfaces, i.e., interfaces that allow to submit search queries not only as textual requests, but also through other media. We can distinguish two methodologies in multimedia search:

Search engine indexing is the collecting, parsing, and storing of data to facilitate fast and accurate information retrieval. Index design incorporates interdisciplinary concepts from linguistics, cognitive psychology, mathematics, informatics, and computer science. An alternate name for the process, in the context of search engines designed to find web pages on the Internet, is web indexing.

Image meta search is a type of search engine specialised on finding pictures, images, animations etc. Like the text search, image search is an information retrieval system designed to help to find information on the Internet and it allows the user to look for images etc. using keywords or search phrases and to receive a set of thumbnail images, sorted by relevancy.

<span class="mw-page-title-main">BASE (search engine)</span> Academic search engine

BASE is a multi-disciplinary search engine to scholarly internet resources, created by Bielefeld University Library in Bielefeld, Germany. It is based on free and open-source software such as Apache Solr and VuFind. It harvests OAI metadata from institutional repositories and other academic digital libraries that implement the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), and then normalizes and indexes the data for searching. In addition to OAI metadata, the library indexes selected web sites and local data collections, all of which can be searched via a single search interface.

Enterprise search is the practice of making content from multiple enterprise-type sources, such as databases and intranets, searchable to a defined audience.

Audio mining is a technique by which the content of an audio signal can be automatically analyzed and searched. It is most commonly used in the field of automatic speech recognition, where the analysis tries to identify any speech within the audio. The term ‘audio mining’ is sometimes used interchangeably with audio indexing, phonetic searching, phonetic indexing, speech indexing, audio analytics, speech analytics, word spotting, and information retrieval. Audio indexing, however, is mostly used to describe the pre-process of audio mining, in which the audio file is broken down into a searchable index of words.

An audio search engine is a web-based search engine which crawls the web for audio content. The information can consist of web pages, images, audio files, or another type of document. Various techniques exist for research on these engines.

<span class="mw-page-title-main">Windows Search</span> Desktop search platform by Microsoft

Windows Search is a content index desktop search platform by Microsoft introduced in Windows Vista as a replacement for both the previous Indexing Service of Windows 2000 and the optional MSN Desktop Search for Windows XP and Windows Server 2003, designed to facilitate local and remote queries for files and non-file items in compatible applications including Windows Explorer. It was developed after the postponement of WinFS and introduced to Windows constituents originally touted as benefits of that platform.

<span class="mw-page-title-main">Metadata</span> Data about data

Metadata is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including:

Document Capture Software refers to applications that provide the ability and feature set to automate the process of scanning paper documents or importing electronic documents, often for the purposes of feeding advanced document classification and data collection processes. Most scanning hardware, both scanners and copiers, provides the basic ability to scan to any number of image file formats, including: PDF, TIFF, JPG, BMP, etc. This basic functionality is augmented by document capture software, which can add efficiency and standardization to the process.

<span class="mw-page-title-main">Reverse image search</span> Content-based image retrieval

Reverse image search is a content-based image retrieval (CBIR) query technique that involves providing the CBIR system with a sample image that it will then base its search upon; in terms of information retrieval, the sample image is very useful. In particular, reverse image search is characterized by a lack of search terms. This effectively removes the need for a user to guess at keywords or terms that may or may not return a correct result. Reverse image search also allows users to discover content that is related to a specific sample image or the popularity of an image, and to discover manipulated versions and derivative works.

Discoverability is the degree to which something, especially a piece of content or information, can be found in a search of a file, database, or other information system. Discoverability is a concern in library and information science, many aspects of digital media, software and web development, and in marketing, since products and services cannot be used if people cannot find it or do not understand what it can be used for.

References

  1. (in English) SEO by Google central webmaster
  2. (in Catalan) Digitalize or die (Alícia Conesa) Archived July 8, 2011, at the Wayback Machine
  3. (in Catalan) Digition Suite from Activa Multimedia
 Process of search engines How Stuff Works (in English)