Desktop search

OSL Desktop Search engines software Aduna AutoFocus 5

Desktop search tools search within a user's own computer files rather than the Internet. They are designed to find information on the user's PC, including web browser history, e-mail archives, text documents, sound files, images, and video. A variety of desktop search programs are available, most of them standalone applications. Desktop search products are alternatives to the search software included in the operating system, helping users sift through desktop files, e-mails, attachments, and more. [1] [2] [3]

Desktop search emerged as a concern for large firms for two main reasons: untapped productivity and security. According to analyst firm Gartner, up to 80% of some companies' data is locked up inside unstructured data — the information stored on a user's PC, the directories (folders) and files they've created on a network, documents stored in repositories such as corporate intranets and a multitude of other locations. [4] Moreover, many companies have structured or unstructured information stored in older file formats to which they don't have ready access.

The sector attracted considerable attention from late 2004 to early 2005 because of the struggle between Microsoft and Google. [5] [6] [7] According to market analysts, both companies were attempting to leverage their monopolies (of web browsers and search engines, respectively) to strengthen their dominance. After Google complained that users of Windows Vista could not choose a competitor's desktop search program over the built-in one, the US Justice Department and Microsoft agreed that Windows Vista Service Pack 1 would let users choose between the built-in and other desktop search programs and select which one is the default. [8] Google discontinued Google Desktop in September 2011.

Technologies

Most desktop search engines build and maintain an index database to improve performance when searching large amounts of data. Indexing usually takes place when the computer is idle, and most search applications can be set to suspend indexing while a portable computer is running on battery power. There are notable exceptions, however. Voidtools' Everything Search Engine, [9] which searches only file names and not contents, can build its index from scratch in just a few seconds. Vegnos Desktop Search Engine [10] searches file names and file contents without building any index. An index may also be out of date when a query is performed. In that case the results will be inaccurate: a hit may be shown for a file that no longer exists, and a file that should match may be missing. Some products have sought to remedy this by building a real-time indexing function into the software. Not indexing has its own disadvantages: a query can take a long time to complete and can be resource-intensive.
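The trade-off between indexed and unindexed search can be illustrated with a minimal sketch. It is not the implementation of any product named above; the function names (build_index, search_index, search_scan) and the sample documents are hypothetical, and real tools use far more elaborate on-disk indexes.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted index: each word maps to the documents containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search_index(index, query):
    """Answer a query from the index alone, without reading any document."""
    words = tokenize(query)
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


def search_scan(documents, query):
    """Answer the same query by reading every document at query time."""
    words = set(tokenize(query))
    if not words:
        return set()
    return {name for name, text in documents.items()
            if words <= set(tokenize(text))}


# Hypothetical sample data standing in for files on disk.
documents = {
    "notes.txt": "Quarterly budget meeting notes",
    "todo.txt": "Buy milk and review budget",
    "song.txt": "Lyrics for a new song",
}
index = build_index(documents)

print(sorted(search_index(index, "budget")))           # ['notes.txt', 'todo.txt']
print(sorted(search_scan(documents, "budget notes")))  # ['notes.txt']

# Deleting a file without re-indexing leaves a stale entry in the index.
del documents["todo.txt"]
print(sorted(search_index(index, "milk")))     # ['todo.txt'] (stale hit)
print(sorted(search_scan(documents, "milk")))  # []
```

The indexed lookup touches only the entries for the query words, while the scan reads every document on every query. The last two lines show the stale-index problem: the index still reports a file that is gone, whereas the scan reflects the current state.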

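Desktop search tools typically collect three types of information about files: file and folder names; metadata, such as titles, authors, and comments; and file content, for the document types the tool supports.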
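As a rough sketch, the information a tool gathers for one file might be collected as follows. This assumes the three kinds of information are the file's name and location, its metadata, and its content. The record layout, the describe_file name, and the TEXT_SUFFIXES set are hypothetical, and only file-system metadata is read here; embedded metadata such as document authors would need format-specific parsers.

```python
import tempfile
from pathlib import Path

# Assumption: only these suffixes are treated as readable text in this sketch.
TEXT_SUFFIXES = {".txt", ".md", ".csv"}


def describe_file(path):
    """Collect name, file-system metadata, and (for text files) content."""
    path = Path(path)
    info = path.stat()
    record = {
        "name": path.name,            # file name
        "folder": str(path.parent),   # containing folder
        "size_bytes": info.st_size,   # metadata from the file system
        "modified": info.st_mtime,    # last modification time (epoch seconds)
        "content": None,              # filled in only for supported types
    }
    if path.suffix.lower() in TEXT_SUFFIXES:
        record["content"] = path.read_text(encoding="utf-8", errors="replace")
    return record


# Demonstration on a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "report.txt"
    sample.write_text("Annual report draft", encoding="utf-8")
    record = describe_file(sample)
    print(record["name"], record["size_bytes"], record["content"])
```

Records like this would then be fed to whatever index the tool maintains.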

Long-term goals for desktop search include the ability to search the contents of image files, sound files and video by context. [11] [12]

Platforms & their histories

Windows

Lookeen desktop search on Windows

Indexing Service, a "base service that extracts content from files and constructs an indexed catalog to facilitate efficient and rapid searching", [13] was originally released in August 1996. It was built to speed up manual searches for files on personal desktops and corporate computer networks. Indexing Service used Microsoft web servers to index files on the desired hard drives. Indexing was done by file format, and a search matched the terms a user provided against the data within those file formats. The largest issue Indexing Service faced was that every file had to be indexed as soon as it was added. Combined with the fact that the entire index was cached in RAM, this made hardware a major limitation. [14] Indexing large numbers of files required extremely powerful hardware and very long wait times.

In 2003, Windows Desktop Search (WDS) replaced Microsoft Indexing Service. Instead of only matching terms to file names and details of the file format, WDS brought content indexing to all Microsoft files and to text-based formats such as e-mail and text files. WDS looked inside files and indexed their content, so a search matched not just file names and format types but also the terms and values stored within the files. WDS also brought "instant searching": a query started as soon as the user typed a character and was updated as more characters were typed. [15] Windows Desktop Search apparently used a lot of processing power, as it ran only when it was directly queried or while the PC was idle. Even so, indexing an entire hard drive still took hours. The index was around 10% of the size of the files it covered; for example, 100 GB of indexed files produced an index of about 10 GB.
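The "instant searching" behaviour can be sketched as a query that is re-run after every keystroke. This is only an illustration of the idea, not how WDS is implemented; the instant_search function and the sample file names are hypothetical, and the matching is a simple case-insensitive substring test on names.

```python
def instant_search(names, typed):
    """Return the match list after each keystroke of the typed query.

    Each step filters the previous step's matches, which is safe because a
    name containing the longer query text also contains the shorter one.
    """
    candidates = list(names)
    steps = []
    for end in range(1, len(typed) + 1):
        partial = typed[:end]
        needle = partial.lower()
        candidates = [name for name in candidates if needle in name.lower()]
        steps.append((partial, candidates))
    return steps


# Hypothetical file names standing in for an index of the user's files.
names = ["budget.xlsx", "bug-report.txt", "holiday.jpg"]
for partial, matches in instant_search(names, "bud"):
    print(partial, matches)
# b ['budget.xlsx', 'bug-report.txt']
# bu ['budget.xlsx', 'bug-report.txt']
# bud ['budget.xlsx']
```

Narrowing the previous result set instead of searching from scratch is one simple way to keep each keystroke cheap.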

With the release of Windows Vista came Windows Search 3.1. Unlike its predecessors, WDS and Windows Search 3.0, version 3.1 could search both indexed and non-indexed locations seamlessly. RAM and CPU requirements were also greatly reduced, which cut indexing times considerably. Windows Search 4.0 runs on all PCs with Windows 7 and later.

Mac OS

In 1994 the AppleSearch search engine was introduced, allowing users to search all documents on their Macintosh computer, including file format types, metadata on those files, and content within the files. AppleSearch was a client/server application and as such required a server separate from the main device in order to function. The biggest issue with AppleSearch was its large resource requirements: "AppleSearch requires at least a 68040 processor and 5MB of RAM." [16] At the time, a Macintosh computer with these specifications was priced at approximately $1400, equivalent to $2050 in 2015. [17] On top of this, the software itself cost an additional $1400 for a single license.

In 1998, Sherlock was released alongside Mac OS 8.5. Sherlock (named after the fictional detective Sherlock Holmes) was integrated into Finder, the Mac OS file browser. It extended desktop search to the World Wide Web, allowing users to search both locally and externally. Adding functions to Sherlock, such as internet access, was relatively simple because they were implemented as plugins written as plain text files. Sherlock was included in every release of Mac OS from Mac OS 8.5 until it was deprecated and replaced by Spotlight and Dashboard in Mac OS X 10.4 Tiger. It was officially removed in Mac OS X 10.5 Leopard.

Spotlight was released in 2005 as part of Mac OS X 10.4 Tiger. It is a selection-based search tool, which means the user can invoke a query using only the mouse. Spotlight allows the user to search the Internet for more information about any keyword or phrase contained within a document or webpage, and uses a built-in calculator and Oxford American Dictionary to offer quick access to small calculations and word definitions. [18] While Spotlight initially has a long startup time, this decreases as the hard disk is indexed. As the user adds files, the index is updated continuously in the background using minimal CPU and RAM resources.

Linux

There is a wide range of desktop search options for Linux users, depending on the user's skill level and on whether they prefer desktop tools that integrate tightly into their desktop environment, command-shell functionality (often with advanced scripting options), or browser-based user interfaces to locally running software. In addition, many users build their own indexing from a variety of packages (e.g. one that extracts and indexes PDF/DOC/DOCX/ODT documents well, and another search engine that works with vCard, LDAP, and other directory/contact databases), as well as the conventional find and locate commands.

Ubuntu

Unity Dash search tool in Ubuntu 16.04

Ubuntu Linux did not have desktop search until release 7.04, Feisty Fawn. Built on Tracker, [19] the desktop search feature was very similar to Mac OS's AppleSearch and Sherlock. Besides the basic features of file format sorting and metadata matching, it added support for searching through e-mails and instant messages. In 2014 Recoll [20] was added to Linux distributions, working with other search programs such as Tracker and Beagle to provide efficient full-text search. This greatly increased the range of queries and file types that Linux desktop search could handle. A major advantage of Recoll is that it allows greater customization of what is indexed: Recoll indexes the entire hard disk by default, but can be made to index only selected directories, omitting directories that will never need to be searched. [21]
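Restricting indexing to selected directories can be sketched as a directory walk that starts only from chosen folders and skips excluded ones. This is a generic illustration using Python's standard library, not Recoll's configuration format or code; the files_to_index function, the folder names, and the sample tree are hypothetical.

```python
import os
import tempfile


def files_to_index(top_dirs, skipped_names):
    """List files under the chosen top-level folders, skipping excluded ones."""
    selected = []
    for top in top_dirs:
        for folder, subfolders, files in os.walk(top):
            # Prune in place so os.walk never descends into skipped folders.
            subfolders[:] = [d for d in subfolders if d not in skipped_names]
            for name in files:
                selected.append(os.path.join(folder, name))
    return sorted(selected)


# Build a small throwaway tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    for rel in ("docs/a.txt", "docs/cache/tmp.bin", "music/b.mp3"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    # Index only "docs", and never look inside folders named "cache".
    chosen = files_to_index([os.path.join(root, "docs")], {"cache"})
    relative = [os.path.relpath(p, root).replace(os.sep, "/") for p in chosen]
    print(relative)  # ['docs/a.txt']
```

Skipping folders that will never be searched keeps both indexing time and index size down.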

openSUSE

NEPOMUK was introduced with KDE4. It provided the ability to index a wide range of desktop content and e-mail, and used semantic web technologies (e.g. RDF) to annotate the database. The introduction faced a few glitches, many of which seemed to stem from the triplestore. Query performance improved after the backend was switched to a stripped-down version of Virtuoso Open Source Edition, but indexing remained a common user complaint. Based on user feedback, NEPOMUK indexing and search have been replaced with the Baloo framework, [22] which is based on Xapian. [23]

References

  1. "What do you do for desktop search in VDI and RDSH?". Blog post by Brian Madden, brianmadden.com. Retrieved 25 March 2015.
  2. Anthony Ha (2 June 2008). "Lookeen offers a new way for Outlook users to search". VentureBeat. Retrieved 8 March 2016.
  3. Robert L. Mitchell (8 May 2013). "X1 rises again with Desktop Search 8, Virtual Edition". Computerworld. Retrieved 24 June 2015.
  4. "Security special report: Who sees your data?", Computer Weekly, 2006-04-25.
  5. "BBC NEWS - Technology - Search wars hit desktop computers". bbc.co.uk. 26 October 2004. Retrieved 24 June 2015.
  6. "KMWorld - The Evolution of Desktop Search". Retrieved 7 January 2019.
  7. "dtSearch UK Blog - Desktop Wars". Retrieved 8 January 2019.
  8. "SearchMax". goebelgroup.com. Archived from the original on 27 December 2013. Retrieved 24 June 2015.
  9. "Everything Search Engine". voidtools. Retrieved 27 December 2013.
  10. "Vegnos". Vegnos. Retrieved 27 December 2013.
  11. Niall Kennedy (17 October 2006). "The current state of video search". Niall Kennedy. Retrieved 24 June 2015.
  12. Niall Kennedy (15 October 2006). "The current state of audio search". Niall Kennedy. Retrieved 24 June 2015.
  13. "Indexing Service". microsoft.com. Microsoft. Retrieved 24 June 2015.
  14. "Indexing with Microsoft Index Server". microsoft.com. Microsoft. Retrieved 24 June 2015.
  15. "Windows Search: Technical FAQ". microsoft.com. Microsoft. Archived from the original on 24 September 2011. Retrieved 24 June 2015.
  16. "AppleSearch". infomotions.com. Retrieved 24 June 2015.
  17. Eduardo Casais. "Converter of current to real US dollars - using the GDP deflator". areppim.com. Retrieved 24 June 2015.
  18. "Apple - Press Info - Apple to Ship Mac OS X "Tiger" on April 29". apple.com. Retrieved 24 June 2015.
  19. "A first look at Tracker 0.6.0". Ars Technica. 26 July 2007. Retrieved 24 June 2015.
  20. "Recoll user manual". lesbonscomptes.com. Retrieved 24 June 2015.
  21. "Linux.com". Retrieved 24 June 2015.
  22. "Baloo - KDE Community Wiki".
  23. "Home". opensuse.org.