Semantic desktop

In computer science, the semantic desktop is a collective term for ideas about changing a computer's user interface and data-handling capabilities so that data can be shared more easily between different applications or tasks, and so that data that previously could not be processed automatically by a computer can be. It also encompasses ideas about sharing information automatically between different people. The concept is closely related to the Semantic Web, but is distinct in that its main concern is the personal use of information.

Problems to solve

The vision of the semantic desktop can be considered as a response to the perceived problems of existing user interfaces.

Metadata

Without good metadata, computers cannot easily learn many commonly needed attributes of files. For example, suppose one downloads a document by a particular author on a particular subject. Although the document will likely state its subject, author, source and possibly copyright information clearly, there may be no easy way for the computer to obtain this information and process it across applications such as file managers, desktop search engines, and other services. The computer therefore cannot search, filter or otherwise act upon the information as effectively as it otherwise could. This is essentially the problem that the Semantic Web is concerned with.

File structure

Researchers in the iMemex project provide the following query examples: [1]

  1. "Show me all LaTeX 'Introduction' sections pertaining to project PIM that contain the phrase 'Mike Franklin'."
  2. "Show me all documents pertaining to project 'OLAP' that have a figure containing the phrase 'Indexing Time' in its label."

Both of these queries need to parse the file structure: the first to find a section in a LaTeX document, the second to find figures and their labels in documents of any format. Current operating systems provide neither capability.
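
As a rough illustration of what "parsing the file structure" involves for query example 1, the following Python sketch splits LaTeX source into sections and filters them by title and phrase. The helper names (`latex_sections`, `sections_matching`) and the sample text are invented for this example, and the regular expression is deliberately simplified: it ignores nested braces, comments and included files, so it is not how iMemex or any real indexer works.

```python
import re

# Simplified pattern for \section{...} and \section*{...} headings.
SECTION_RE = re.compile(r"\\section\*?\{([^}]*)\}")

def latex_sections(source):
    """Return a list of (title, body) tuples, one per section heading found."""
    matches = list(SECTION_RE.finditer(source))
    sections = []
    for i, m in enumerate(matches):
        # A section body runs until the next heading or the end of the source.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(source)
        sections.append((m.group(1).strip(), source[m.end():end]))
    return sections

def sections_matching(source, title, phrase):
    """Bodies of sections with the given title that contain the phrase."""
    return [body for t, body in latex_sections(source)
            if t == title and phrase in body]

# Hypothetical sample document.
sample = r"""
\section{Introduction}
This work builds on ideas by Mike Franklin.
\section{Evaluation}
Indexing time is reported below.
"""
hits = sections_matching(sample, "Introduction", "Mike Franklin")
```

Even this small case needs format-specific knowledge, which is why a generic file system that treats files as opaque byte sequences cannot answer such a query on its own.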

Inside-outside file boundary

A user might want to relate, in a single query, information that is maintained by the file system, such as placement in a folder, and information that is inside a file. With current technology, such a query cannot be issued as a single request.

In query example 1 above, the project information is only materialized in the folder hierarchy; the remaining filters relate to the inside of the file, and some of them require parsing the file structure (see above). The user must therefore first query the file system and then search further inside the files.
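
The two-step search described here can be sketched as a single function that applies an outside-the-file filter (the folder path) and an inside-the-file filter (the content) in one pass. This is only a minimal Python sketch under stated assumptions: the function name `find_in_project`, the folder layout and the file contents are hypothetical, and the content check is a plain substring search rather than a structural parse.

```python
import os
import tempfile

def find_in_project(root, project, phrase):
    """Yield paths under root whose folder path names the project
    and whose text content contains the phrase."""
    for dirpath, _dirs, files in os.walk(root):
        # Outside-the-file filter: the project exists only as a folder name.
        if project not in os.path.relpath(dirpath, root).split(os.sep):
            continue
        for name in files:
            path = os.path.join(dirpath, name)
            # Inside-the-file filter: substring search over the content.
            with open(path, encoding="utf-8", errors="ignore") as fh:
                if phrase in fh.read():
                    yield path

# Build a small throwaway hierarchy (hypothetical layout) to run the query on.
with tempfile.TemporaryDirectory() as root:
    for project, text in [("PIM", "Thanks to Mike Franklin."),
                          ("OLAP", "Indexing Time")]:
        folder = os.path.join(root, "projects", project)
        os.makedirs(folder)
        with open(os.path.join(folder, "intro.tex"), "w", encoding="utf-8") as fh:
            fh.write(text)
    results = [os.path.relpath(p, root)
               for p in find_in_project(root, "PIM", "Mike Franklin")]
```

The point of the sketch is that the caller issues one request; the split between file-system metadata and file content is hidden inside the function.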

Data-application coupling

There is also the problem of relating different files with each other. For example, on operating systems such as Unix, e-mails are stored separately from files. Neither has anything to do with tasks, notes or planned activities that may be stored in a calendar program. Contacts might be stored in another program. However, all these forms of information might simultaneously be relevant and necessary for a particular task.

Data locality and sharing

Relatedly, a user often accesses a large amount of data from the Internet through a browser or other program, and these data are segregated from the data stored locally on the computer. Researchers in the iMemex project provide the example of searching both in the local folder hierarchy and in email attachments, which are located on an IMAP server [1] (see above, query example 2). In addition, the folder hierarchies on the two systems often differ.

As well as accessing data, a user has to share data, often through e-mail or separate file transfer programs.

Definition

The semantic desktop is an attempt to solve some or all of these problems by extending the operating system's capabilities to handle all data using Semantic Web technologies. Based on this data integration, improved user interfaces (or plugins to existing applications) can give the user an integrated view of stored knowledge.

Sauermann et al. proposed a definition of Semantic Desktop in 2005:

A Semantic Desktop is a device in which an individual stores all her digital information like documents, multimedia and messages. These are interpreted as Semantic Web resources, each is identified by a Uniform Resource Identifier (URI) and all data is accessible and queryable as Resource Description Framework (RDF) graph. Resources from the web can be stored and authored content can be shared with others. Ontologies allow the user to express personal mental models and form the semantic glue interconnecting information and systems. Applications respect this and store, read and communicate via ontologies and Semantic Web protocols. The Semantic Desktop is an enlarged supplement to the user's memory. [2]
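
The core of this definition, namely that every item is a URI-identified resource and that all data can be queried as one graph, can be illustrated with a toy in-memory set of subject-predicate-object triples. The Python sketch below is not a real RDF store and follows no particular ontology: every URI, the `EX` namespace and the predicate names are made up for illustration.

```python
# All URIs and predicate names below are invented for this example.
EX = "http://example.org/desktop#"

triples = {
    ("file:///home/alice/report.pdf", EX + "author", "mailto:bob@example.org"),
    ("urn:example:email/17", EX + "sender", "mailto:bob@example.org"),
    ("urn:example:email/17", EX + "attachment", "file:///home/alice/report.pdf"),
    ("urn:example:task/42", EX + "relatesTo", "file:///home/alice/report.pdf"),
}

def match(graph, s=None, p=None, o=None):
    """Return the triples matching the pattern, sorted; None is a wildcard."""
    return sorted(t for t in graph
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))

# Everything that points at Bob, regardless of which application stored it.
about_bob = match(triples, o="mailto:bob@example.org")
```

Because the file, the e-mail and the task live in one graph, a single pattern query spans what would otherwise be three separate applications.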

Different interpretations of the semantic desktop

There are various interpretations of the semantic desktop. In its most limited form, it might be interpreted as adding mechanisms for attaching machine-readable metadata to files. At the other extreme, it could be viewed as a complete replacement for existing user interfaces, one that unifies all forms of data and provides a single consistent interface. Many intermediate positions exist, depending on which of the above problems are being addressed.

Standardization effort

To foster interoperability between different implementations and to publish standards, the community around the Nepomuk project founded the OSCA Foundation (OSCAF) [3] in 2008. Since June 2009, developers from the Nepomuk-KDE communities and Xesam collaborate with OSCAF to help standardize the data formats for KDE, GNOME and freedesktop.org. The Nepomuk/OSCAF standards are taken up by these projects and by Nokia's Maemo platform. [4]

Relationship with other concepts

Semantic Web

The Semantic Web is mainly concerned with creating machine-readable metadata so that computers can process shared information, and with creating the formats and standards this requires. The semantic desktop's aims of allowing more of a user's data to be processed by a computer, and of allowing data to be shared more easily, can therefore be considered a subset of those of the Semantic Web, extended to a user's local computer rather than only to files stored on the Internet.

However, the aims of creating a unified interface and of allowing data to be accessed in a format-independent way are not really concerns of the Semantic Web.

In practice, most projects related to the semantic desktop use Semantic Web protocols for storing their data. In particular, both the concepts of RDF and the format itself are used.

Semantic file systems

Semantic file systems allow the user to query files by semantic metadata. As such, they can be considered a part of the semantic desktop.

Some operating systems such as BeOS include a semantic file system, which is a move towards a more semantic desktop.
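
Querying files by semantic metadata rather than by location can be sketched with a small attribute index. The Python example below is a toy model only: the `TagIndex` class, the attribute names and the paths are hypothetical, and it does not reproduce the interface of BeOS or of any actual semantic file system.

```python
from collections import defaultdict

class TagIndex:
    """Toy in-memory index mapping attribute=value pairs to file paths."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, path, **attrs):
        """Register a file under each of its attribute values."""
        for key, value in attrs.items():
            self._index[(key, value)].add(path)

    def query(self, **attrs):
        """Sorted paths that carry every requested attribute value."""
        sets = [self._index.get((k, v), set()) for k, v in attrs.items()]
        return sorted(set.intersection(*sets)) if sets else []

# Hypothetical files and attributes.
index = TagIndex()
index.add("/mail/17.eml", type="email", project="PIM")
index.add("/docs/intro.tex", type="document", project="PIM")
index.add("/docs/olap.tex", type="document", project="OLAP")
pim_docs = index.query(type="document", project="PIM")
```

The result of such a query behaves like a folder whose contents are computed on demand, so the same file can appear under several attribute combinations without being copied.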

References

  1. Dittrich, Jens-Peter; Vaz Salles, Marcos Antonio (2006). "iDM: A Unified and Versatile Data Model for Personal Dataspace Management" (PDF). International Conference on Very Large Databases: 367–378.
  2. Sauermann, Leo; Bernardi, Ansgar; Dengel, Andreas (2005). "Overview and Outlook on the Semantic Desktop" (PDF). ISWC. 175: 1–19.
  3. "OSCA Foundation". OSCA Foundation. Archived from the original on 2014-01-02.
  4. "OSCAF ontologies suited for Nokia's Maemo platform". OSCA Foundation. Archived from the original on 2013-11-27.

Open Source Implementations