In information systems, a tag is a keyword or term assigned to a piece of information (such as an Internet bookmark, multimedia, database record, or computer file). This kind of metadata helps describe an item and allows it to be found again by browsing or searching. [1] Tags are generally chosen informally and personally by the item's creator or by its viewer, depending on the system, although they may also be chosen from a controlled vocabulary. [2] : 68
Tagging was popularized by websites associated with Web 2.0 and is an important feature of many Web 2.0 services. [2] [3] It is now also part of other database systems, desktop applications, and operating systems. [4]
People use tags to aid classification, mark ownership, note boundaries, and indicate online identity. Tags may take the form of words, images, or other identifying marks. An analogous example of tags in the physical world is museum object tagging. People were using textual keywords to classify information and objects long before computers. Computer-based search algorithms made the use of such keywords a rapid way of exploring records.
Tagging gained popularity due to the growth of social bookmarking, image sharing, and social networking websites. [2] These sites allow users to create and manage labels (or "tags") that categorize content using simple keywords. Websites that include tags often display collections of tags as tag clouds, [lower-alpha 1] as do some desktop applications. [lower-alpha 2] On websites that aggregate the tags of all users, an individual user's tags can be useful both to them and to the larger community of the website's users.
Tagging systems have sometimes been classified into two kinds: top-down and bottom-up. [3] : 142 [4] : 24 Top-down taxonomies are created by an authorized group of designers (sometimes in the form of a controlled vocabulary), whereas bottom-up taxonomies (called folksonomies) are created by all users. [3] : 142 This definition of "top down" and "bottom up" should not be confused with the distinction between a single hierarchical tree structure (in which there is one correct way to classify each item) versus multiple non-hierarchical sets (in which there are multiple ways to classify an item); the structure of both top-down and bottom-up taxonomies may be either hierarchical, non-hierarchical, or a combination of both. [3] : 142–143 Some researchers and applications have experimented with combining hierarchical and non-hierarchical tagging to aid in information retrieval. [7] [8] [9] Others are combining top-down and bottom-up tagging, [10] including in some large library catalogs (OPACs) such as WorldCat. [11] [12] : 74 [13] [14]
When tags or other taxonomies have further properties (or semantics) such as relationships and attributes, they constitute an ontology. [3] : 56–62
Metadata tags as described in this article should not be confused with the use of the word "tag" in some software to refer to an automatically generated cross-reference; examples of the latter are tags tables in Emacs [15] and smart tags in Microsoft Office. [16]
The use of keywords as part of an identification and classification system long predates computers. Paper data storage devices, notably edge-notched cards, that permitted classification and sorting by multiple criteria were already in use prior to the twentieth century, and faceted classification has been used by libraries since the 1930s.
In the late 1970s and early 1980s, Emacs, the text editor for Unix systems, offered a companion software program called Tags that could automatically build a table of cross-references called a tags table that Emacs could use to jump between a function call and that function's definition. [17] This use of the word "tag" did not refer to metadata tags, but was an early use of the word "tag" in software to refer to a word index.
Online databases and early websites deployed keyword tags as a way for publishers to help users find content. In the early days of the World Wide Web, the keywords meta element was used by web designers to tell web search engines what the web page was about, but these keywords were only visible in a web page's source code and were not modifiable by users.
In 1997, the collaborative portal "A Description of the Equator and Some ØtherLands" produced by documenta X, Germany, used the folksonomic term Tag for its co-authors and guest authors on its Upload page. [18] In "The Equator" the term Tag for user-input was described as an abstract literal or keyword to aid the user. However, users defined singular Tags, and did not share Tags at that point.
In 2003, the social bookmarking website Delicious provided a way for its users to add "tags" to their bookmarks (as a way to help find them later); [2] : 162 Delicious also provided browseable aggregated views of the bookmarks of all users featuring a particular tag. [19] Within a couple of years, the photo sharing website Flickr allowed its users to add their own text tags to each of their pictures, constructing flexible and easy metadata that made the pictures highly searchable. [20] The success of Flickr and the influence of Delicious popularized the concept, [21] and other social software websites—such as YouTube, Technorati, and Last.fm—also implemented tagging. [22] In 2005, the Atom web syndication standard provided a "category" element for inserting subject categories into web feeds, and in 2007 Tim Bray proposed a "tag" URN. [23]
Many blog systems (and other web content management systems) allow authors to add free-form tags to a post, along with (or instead of) placing the post into a predetermined category. [lower-alpha 1] For example, a post may display that it has been tagged with baseball and tickets. Each of those tags is usually a web link leading to an index page listing all of the posts associated with that tag. The blog may have a sidebar listing all the tags in use on that blog, with each tag leading to an index page. To reclassify a post, an author edits its list of tags. All connections between posts are automatically tracked and updated by the blog software; there is no need to relocate the page within a complex hierarchy of categories.
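The per-tag index pages described above can be illustrated with a minimal Python sketch. It assumes posts are held in a plain dictionary mapping a title to its list of tags; the function name build_tag_index and the sample posts are hypothetical and do not reflect how any particular blog software stores its data.

```python
from collections import defaultdict


def build_tag_index(posts):
    """Map each tag to the sorted list of post titles that carry it.

    `posts` is assumed to be a dict of {title: [tag, ...]}.
    """
    index = defaultdict(set)
    for title, tags in posts.items():
        for tag in tags:
            index[tag].add(title)
    return {tag: sorted(titles) for tag, titles in index.items()}


# Hypothetical sample data for illustration only.
posts = {
    "Opening day recap": ["baseball"],
    "Where to buy seats": ["baseball", "tickets"],
    "Concert presale": ["tickets"],
}
index = build_tag_index(posts)
```

Under this sketch, reclassifying a post amounts to editing its tag list and rebuilding the index; no post has to be moved within a category hierarchy.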
Some desktop applications and web applications feature their own tagging systems, such as email tagging in Gmail and Mozilla Thunderbird, [12] : 73 bookmark tagging in Firefox, [24] audio tagging in iTunes or Winamp, and photo tagging in various applications. [25] Some of these applications display collections of tags as tag clouds. [lower-alpha 2]
There are various systems for applying tags to the files in a computer's file system.
In Apple's Mac System 7, released in 1991, users could assign one of seven editable colored labels (with editable names such as "Essential", "Hot", and "In Progress") to each file and folder. [26] Since OS X 10.9, released in 2013, users have been able to assign multiple arbitrary tags as extended file attributes to any file or folder, [27] and before that time the open-source OpenMeta standard provided similar tagging functionality for Mac OS X. [28]
Several semantic file systems that implement tags are available for the Linux kernel, including Tagsistant. [29]
Microsoft Windows allows users to set tags only on Microsoft Office documents and some kinds of picture files. [30]
Cross-platform file tagging standards include Extensible Metadata Platform (XMP), an ISO standard for embedding metadata into popular image, video and document file formats, such as JPEG and PDF, without breaking their readability by applications that do not support XMP. [31] XMP largely supersedes the earlier IPTC Information Interchange Model. Exif is a standard that specifies the image and audio file formats used by digital cameras, including some metadata tags. [32] TagSpaces is an open-source cross-platform application for tagging files; it inserts tags into the filename. [33]
An official tag is a keyword adopted by events and conferences for participants to use in their web publications, such as blog entries, photos of the event, and presentation slides. [34] Search engines can then index them to make relevant materials related to the event searchable in a uniform way. In this case, the tag is part of a controlled vocabulary.
A researcher may work with a large collection of items (e.g. press quotes, a bibliography, images) in digital form. If the researcher wishes to associate each item with a small number of themes (e.g. chapters of a book, or sub-themes of the overall subject), then a group of tags for these themes can be attached to each of the items in the larger collection. [35] In this way, freeform classification allows the author to manage what would otherwise be unwieldy amounts of information. [36]
A triple tag or machine tag uses a special syntax to define extra semantic information about the tag, making it easier or more meaningful for interpretation by a computer program. [37] Triple tags comprise three parts: a namespace, a predicate, and a value. For example, geo:long=50.123456 is a tag for the geographical longitude coordinate whose value is 50.123456. This triple structure is similar to the Resource Description Framework model for information.
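The namespace:predicate=value shape of a triple tag can be illustrated with a short Python sketch. The names TripleTag and parse_triple_tag are hypothetical, and the splitting rule (first colon, then first equals sign) is a simplifying assumption rather than the exact grammar used by any particular service.

```python
from typing import NamedTuple, Optional


class TripleTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


def parse_triple_tag(tag: str) -> Optional[TripleTag]:
    """Split 'namespace:predicate=value'; return None for ordinary tags."""
    namespace, colon, rest = tag.partition(":")
    predicate, equals, value = rest.partition("=")
    # All three parts and both separators must be present.
    if not (colon and equals and namespace and predicate and value):
        return None
    return TripleTag(namespace, predicate, value)


parsed = parse_triple_tag("geo:long=50.123456")
plain = parse_triple_tag("baseball")  # not a triple tag, so None
```

The value is kept as a string here; a consuming program would decide how to interpret it (for example, converting a geo:long value to a floating-point number).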
The triple tag format was first devised for geolicious in November 2004, [38] to map Delicious bookmarks, and gained wider acceptance after its adoption by Mappr and GeoBloggers to map Flickr photos. [39] In January 2007, Aaron Straup Cope at Flickr introduced the term machine tag as an alternative name for the triple tag, adding some questions and answers on purpose, syntax, and use. [40]
Specialized metadata for geographical identification is known as geotagging ; machine tags are also used for other purposes, such as identifying photos taken at a specific event or naming species using binomial nomenclature. [41]
A hashtag is a kind of metadata tag marked by the prefix #, sometimes known as a "hash" symbol. This form of tagging is used on microblogging and social networking services such as Twitter, Facebook, Google+, VK and Instagram. The hash distinguishes tag text from other text in the post.
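How a # prefix lets software separate tag text from the rest of a post can be sketched with a regular expression in Python. The pattern below is an illustrative assumption: it treats a hashtag as # followed by word characters and ignores a # that directly follows a word character (as in a URL fragment). Real services apply their own, more detailed rules.

```python
import re

# '#' not preceded by a word character, followed by one or more word characters.
HASHTAG = re.compile(r"(?<!\w)#(\w+)")


def extract_hashtags(text):
    """Return the hashtags found in `text`, lowercased, without the '#'."""
    return [match.lower() for match in HASHTAG.findall(text)]


post = "Great game tonight #Baseball #tickets, see example.com/page#top"
found = extract_hashtags(post)
```

Lowercasing is a design choice made here so that "#Baseball" and "#baseball" are treated as the same tag.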
A knowledge tag is a type of meta-information that describes or defines some aspect of a piece of information (such as a document, digital image, database table, or web page). [42] Knowledge tags are more than traditional non-hierarchical keywords or terms; they are a type of metadata that captures knowledge in the form of descriptions, categorizations, classifications, semantics, comments, notes, annotations, hyperdata, hyperlinks, or references that are collected in tag profiles (a kind of ontology). [42] These tag profiles reference an information resource that resides in a distributed, and often heterogeneous, storage repository. [42]
Knowledge tags are part of a knowledge management discipline that leverages Enterprise 2.0 methodologies for users to capture insights, expertise, attributes, dependencies, or relationships associated with a data resource. [3] : 251 [43] Different kinds of knowledge can be captured in knowledge tags, including factual knowledge (that found in books and data), conceptual knowledge (found in perspectives and concepts), expectational knowledge (needed to make judgments and hypotheses), and methodological knowledge (derived from reasoning and strategies). [43] These forms of knowledge often exist outside the data itself and are derived from personal experience, insight, or expertise. Knowledge tags are considered an expansion of the information itself that adds additional value, context, and meaning to the information. Knowledge tags are valuable for preserving organizational intelligence that is often lost due to turnover, for sharing knowledge stored in the minds of individuals that is typically isolated and unharnessed by the organization, and for connecting knowledge that is often lost or disconnected from an information resource. [44]
In a typical tagging system, there is no explicit information about the meaning or semantics of each tag, and a user can apply new tags to an item as easily as applying older tags. [2] Hierarchical classification systems can be slow to change, and are rooted in the culture and era that created them; in contrast, the flexibility of tagging allows users to classify their collections of items in the ways that they find useful, but the personalized variety of terms can present challenges when searching and browsing.
When users can freely choose tags (creating a folksonomy, as opposed to selecting terms from a controlled vocabulary), the resulting metadata can include homonyms (the same tags used with different meanings) and synonyms (multiple tags for the same concept), which may lead to inappropriate connections between items and inefficient searches for information about a subject. [45] For example, the tag "orange" may refer to the fruit or the color, and items related to a version of the Linux kernel may be tagged "Linux", "kernel", "Penguin", "software", or a variety of other terms. Users can also choose tags that are different inflections of words (such as singular and plural), [46] which can contribute to navigation difficulties if the system does not include stemming of tags when searching or browsing. Larger-scale folksonomies address some of the problems of tagging, in that users of tagging systems tend to notice the current use of "tag terms" within these systems, and thus use existing tags in order to easily form connections to related items. In this way, folksonomies may collectively develop a partial set of tagging conventions.
Despite the apparent lack of control, research has shown that a simple form of shared vocabulary emerges in social bookmarking systems. Collaborative tagging exhibits a form of complex systems dynamics (or self-organizing dynamics). [47] Thus, even if no central controlled vocabulary constrains the actions of individual users, the distribution of tags converges over time to stable power law distributions. [47] Once such stable distributions form, simple folksonomic vocabularies can be extracted by examining the correlations that form between different tags. In addition, research has suggested that it is easier for machine learning algorithms to learn tag semantics when users tag "verbosely"—when they annotate resources with a wealth of freely associated, descriptive keywords. [48]
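The idea of examining how often tags are used and which tags appear together can be illustrated with a simple counting sketch in Python. This is raw frequency and co-occurrence counting on made-up data, offered only as an illustration; it is not the method of the cited research, and the function name tag_statistics is hypothetical.

```python
from collections import Counter
from itertools import combinations


def tag_statistics(tagged_items):
    """Count how often each tag is used and how often each pair co-occurs."""
    frequency = Counter()
    cooccurrence = Counter()
    for tags in tagged_items:
        unique = sorted(set(tags))  # sort so each pair has one canonical order
        frequency.update(unique)
        cooccurrence.update(combinations(unique, 2))
    return frequency, cooccurrence


# Hypothetical bookmarks, each represented only by its tag list.
bookmarks = [
    ["linux", "kernel", "software"],
    ["linux", "software"],
    ["linux", "kernel"],
    ["orange", "fruit"],
]
frequency, cooccurrence = tag_statistics(bookmarks)
```

On a large collection, sorting the frequency counts would expose the distribution of tag use, and frequently co-occurring pairs would suggest related terms.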
Tagging systems open to the public are also open to tag spam, in which people apply an excessive number of tags or unrelated tags to an item (such as a YouTube video) in order to attract viewers. This abuse can be mitigated using human or statistical identification of spam items. [49] The number of tags allowed may also be limited to reduce spam.
Some tagging systems provide a single text box to enter tags, so a separator is needed to tokenize the string. Two popular separators are the space character and the comma. To enable the use of separators in the tags, a system may allow for higher-level separators (such as quotation marks) or escape characters. Systems can avoid the use of separators by allowing only one tag to be added to each input widget at a time, although this makes adding multiple tags more time-consuming.
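Tokenizing a single text box with either separator, while letting quotation marks or escape characters protect a separator inside a tag, can be sketched with Python's standard csv and shlex modules. The function names are hypothetical, and the quoting rules are those of the two modules rather than of any particular tagging system.

```python
import csv
import shlex


def split_comma_tags(text):
    """Split comma-separated tags; double quotes protect embedded commas."""
    row = next(csv.reader([text], skipinitialspace=True))
    return [field.strip() for field in row if field.strip()]


def split_space_tags(text):
    """Split space-separated tags; quotes or a backslash protect spaces."""
    return shlex.split(text)


comma_tags = split_comma_tags('baseball, "tickets, cheap", new york')
space_tags = split_space_tags('baseball "new york" tickets')
```

With the comma as separator, a multi-word tag such as "new york" needs no quoting; with the space as separator, it must be quoted or escaped, which is one trade-off between the two conventions.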
Within HTML, the rel-tag microformat uses the rel attribute with the value "tag" (i.e., rel="tag") to indicate that the linked-to page acts as a tag for the current context. [50]
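A program can collect rel="tag" links from a page with Python's standard html.parser module, as in the sketch below. It assumes the tag name is the last path segment of the link's URL; the class name RelTagParser, the helper extract_rel_tags, and the example.com URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class RelTagParser(HTMLParser):
    """Collect tag names from <a> elements whose rel attribute includes 'tag'."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, e.g. "nofollow tag".
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "tag" in rel_values and href:
            # Assumption: the tag is the last segment of the URL path.
            path = urlparse(href).path.rstrip("/")
            self.tags.append(unquote(path.rsplit("/", 1)[-1]))


def extract_rel_tags(html):
    parser = RelTagParser()
    parser.feed(html)
    return parser.tags


page = (
    '<p><a href="https://example.com/tags/baseball" rel="tag">Baseball</a> '
    '<a href="https://example.com/tags/tickets/" rel="nofollow tag">Tickets</a> '
    '<a href="https://example.com/about">About</a></p>'
)
rel_tags = extract_rel_tags(page)
```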
As with all the other options here, meta data can be added to individual files to help improve their find-ability, and uniquely the tag cloud field within Leap's interface allows you to quickly drill down to individually labelled files without fuss.
Calling a function defined in one compilation unit from within another is analogous to cross references in large hypertext documents. By using tags tables, the Emacs environment enables the user to turn program source code into powerful hypertext documents.
You can turn on smart tags for a field to make it easier to cross-reference data between the Access database and Microsoft Outlook (or another personal information and e-mail program) and the Web.
EMACS is an M.I.T. display editor designed to be 'extensible, customizable, and self-documenting' [...] Another interesting facility for program editing is the TAGS package. The separate program TAGS builds a TAGS table containing the file name and position in that file in which each application program function is defined. This table is loaded into EMACS; specifying the command Meta, function name causes EMACS to select the appropriate file and go to the proper function definition within that file.
Tags were not in the initial version of Flickr. Stewart Butterfield wanted to add them. He liked the way they worked on del.icio.us, the social bookmarking application. We added very simple tagging functionality, so you could tag your photos, and then look at all your photos with a particular tag, or any one person's photos with a particular tag. Soon thereafter, users started telling us that what was really interesting about tagging was not just how you've tagged your photos, but how the whole Flickr community has been tagging photos. So we started seeing a lot of requests from users to be able to see a global view of the tagscape.