Hypervideo

Hypervideo, or hyperlinked video, is a displayed video stream that contains embedded, interactive anchors, [1] allowing navigation between video and other hypermedia elements. Hypervideo is similar to hypertext, which allows a reader to click on a word in one document and retrieve information from another document, or another place in the same document. Hypervideo combines video with a non-linear information structure, allowing a user to make choices based on the content of the video and the user's interests.

A crucial difference between hypervideo and hypertext is the element of time. Text is normally static, while a video is dynamic; the content of the video changes with time. Consequently, hypervideo has different technical, aesthetic, and rhetorical requirements than a static hypertext page. For example, hypervideo might involve the creation of a link from an object in a video that is visible for only a certain duration. It is therefore necessary to segment the video appropriately and add the metadata required to link from frames—or even objects—in a video to the pertinent information in other media forms.
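
Such a link must therefore record not only its target but also when and where its anchor is active on screen. The following is a minimal sketch of what the metadata for one duration-limited, object-anchored link might look like; the field names and structure are illustrative assumptions rather than any standard hypervideo format.

```python
from dataclasses import dataclass

@dataclass
class HypervideoLink:
    """One interactive anchor inside a video (illustrative structure)."""
    target_url: str      # hypermedia element the link points to
    start_time: float    # seconds into the video at which the anchor appears
    end_time: float      # seconds at which the anchor disappears
    region: tuple        # (x, y, width, height) of the clickable area in the frame

    def is_active(self, current_time: float) -> bool:
        """The anchor can only be followed while its object is on screen."""
        return self.start_time <= current_time <= self.end_time

# Example: a link attached to an object visible from 12.0 s to 15.5 s.
jacket_link = HypervideoLink("https://example.com/jacket", 12.0, 15.5, (220, 140, 80, 120))
print(jacket_link.is_active(13.2))   # True: the object is on screen
print(jacket_link.is_active(20.0))   # False: the anchor has expired
```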

History

Kinoautomat (1967) was advertised as the world's first interactive movie. [2] Modern hypervideo systems implement some of the core concepts of this film, such as nonlinear narrative and interactivity.

Video-to-video linking was demonstrated by the Interactive Cinema Group at the MIT Media Lab. Elastic Charles [3] was a hypermedia journal developed between 1988 and 1989, in which annotations, called "micons", were placed inside a video, indicating links to other content. When implementing the Interactive Kon-Tiki Museum, [4] Liestøl used micons to represent video footnotes. Video footnotes were a deliberate extension of the literary footnote applied to annotating video, thereby providing continuity between traditional text and early hypervideo. [5] In 1993, Hirata et al. [6] considered media-based navigation for hypermedia systems, in which the same type of media is used for the query as for the media to be retrieved. For example, a part of an image (defined by shape or color, for instance) could link to a related image. In this approach, the content of the video becomes the basis for forming links to other related content.

HotVideo was an implementation of this kind of hypervideo, developed at IBM's China Research Laboratory in 1996. [7] Navigation to associated resources was accomplished by clicking on a dynamic object in a video. In 1997, a project of the MIT Media Lab's Object-Based Media Group called HyperSoap further developed this concept. It was a short soap opera program in which a viewer could click with an enhanced remote control on objects in the video to find information on how they could be purchased. The company Watchpoint Media was formed to commercialize the technology involved, resulting in a product called Storyteller oriented towards interactive television.

Illustrating the progression to hypervideo from hypertext, Storyspace, [8] a hypertext writing environment, employs a spatial metaphor for displaying links. It utilizes 'writing spaces', generic containers for content, which link to other writing spaces. In 1996, HyperCafe, [9] a popular experimental prototype of hypervideo, made use of this tool to create "narrative video spaces". It was developed as an early model of a hypervideo system, placing users in a virtual cafe where they dynamically interact with the video to follow different conversations.

In 1997, the Israeli software firm Ephyx Technologies released a product called v-active, [10] one of the first commercial object-based authoring systems for hypervideo. The technology was not a success, however: Ephyx changed its name to Veon in 1999, at which point it shifted its focus away from hypervideo to the provision of development tools for web and broadband content. [11]

Eline Technologies, founded in 1999, developed a hypervideo solution called VideoClix [12] that supports QuickTime, Flash, MPEG-4 and HTML5 formats and has been used as a Software as a Service solution to distribute and monetize clickable video on the web and on mobile devices through online video platforms such as Brightcove, ThePlatform, and Ooyala.[ citation needed ]

Mainstream use

The first steps in hypervideo were taken in the late 1980s. Many experiments (HyperCafe, HyperSoap) have not been pursued much further, and authoring tools are currently available from only a small number of providers.[ citation needed ]

Smith et al. wrote in 2002: "Digital libraries are growing in popularity and scope, and video is an important component of such archives. All major news services have vast video archives, valuable footage that would be of use in education, historical research, even entertainment." [1] Direct searching of pictures or videos, a much harder task than indexing and searching text, could be greatly improved by hypervideo methods.[ citation needed ]

Concepts and technical challenges

Hypervideo is challenging, compared to hyperlinked text, due to the unique difficulty video presents in node segmentation; that is, separating a video into algorithmically identifiable, linkable content.

Videos, fundamentally, are a sequence of images displaying information. In order to segment a video into meaningful pieces (objects in images, or scenes within videos), it is necessary to provide a context, both in space and time, from which to extract meaningful elements from this image sequence. Humans are naturally able to perform this task, but it is desirable to do so algorithmically. Developing a method to achieve this, however, is a complex problem. At an NTSC frame rate of 30 frames per second, [13] even a short video of 30 seconds comprises 900 frames. The identification of distinct video elements would be tedious if human intervention were required for every frame. For moderate amounts of video material, manual segmentation is clearly unrealistic.

From the standpoint of time, the smallest unit of a video is a single frame. [5] Node segmentation could be performed at the frame level—a straightforward task as a frame is easily identifiable. However, a single frame cannot contain video information, since videos are necessarily dynamic. Analogously, a single word separated from a text does not convey meaning. Thus, it is necessary to consider the scene, which is the next level of temporal organization. A scene can be defined as the minimum sequential set of frames that conveys meaning. This is an important concept for hypervideo, as one might wish a hypervideo link to be active throughout one scene, though not in the next. Scene granularity is therefore natural in the creation of hypervideo. Consequently, hypervideo requires algorithms capable of detecting scene transitions. One can imagine coarser levels of temporal organization: scenes can be grouped together to form narrative sequences, which in turn are grouped to form a video. From the point of view of node segmentation, however, these concepts are not as critical.
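
One common way to detect scene transitions is to compare consecutive frames, for example by colour-histogram similarity, and to report a cut wherever the similarity drops sharply. The sketch below illustrates the idea with OpenCV; the histogram parameters and threshold are illustrative assumptions, not the method of any particular system described above.

```python
import cv2

def detect_scene_cuts(video_path: str, threshold: float = 0.5) -> list[float]:
    """Return timestamps (seconds) where the colour histogram changes sharply,
    a rough proxy for scene transitions."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    cuts, prev_hist, frame_idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Hue-saturation histogram of the current frame.
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [50, 60], [0, 180, 0, 256])
        cv2.normalize(hist, hist)
        if prev_hist is not None:
            # Low correlation between successive histograms suggests a cut.
            similarity = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if similarity < threshold:
                cuts.append(frame_idx / fps)
        prev_hist = hist
        frame_idx += 1
    cap.release()
    return cuts
```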

Even if the frame is the smallest time unit, one can still spatially segment a video at a sub-frame level, separating the frame image into its constituent objects. This is necessary when performing node segmentation at the object level. Time introduces complexity in this case also, for even after an object is differentiated in one frame, it is usually necessary to follow the same object through a sequence of frames. This process, known as object tracking, is essential to the creation of links from objects in videos. Spatial segmentation of objects can be achieved, for example, through the use of intensity gradients to detect edges, color histograms to match regions, [1] motion detection, [14] or a combination of these and other methods.
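
As a concrete illustration of colour-based tracking, the following sketch follows an object outlined in the first frame through subsequent frames using histogram back-projection and mean-shift. This is only one of the possible combinations of cues mentioned above, and the parameters are assumptions chosen for readability.

```python
import cv2

def track_object(video_path: str, initial_box: tuple) -> list[tuple]:
    """Follow an object across frames with mean-shift on its colour histogram.
    initial_box is (x, y, w, h) in the first frame; returns one box per frame."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        return []
    x, y, w, h = initial_box
    roi = frame[y:y + h, x:x + w]
    hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    # Build the colour model used to locate the object in later frames.
    roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
    cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
    term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    boxes, box = [initial_box], initial_box
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        # Back-project the colour model and shift the window toward it.
        back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
        _, box = cv2.meanShift(back_proj, box, term_crit)
        boxes.append(box)
    cap.release()
    return boxes
```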

Once the required nodes have been segmented and combined with the associated linking information, this metadata must be incorporated with the original video for playback. The metadata is placed conceptually in layers, or tracks, on top of the video; this layered structure is then presented to the user for viewing and interaction. Thus, the display technology and the hypervideo player should not be neglected when creating hypervideo content. For example, efficiency can be gained by storing the geometry of areas associated with tracked objects only in certain keyframes, and allowing the player to interpolate between these keyframes, as developed for HotVideo. [15] Furthermore, the creators of VideoClix emphasize that its content plays back on standard players, such as QuickTime and Flash.[ citation needed ]
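
The keyframe-plus-interpolation idea can be illustrated with a short sketch: the clickable region of a tracked object is stored only at a few keyframe timestamps, and the player estimates the region at any intermediate time by interpolating between the surrounding keyframes. The linear scheme and data layout below are simplifying assumptions, not a description of HotVideo's actual implementation.

```python
def interpolate_region(keyframes: dict, t: float):
    """Given region geometry (x, y, w, h) stored only at keyframe timestamps,
    estimate the region at time t by linear interpolation between the
    surrounding keyframes (a simplified player-side strategy)."""
    times = sorted(keyframes)
    if not times or t < times[0] or t > times[-1]:
        return None                      # the object is not tracked at time t
    if t in keyframes:
        return keyframes[t]
    # Find the keyframes immediately before and after t.
    t0 = max(k for k in times if k < t)
    t1 = min(k for k in times if k > t)
    alpha = (t - t0) / (t1 - t0)
    return tuple(round(a + alpha * (b - a))
                 for a, b in zip(keyframes[t0], keyframes[t1]))

# Keyframes at 2.0 s and 4.0 s; the player asks where the region is at 3.0 s.
regions = {2.0: (100, 100, 50, 50), 4.0: (200, 120, 50, 50)}
print(interpolate_region(regions, 3.0))   # (150, 110, 50, 50)
```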

Commentary

User replies to video content have traditionally taken the form of text or image links that are not embedded in the video's playback sequence. Video hosting services such as Viddler allow such replies to be embedded both within the imagery of the video and within selected portions of the playback (via chosen time ranges on the progress slider); this feature has become known as "video comments" or "audio comments".

Related Research Articles

Hypertext – Text with references (links) to other text that the reader can immediately access

Hypertext is text displayed on a computer display or other electronic devices with references (hyperlinks) to other text that the reader can immediately access. Hypertext documents are interconnected by hyperlinks, which are typically activated by a mouse click, keypress set, or screen touch. Apart from text, the term "hypertext" is also sometimes used to describe tables, images, and other presentational content formats with integrated hyperlinks. Hypertext is one of the key underlying concepts of the World Wide Web, where Web pages are often written in the Hypertext Markup Language (HTML). As implemented on the Web, hypertext enables the easy-to-use publication of information over the Internet.

Hyperlink – Method of referencing visual computer data

In computing, a hyperlink, or simply a link, is a digital reference to data that the user can follow or be guided to by clicking or tapping. A hyperlink points to a whole document or to a specific element within a document. Hypertext is text with hyperlinks. The text that is linked from is known as anchor text. A software system that is used for viewing and creating hypertext is a hypertext system, and to create a hyperlink is to hyperlink. A user following hyperlinks is said to navigate or browse the hypertext.

In the context of the World Wide Web, deep linking is the use of a hyperlink that links to a specific, generally searchable or indexed, piece of web content on a website, rather than the website's home page. The URL contains all the information needed to point to a particular item. Deep linking is different from mobile deep linking, which refers to directly linking to in-app content using a non-HTTP URI.

This article presents a timeline of hypertext technology, including "hypermedia" and related human–computer interaction projects and developments from 1945 on. The term hypertext is credited to the author and philosopher Ted Nelson.

The Aspen Movie Map was a hypermedia system developed at MIT that enabled the user to take a virtual tour through the city of Aspen, Colorado. It was developed by a team working with Andrew Lippman in 1978 with funding from ARPA.

The Interactive Encyclopedia System

The Interactive Encyclopedia System, or TIES, was a hypertext system developed in the University of Maryland Human-Computer Interaction Lab by Ben Shneiderman in 1983. The earliest versions of TIES ran in DOS text mode, using the cursor arrow keys for navigating through information. A later version of HyperTIES for the Sun workstation was developed by Don Hopkins using the NeWS window system, with an authoring tool based on UniPress's Gosling Emacs text editor.

Hypermedia, an extension of the term hypertext, is a nonlinear medium of information that includes graphics, audio, video, plain text and hyperlinks. This designation contrasts with the broader term multimedia, which may include non-interactive linear presentations as well as hypermedia. It is also related to the field of electronic literature. The term was first used in a 1965 article written by Ted Nelson.

Inline linking is the use of a linked object, often an image, on one site by a web page belonging to a second site. One site is said to have an inline link to the other site where the object is located.

Hypertext fiction is a genre of electronic literature, characterized by the use of hypertext links that provide a new context for non-linearity in literature and reader interaction. The reader typically chooses links to move from one node of text to the next, and in this fashion arranges a story from a deeper pool of potential stories. Its spirit can also be seen in interactive fiction.

Intermedia was the third notable hypertext project to emerge from Brown University, after HES (1967) and FRESS (1969). Intermedia was started in 1985 by Norman Meyrowitz, who had been associated with earlier hypertext research at Brown. The Intermedia project coincided with the establishment of the Institute for Research in Information and Scholarship (IRIS). Some of the materials that came from Intermedia, authored by Meyrowitz, Nancy Garrett, and Karen Catlin, were used in the development of HTML.

J. Yellowlees Douglas

Jane Yellowlees Douglas is a pioneering author and scholar of hypertext fiction. She began writing about hypermedia in the late 1980s, very early in the development of the medium. Her 1993 fiction I Have Said Nothing was one of the first published works of hypertext fiction.

Web Modeling Language, abbreviated as WebML, is a visual notation and methodology for the design of data-intensive web applications. It provides a graphical means to define the specifics of web application design within a structured design process. This process can be enhanced with the assistance of visual design tools.

Hyperland is a 50-minute-long documentary film about hypertext and surrounding technologies. It was written by Douglas Adams and produced and directed by Max Whitby for BBC Two in 1990. It stars Douglas Adams as a computer user and Tom Baker, with whom Adams had already worked on Doctor Who, as a personification of a software agent.

Adaptive hypermedia (AH) uses hypermedia which is adaptive according to a user model. In contrast to regular hypermedia, where all users are offered the same set of hyperlinks, AH tailors what the user is offered based on a model of the user's goals, preferences and knowledge, thus providing links or content most appropriate to the current user.

KMS, an abbreviation of Knowledge Management System, was a commercial second generation hypermedia system, originally created as a successor for the early hypermedia system ZOG. KMS was developed by Don McCracken and Rob Akscyn of Knowledge Systems, a 1981 spinoff from the Computer Science Department of Carnegie Mellon University.

Electronic Document System

The Electronic Document System (EDS) was an early hypertext system – also known as the Interactive Graphical Documents (IGD) hypermedia system – focused on creation of interactive documents such as equipment repair manuals or computer-aided instruction texts with embedded links and graphics. EDS was a 1978–1981 research project at Brown University by Steven Feiner, Sandor Nagy and Andries van Dam.

History of hypertext

Hypertext is text displayed on a computer or other electronic device with references (hyperlinks) to other text that the reader can immediately access, usually by a mouse click or keypress sequence. Early conceptions of hypertext defined it as text that could be connected by a linking system to a range of other documents that were stored outside that text. In 1934, the Belgian bibliographer Paul Otlet developed a blueprint for links that telescoped out from hypertext electrically to allow readers to access documents, books, photographs, and so on, stored anywhere in the world.

Object co-segmentation – Type of image segmentation, jointly segmenting semantically similar objects in multiple images

In computer vision, object co-segmentation is a special case of image segmentation, which is defined as jointly segmenting semantically similar objects in multiple images or video frames.

Six Sex Scenes – 1996 hypertext novella by Adrienne Eisen

Six Sex Scenes is a hypertext novella created by Adrienne Eisen and published on the web in 1996.

References

  1. Smith, Jason; Stotts, David. An Extensible Object Tracking Architecture for Hyperlinking in Real-time and Stored Video Streams. Dept. of Computer Science, University of North Carolina at Chapel Hill.
  2. "Kinoautomat - Monoskop". monoskop.org. Retrieved 2024-03-16.
  3. Brøndmo, H.; Davenport, G. (1989). "Elastic Charles: A Hyper-Media Journal". MIT Interactive Cinema Group. Retrieved 2007-03-12.
  4. Liestøl, Gunnar. Aesthetic and Rhetorical Aspects of Linking Video in Hypermedia.
  5. Luis Francisco-Revilla (1998). "A Picture of Hypervideo Today". CPSC 610 Hypertext and Hypermedia. Center for the Study of Digital Libraries: Texas A&M University. Retrieved 2007-03-12.
  6. Hirata, K.; Hara, Y.; Shibata, N.; Hirabayashi, F. (1993). "Media-based navigation for hypermedia systems". Hypertext '93 Proceedings.
  7. "New Initiatives - HotVideo: The Cool Way to Link". Research News. IBM. Retrieved 2008-09-30.
  8. Storyspace
  9. Sawhney, Nitin Nick; Balcom, David; Smith, Ian. "HyperCafe: Narrative and Aesthetic Properties of Hypervideo". UK Conference on Hypertext.
  10. Tania Hershman (July 1997). "Internet Innovations from Israel". BYTE. Archived from the original on 2008-12-27. Retrieved 2008-10-01.
  11. "Ephyx Changes Name To Veon". Computergram International. 1998-04-29. Retrieved 2008-10-01.
  12. "From hypertext to hypervideo". The Economist. ISSN 0013-0613. Retrieved 2023-09-17.
  13. "NTSC Basics". Archived 2007-02-05 at the Wayback Machine.
  14. Khan, Sohaib; Shah, Mubarak. Object Based Segmentation of Video Using Color, Motion and Spatial Information. Computer Vision Laboratory, University of Central Florida.
  15. U.S. Patent 6912726

Further reading