Text annotation is the practice and the result of adding a note or gloss to a text, which may include highlights or underlining, comments, footnotes, tags, and links. Text annotations can include notes written for a reader's private purposes, as well as shared annotations written for the purposes of collaborative writing and editing, commentary, or social reading and sharing. In some fields, text annotation is comparable to metadata insofar as it is added post hoc and provides information about a text without fundamentally altering that original text. [1] Text annotations are sometimes referred to as marginalia, though some reserve this term specifically for hand-written notes made in the margins of books or manuscripts. Annotations have been found to be useful and help to develop knowledge of English literature.
Annotations can be private or socially shared, and hand-written or information technology-based. Annotation differs from note-taking in that annotations must be physically written on or added to the original piece itself. [2] This can mean writing within the page of a book or highlighting a line, or, if the piece is digital, adding a comment or a saved highlight or underline within the document. For information on annotation of Web content, including images and other non-textual content, see also Web annotation.
Text annotation may be as old as writing on media for which an additional copy could be produced with reasonable effort. It became a prominent activity around 1000 AD in Talmudic commentaries and Arabic rhetoric treatises. In the Medieval era, scribes who copied manuscripts often made marginal annotations that then circulated with the manuscripts and were thus shared with the community; sometimes annotations were copied over to new versions when such manuscripts were later recopied. [3]
With the rise of the printing press and the relative ease of circulating and purchasing individual (rather than shared) copies of texts, the prevalence of socially shared annotations declined and text annotation became a more private activity consisting of a reader interacting with a text. [3] Annotations made on shared copies of texts (such as library books) are sometimes seen as devaluing the text, or as an act of defacement. Thus, print technologies support the circulation of annotations primarily as formal scholarly commentary or textual footnotes or endnotes rather than marginal, handwritten comments made by private readers, though handwritten comments or annotations were common in collaborative writing or editing. [3]
Computer-based technologies have provided new opportunities for individual and socially shared text annotations that support multiple purposes, including readers' individual reading goals, learning, social reading, writing and editing, and other practices. Text annotation in Information Technology (IT) systems raises technical issues of access, linkage, and storage that are generally not relevant to paper-based text annotation, and thus research and development of such systems often addresses these areas. [1]
Text annotations can serve a variety of functions for both private and public reading and communication practices. In their article "From the Margins to the Center: The Future of Annotation," scholars Joanna Wolfe and Christine Neuwirth identify four primary functions that text annotations commonly serve in the modern era: (1) "facilitat[ing] reading and later writing tasks," which includes annotations that support reading for both personal and professional purposes; (2) "eavesdrop[ping] on the insights of other readers," which involves sharing of annotations; (3) "provid[ing] feedback to writers or promote communication with collaborators," which can include personal, professional, and education-related feedback; and (4) "call[ing] attention to topics and important passages," a function often served by scholarly annotations, footnotes, and call-outs. [3] Regarding the ways that annotations can support individual reading tasks, Catherine Marshall points out that the ways that readers annotate texts depend on the purpose, motivation, and context of reading. Readers may annotate to help interpret a text, to call attention to a section for future reference or reading, to support memory and recall, to help focus attention on the text as they read, to work out a problem related to the text, or to create annotations not specifically related to the text at all. [4]
Educational research in text annotation has examined the role that both private and shared text annotations can play in supporting learning goals and communication. Much educational research examines how students' private annotation of texts supports comprehension and memory; for example, research indicates that annotating texts causes more in-depth processing of information, which results in greater recall of information. [3] Because annotations are done while reading with a writing utensil in hand, readers are supposed to be more aware of their thoughts as they read. This means that readers are, along with making notes to help them remember or better understand the content, actively engaged during the activity and are therefore more receptive to the information when annotating a text. [2]
Other areas of educational research investigate the benefits of socially shared text annotations for collaborative learning, both for paper-based and IT-based annotation sharing. For example, studies by Joanna Wolfe have investigated the benefits of exposure to others' annotations on student readers and writers. In a 2000 study, Wolfe found that exposing students to others' annotations influenced their perceptions of the annotators, which in turn shaped their responses to the material and their written products. [5] In a later study, Wolfe found that viewing others' written comments on a paper text, especially pairs of annotations that present opposing responses to the text, can help students engage in the type of critical reading and stance-taking necessary for effective argumentative writing. [6]
While shared annotations can benefit individual readers, "since the 1920s, literacy theory has increasingly emphasized the importance of social factors in the development of literacy." [7] Thus, shared annotations may not only help a reader better understand the content of a particular text, but may also aid in the acquisition of literacy skills. For example, a mother may leave marks inside a book to draw the attention of her child to a particular theme or concept; thanks to the development of audio annotations, parents may now leave notes for children who are just starting to read and may struggle with textual annotations. [7]
More recent research in the effects of shared text annotations has focused on the learning applications for web-based annotation systems, some of which were developed based on design recommendations from studies outlined above. For example, Ananda Gunawardena, Aaron Tan, and David Kaufer conducted a pilot study to examine whether annotating documents in Classroom Salon, a web-based annotation and social reading platform, encouraged active reading, error detection, and collaboration in a computer science course at Carnegie Mellon University. This study suggested a correlation between students' overall performance in the course and their ability to identify errors in a text that they annotated in Classroom Salon; it also found that students were likely to change their annotations in response to annotations made by others in the course. [8]
Similarly, the web-based annotation tool HyLighter was used in a first-year writing course and shown to improve the development of students' mental models of texts, including supporting reading comprehension, critical thinking, and the ability to develop a thesis. The collaboration with peers and experts around a shared text improved these skills and brought the communities' understanding closer together. [9]
A meta-analysis of empirical studies into the higher-education uses of social annotation (SA) tools indicates such tools have been tested in several courses, among them English, sport psychology, and hypermedia. Studies have indicated that social annotation functions, including commenting, information sharing, and highlighting, can support instruction designed to foster collaborative learning and communication, as well as reading comprehension, metacognition, and critical analysis. Several studies indicated that students enjoyed using social annotation tools, and that their use improved motivation in the course. [10]
"Multi-sensory" annotations have also been found to help students retain information in the classroom and to help those who are learning a new language. [citation needed] Images can be placed next to or linked to words to give readers a better understanding of what a word means. [citation needed] The same can be done with an audio clip of how the word is pronounced and what it means. This is easier to do with technology, and to count specifically as an annotation the image or clip must be embedded within the referenced document. In physical copies of a text, however, a picture drawn next to a word is still a sensory annotation. This form of annotation furthers comprehension, particularly in the classroom, because it requires students to engage more of their brains to retain the information being given. [11]
Text annotations have long been used in writing and revision processes as a way for reviewers to suggest changes and communicate about a text. [3] In book publishing, for example, the collaboration of authors and editors to develop and revise a manuscript frequently involves exchanges of both in-line revisions or notes as well as marginal annotations. Similarly, copyeditors often make marginal annotations or notes that explain or suggest revisions or are directed at the author as questions or suggestions (commonly called "queries"). [12] Asynchronous collaborative writing and document development often depend on text annotations as a way not only to suggest revisions but also to exchange ideas during document development or to facilitate group decision making, though such processes are often complicated by the use of different communication technologies (such as phone calls or emails as well as document sharing) for distinct tasks. [13] Text annotations can also function to allow group or community members to communicate about a shared text, such as a doctor annotating a patient's chart. [3]
Much research into the functionality and design of collaborative IT-based writing systems, which often support text annotation, has occurred in the area of computer-supported cooperative work. [14]
In corpus linguistics, digital philology and natural language processing, annotations are used to explicate linguistic, textual or other features of a text (or other digital representations of natural language). In linguistics, annotations include comments and metadata; non-transcriptional annotations are also non-linguistic.
In these disciplines, annotations are the basis for quantitative research, empirical studies and the application of machine learning. Unlike annotations in the above-mentioned uses (which appear very sparsely), linguistic annotation usually requires that every element (token) within a text carries one or multiple annotations, and that complex relations between different annotations exist. A number of specialized formats (and tools) exist for this purpose; the following illustrates an annotation as used in the Universal Dependencies project. For clarity, the tab-separated values normally used have been replaced by a table.
[Fig. 2. Sample Universal Dependencies annotation, English Web Treebank, visualization by Brat]
Word number (ID) | Word form as written (FORM) | Lemma (LEMMA) | Universal part of speech (UPOS) | Language-specific part of speech (XPOS) | Morphological features (FEATS) | Syntactic head (HEAD, referring to word number) | Syntactic relation (DEPREL) | Enhanced dependencies (DEPS) | Miscellaneous (MISC) |
---|---|---|---|---|---|---|---|---|---|
1 | What | what | DET | WDT | PronType=Int | 2 | det | 2:det | _ |
2 | language | language | NOUN | NN | Number=Sing | 4 | nsubj:pass | 4:nsubj:pass | _ |
3 | is | be | AUX | VBZ | Number=Sing\|Person=3\|Tense=Pres\|VerbForm=Fin | 4 | aux:pass | 4:aux:pass | _ |
4 | talked | talk | VERB | VBN | VerbForm=Past | 0 | root | 0:root | _ |
5 | in | in | ADP | IN | _ | 6 | case | 6:case | _ |
6 | Iguazu | Iguazu | PROPN | NNP | Number=Sing | 4 | obl | 4:obl:in | SpaceAfter=No |
7 | ? | ? | PUNCT | . | _ | 4 | punct | 4:punct | _ |
A visualization of the example is given in Fig. 2. In addition to word-level annotations, the word (and the sentence, etc.) in this format can carry metadata.
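As a minimal sketch of how such token-level annotations might be read programmatically, the following Python fragment parses the sample sentence into one dictionary per token and then looks up the root and its direct dependents. The names `FIELDS`, `ROWS`, `parse_feats`, and `parse_sentence` are hypothetical and introduced only for this illustration; the sketch assumes plain integer word numbers and ignores multiword-token ranges and other refinements of the format.

```python
# Column names follow the CoNLL-U convention used by Universal Dependencies.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")

# Rows copied from the table above. Real files separate columns with tabs,
# so the spaces used here for readability are converted to tabs below.
ROWS = [
    "1 What what DET WDT PronType=Int 2 det 2:det _",
    "2 language language NOUN NN Number=Sing 4 nsubj:pass 4:nsubj:pass _",
    "3 is be AUX VBZ Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 4 aux:pass 4:aux:pass _",
    "4 talked talk VERB VBN VerbForm=Past 0 root 0:root _",
    "5 in in ADP IN _ 6 case 6:case _",
    "6 Iguazu Iguazu PROPN NNP Number=Sing 4 obl 4:obl:in SpaceAfter=No",
    "7 ? ? PUNCT . _ 4 punct 4:punct _",
]
SAMPLE = "\n".join(row.replace(" ", "\t") for row in ROWS)


def parse_feats(raw):
    """Turn 'Number=Sing|Person=3' into a dict; '_' means no value."""
    if raw == "_":
        return {}
    return dict(pair.split("=", 1) for pair in raw.split("|"))


def parse_sentence(block):
    """Parse one sentence of tab-separated token lines into dicts."""
    tokens = []
    for line in block.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comment lines
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(cols)}")
        token = dict(zip(FIELDS, cols))
        token["id"] = int(token["id"])
        token["head"] = int(token["head"])
        token["feats"] = parse_feats(token["feats"])
        tokens.append(token)
    return tokens


tokens = parse_sentence(SAMPLE)
root = next(t for t in tokens if t["head"] == 0)  # HEAD 0 marks the root
dependents = [t["form"] for t in tokens if t["head"] == root["id"]]
print(root["form"], dependents)
```

Because every token carries its own head reference, the dependency tree can be recovered from the rows alone, without any separate anchoring information.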
Various other annotation formats exist, often coupled with specific pieces of software for their creation, processing, or querying; see Ide et al. (2017) [15] for an overview. The Linguistic Annotation Wiki [16] describes tools and formats for creating and managing linguistic annotations. Selected problems and applications are also discussed under Overlapping markup and Web annotation. Aside from tab-separated values and other text formats, formats for linguistic annotations are often based on markup languages such as XML (and formerly SGML). More complex annotations may also employ graph-based data models and formats such as JSON-LD, e.g., in accordance with the Web Annotation standard.
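To give a rough idea of what a JSON-LD annotation in the style of the W3C Web Annotation Data Model can look like, the sketch below builds one as a Python dictionary and serializes it. The helper name `make_web_annotation`, the example URL, and the comment text are assumptions made for illustration; only the property names (`@context`, `type`, `body`, `target`, `selector`, `TextualBody`, `TextQuoteSelector`) are taken from the standard.

```python
import json


def make_web_annotation(source_url, exact, comment, prefix="", suffix=""):
    """Build a Web Annotation whose target is a quoted stretch of text."""
    selector = {"type": "TextQuoteSelector", "exact": exact}
    # prefix and suffix are optional context that helps disambiguate the quote
    if prefix:
        selector["prefix"] = prefix
    if suffix:
        selector["suffix"] = suffix
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {"source": source_url, "selector": selector},
    }


# Hypothetical document URL and comment, used only as sample input.
anno = make_web_annotation(
    "https://example.org/treebank/sentence-1",
    exact="Iguazu",
    comment="Proper noun; oblique dependent of 'talked'.",
    prefix="talked in ",
    suffix="?",
)
print(json.dumps(anno, indent=2))
```

The annotation is a small graph: the body and the target are separate nodes, and the selector describes which part of the source document the target covers.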
Linguistic annotation comes with an independent research tradition and its own terminology. [15] The target of an annotation is usually referred to as a 'markable' and the body of the annotation as the 'annotation'. The relation between annotation and markable is usually expressed in the annotation format itself (e.g., by placing annotations and text side by side), so that explicit anchors are not necessary.
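The difference between side-by-side annotation and explicit anchoring can be sketched in a few lines of Python. In the hypothetical function `to_standoff` below, the input pairs each markable with its annotation directly, and the output makes the anchors explicit as character offsets into a reconstructed text. Joining markables with single spaces is a simplifying assumption, not a property of any particular format.

```python
def to_standoff(pairs):
    """Convert (markable, annotation) pairs into a text plus offset-anchored spans."""
    parts = []
    spans = []
    offset = 0
    for markable, label in pairs:
        if parts:
            offset += 1  # account for the single space inserted between markables
        spans.append({"start": offset, "end": offset + len(markable), "label": label})
        parts.append(markable)
        offset += len(markable)
    return " ".join(parts), spans


# Side-by-side form: each annotation sits next to its markable.
pairs = [("What", "DET"), ("language", "NOUN"), ("is", "AUX")]
text, spans = to_standoff(pairs)

# Standoff form: each span points back into the text by offsets.
for span in spans:
    print(text[span["start"]:span["end"]], span["label"])
```

In the side-by-side form the pairing itself identifies the markable; in the standoff form the same information has to be carried by the `start` and `end` offsets.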
Research in the design and development of annotation systems uses specific terminology to refer to distinct structural components of annotations and also distinguishes among options for digital annotation displays.
The structural components of any annotation can be roughly divided into three primary elements: a body, an anchor, and a marker. The body of an annotation includes reader-generated symbols and text, such as handwritten commentary or stars in the margin. The anchor is what indicates the extent of the original text to which the body of the annotation refers; it may include circles around sections, brackets, highlights, underlines, and so on. Annotations may be anchored to very broad stretches of text (such as an entire document) or very narrow sections (such as a specific letter, word, or phrase). The marker is the visual appearance of the anchor, such as whether it is a grey underline or a yellow highlight. An annotation that has a body (such as a comment in the margin) but no specific anchor has no marker. [4]
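The three-part structure described above can be modelled as a small data structure. The following is a minimal sketch, assuming character offsets as the anchoring mechanism; the class and field names (`Anchor`, `Annotation`, `anchored_text`) are hypothetical and not drawn from any particular annotation system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Anchor:
    """Extent of the original text that an annotation refers to."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset


@dataclass(frozen=True)
class Annotation:
    body: str                        # reader-generated commentary or symbol
    anchor: Optional[Anchor] = None  # None for a free-floating margin note
    marker: Optional[str] = None     # visual appearance of the anchor

    def __post_init__(self):
        # A marker is the appearance of an anchor, so it cannot exist alone.
        if self.anchor is None and self.marker is not None:
            raise ValueError("an annotation without an anchor has no marker")

    def anchored_text(self, document: str) -> Optional[str]:
        """Return the stretch of text the anchor covers, if there is one."""
        if self.anchor is None:
            return None
        return document[self.anchor.start:self.anchor.end]


document = "Text annotation may be as old as writing."
highlighted = Annotation("check the dating", Anchor(0, 15), "yellow highlight")
margin_note = Annotation("reread this chapter")

print(highlighted.anchored_text(document))
print(margin_note.anchored_text(document))
```

An anchor spanning the whole document and one covering a single letter differ only in their offsets, which reflects the broad-to-narrow range of anchoring described above.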
IT-based annotation systems utilize a variety of display options for annotations, including:
Annotation interfaces may also allow highlighting or underlining, as well as threaded discussions. [3] [17] Sharing and communicating through annotations anchored to specific documents is sometimes referred to as anchored discussion. [6]
IT-based annotation systems include standalone and client-server systems. In the 1980s and 1990s, a number of such systems were built in the context of libraries, patent offices, and legal text processing. Their design led researchers to produce taxonomies of annotation forms. [18] Text annotation research has taken place at several institutions, including Xerox research centers in Palo Alto and Grenoble (France), the Hitachi Central Research Lab (in particular for annotation of patents), and in relation with the construction of the new French National Library between 1989 and 1995 [19] at the Institut de Recherche en Informatique de Toulouse and in the company AIS (Advanced Innovation Systems).
Annotation functionality has been present in text processing software for many years through inline notes displayed as pop-ups, footnotes, and endnotes; however, it is only recently that functionality for displaying annotations as marginalia has appeared in programs such as OpenOffice.org/LibreOffice Writer and Microsoft Word. Personal or standalone annotation systems include word processing software that supports embedded or anchored text annotations as well as Adobe Acrobat, which in addition to commenting allows highlights, stamps, and other types of markup. [1]
Tim Berners-Lee had already implemented the concept [20] of directly editing web documents in 1990 in WorldWideWeb, the first web browser, [21] but later ported versions removed this collaborative ability. [20] An early version of NCSA Mosaic in 1993 also included a collaborative annotation capability, [22] though it was quickly removed. Distributed authoring was later reintroduced as an extension in the form of Web Distributed Authoring and Versioning (WebDAV).
A different approach to distributed authoring consists of first gathering many annotations from a wide public and then integrating them all to produce a further version of a document. This approach was pioneered by Stet, the system put in place to gather comments on drafts of version 3 of the GNU General Public License. The system was built for that specific requirement, which it served well, but it was not configurable enough to be convenient for annotating other documents on the web. The co-ment system uses annotation interface concepts similar to Stet's, but it is based on an entirely new implementation, using Django/Python on the server side and various AJAX libraries such as jQuery on the client side. Both Stet and co-ment are licensed under the GNU Affero General Public License.
Since 2011, the non-profit Hypothes.is Project [23] has offered the free, open web annotation service Hypothes.is. The service supports annotation via a Chrome extension, bookmarklet, or proxy server, as well as integration into an LMS or CMS. Both webpages and PDFs can be annotated. Other web-based text annotation systems are collaborative software for distributed text editing and versioning, which also feature annotation and commenting interfaces.
Specialized Web-based text annotations exist in the context of scientific publication, either for refereeing or post-publication. The on-line journal PLoS ONE, published by the Public Library of Science, has developed its own Web-based system where scientists and the public can comment on published articles. The annotations are displayed as pop-ups with an anchor in the text.