Filename extension | .etx [lower-alpha 1] |
---|---|
Developed by | Ian Feldman |
Initial release | January 6, 1992 |
Type of format | Lightweight markup language |
Setext (Structure Enhanced Text) [2] is a lightweight markup language used to format plain text documents such as e-newsletters, Usenet postings, and e-mails. In contrast to some other markup languages (such as HTML), the markup is easily readable without any parsing or special software.
Setext was first introduced in 1991 by Ian Feldman for use in the TidBITS electronic newsletter.
Setext allows viewing of marked-up documents without special viewing software. When appropriate software is used, however, a rich text-style experience is available to the user.
Smaller documents are trivial to create in any text editor.
To prevent errors, most large setext publications are created using a markup language such as HTML or SGML and then converted. The setext document can then be distributed without the need for the recipient to use an HTML email or web viewer.
Multiple setext documents can be stored in the same file, similarly to how the mbox format can store multiple e-mail messages together.
It was initially announced [1] that multiple documents could be included in a single stream, separated by a special <end> tag serving as a document delimiter [lower-alpha 2]. After several months, it was clarified [3] that this tag was not an official part of setext, and that multiple documents should instead be delimited by $$ appearing at the end of a line of text.
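
The `$$` delimiter rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `split_setext_stream` is hypothetical, and two behaviours are assumptions not stated in the source, namely that trailing whitespace after `$$` is tolerated and that any text preceding `$$` on the same line still belongs to the document being closed.

```python
def split_setext_stream(stream: str) -> list[str]:
    """Split a text stream into documents at lines that end with '$$'.

    Assumptions (not confirmed by the setext announcements):
    - trailing whitespace after '$$' is ignored;
    - text before '$$' on the delimiter line is kept in the closing document.
    """
    documents = []
    current = []
    for line in stream.splitlines():
        stripped = line.rstrip()
        if stripped.endswith("$$"):
            # Keep whatever precedes the delimiter, then close the document.
            remainder = stripped[:-2].rstrip()
            if remainder:
                current.append(remainder)
            documents.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Text after the last delimiter forms a final document if it is not blank.
    if any(part.strip() for part in current):
        documents.append("\n".join(current))
    return documents


if __name__ == "__main__":
    sample = "First\n=====\nbody one $$\nSecond\n======\nbody two\n"
    for number, document in enumerate(split_setext_stream(sample), start=1):
        print(f"--- document {number} ---")
        print(document)
```

Splitting line by line rather than on the raw string `$$` keeps occurrences of `$$` in the middle of a line from being treated as delimiters.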
Regardless of the number of documents stored in the same file, basic metadata can be stored about any or all of them by using the subject-tt tag syntax.
The following are the ten most common of the 16 different setext tags. [4] [5] [lower-alpha 3]
Name [lower-alpha 6] | setext pattern | Example [lower-alpha 7] | Comments |
---|---|---|---|
title-tt | Title [newline] ===== | This is a long title [newline] ==================== | A distinct title identified by the text, maximum one per setext. Must start at the beginning of the line. |
subhead-tt | Subhead [newline] ------- | Subheading One [newline] -------------- | A distinct subheading identified by the text, zero or more per text. Must start at the beginning of the line. See note in title-tt about handling. |
indent-tt | 66-char lines indented by 2 spaces | First paragraph... ...more of paragraph. [blank line] Next paragraph... | Lines undented and unfolded (longer lines are generally tolerated by most parsers). This is primary body text, generally plain undented in emails, etc. currently. |
bold-tt | **[multi ]word** | This is **very important**... | One or more bold words, generally *word* or **word** in emails |
italic-tt | ~word~ | This is an ~italic~ word. | A single, italicized word; a multi-word form was not officially specified due to "visual-clarity reasons". |
underline-tt | [_multi ]word_ | This is _underlined text_. This is _underlined_text_. | One or more underlined words. Displayed in a (user) selected style, preferably with underlining, except in browsers where underlining corresponds to hot links. |
hot-tt | [multi_]word_ | This is a hot_word_. | Used to mark notes and URLs [lower-alpha 8] [lower-alpha 9] |
include-tt | >[space] [text] | > This is quoted text... [newline] > ...more... | Displayed in a user selected style, preferably monospaced with the leading ">" |
bullet-tt | *[space] [text] | * Item 1 that is... [newline] ...really long [newline] * Item 2 | Displayed in bullet or list format. |
href-tt | ^.. _hot_word URL | ^.. _Wikipedia_home_page https://wikipedia.org | (Linked in the text with a hot-tt as Wikipedia_home_page_.) These 'link definitions' are commonly placed at the end of a paragraph/section, or at the very end of the setext document. [lower-alpha 9] |
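
As an illustration of how a few of the tags in the table fit together, the following Python sketch recognises title-tt, subhead-tt and href-tt lines and then resolves hot-tt words against the collected link definitions. It is a simplified sketch under stated assumptions rather than a complete setext parser: the function names and regular expressions are hypothetical, the leading `^` in the href-tt pattern is read as "start of line", an underline of any length made only of `=` or `-` is accepted, and the output form `text <URL>` is an arbitrary choice.

```python
import re

# Assumption: '^' in the href-tt pattern means "at the start of the line".
HREF_DEF = re.compile(r"^\.\. _(\S+) (\S+)\s*$")
# Assumption: any run of '=' (title) or '-' (subhead) counts as an underline.
TITLE_RULE = re.compile(r"^=+\s*$")
SUBHEAD_RULE = re.compile(r"^-+\s*$")
# A hot word: words joined by underscores, followed by one trailing underscore.
# The lookbehind keeps _underlined_text_ from being read as a hot word.
HOT_WORD = re.compile(r"(?<!\w)([A-Za-z0-9]+(?:_[A-Za-z0-9]+)*)_(?!\w)")


def parse_setext(text: str) -> dict:
    """Collect the title, subheads and link definitions of one document."""
    lines = text.splitlines()
    result = {"title": None, "subheads": [], "links": {}}
    for index, line in enumerate(lines):
        definition = HREF_DEF.match(line)
        if definition:
            result["links"][definition.group(1)] = definition.group(2)
            continue
        is_rule = TITLE_RULE.match(line) or SUBHEAD_RULE.match(line)
        # Headings must be non-blank and start at the beginning of the line.
        if not line.strip() or line[0].isspace() or is_rule:
            continue
        following = lines[index + 1] if index + 1 < len(lines) else ""
        if TITLE_RULE.match(following) and result["title"] is None:
            result["title"] = line.strip()  # at most one title per document
        elif SUBHEAD_RULE.match(following):
            result["subheads"].append(line.strip())
    return result


def resolve_hot_words(text: str, links: dict) -> str:
    """Replace each defined hot word with 'words <URL>'; leave others alone."""
    def replace(match):
        name = match.group(1)
        url = links.get(name)
        if url is None:
            return match.group(0)
        return f"{name.replace('_', ' ')} <{url}>"
    return HOT_WORD.sub(replace, text)


if __name__ == "__main__":
    document = (
        "Demo Title\n==========\n\nIntro\n-----\n\n"
        "  See the Wikipedia_home_page_ for more.\n\n"
        ".. _Wikipedia_home_page https://wikipedia.org\n"
    )
    parsed = parse_setext(document)
    print(parsed)
    print(resolve_hot_words("See the Wikipedia_home_page_ for more.", parsed["links"]))
```

Collecting the href-tt definitions first and resolving hot words afterwards reflects the table's note that link definitions may appear at the end of a section or of the whole document, after the text that refers to them.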
By default all properly setext-ized files will have an ".etx" or ".ETX" suffix. This stands for an "emailable/enhanced text". [1]
Other lightweight markup languages (inspired by Setext):
Ctrl-\) C0 control character but it proved too visually distracting and so was removed before setext was finalized.
_hot_word) defines a hyperlink or reference, whereas a hot-tt 'hot word' suffixed with an underscore (i.e., hot_word_) references that hyperlink/reference by name in the body of the text. (Before the Web was ubiquitous, what are now commonly known as 'hyperlinks' were then commonly called 'hot links', especially in 'CD-ROM era' software such as HyperCard and Macromedia Director and in games such as Myst.)