Typographic approximation

A typographic approximation is a replacement of an element of the writing system (usually a glyph) with another glyph or glyphs. The replacement may be a nearly homographic character, a digraph, or a character string. An approximation is different from a typographical error in that an approximation is intentional and aims to preserve the visual appearance of the original. The concept of approximation also applies to the World Wide Web and other forms of textual information available via digital media, though usually at the level of characters, not glyphs.

Historically, the main cause of typographic approximation was the small number of glyphs (such as letterforms and symbols) available for printing. In the age of the World Wide Web and digital typesetting, especially after the advent of Unicode and an enormous number of computer fonts, typographic approximations are usually caused either by people's limited ability to distinguish and find the needed symbols or by inadequate replacement patterns in word processors,[1] rather than by a lack of available characters.

Normative: 3 × 2 − 1
Approximated: 3 x 2 - 1
An ASCII approximation of an arithmetical expression

Typewriter and line printer approximations

Merger of characters

On typewriters, several characters were merged because of the limited size of the glyph repertoire. Several modern computing characters arose from the merger of different symbols, such as the "typewriter" apostrophe, ', which can denote an apostrophe proper, ’, a single quotation mark, or the prime symbol.

Non-spacing modifiers

Some typewriters have non-spacing keys for use as diacritical marks. After the typist presses, say, the acute accent ◌́, the carriage does not advance. This allows the typist to overstrike the mark with a spacing letter, say, e, and obtain é, an accented letter. Because of the geometrical restrictions of a monospaced font, the result was not always perfect. For example, overstriking was unlikely to be a feasible method of producing uppercase accented letters, such as É.

Overstrike was used on line printers for the same purpose. This contributed to the standardization of such characters as U+0060 ` (grave accent).

Overstrike of the same letter was used to simulate boldface letters on line printers.
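The overstrike technique described above can be sketched in a few lines. This is only an illustration under stated assumptions: it supposes an output device that honours the backspace control character (U+0008) by stepping the print position back one column, and the helper names `overstrike`, `bold`, and `render_columns` are hypothetical, not taken from any printer or library. Unlike a typewriter's non-spacing key, which strikes the mark first, the sketch prints the base letter, steps back, and then strikes the mark.

```python
BACKSPACE = "\b"  # U+0008; assumed to move the print position one column back


def overstrike(base, mark):
    """Print `base`, step back one column, then print `mark` on top of it."""
    return base + BACKSPACE + mark


def bold(text):
    """Simulate boldface by striking every non-space character twice."""
    return "".join(ch if ch.isspace() else overstrike(ch, ch) for ch in text)


def render_columns(stream):
    """Replay a stream and list the characters struck in each column."""
    columns = []
    pos = 0
    for ch in stream:
        if ch == BACKSPACE:
            pos = max(0, pos - 1)  # never move left of the first column
            continue
        if pos == len(columns):
            columns.append([])     # first strike in a new column
        columns[pos].append(ch)
        pos += 1
    return columns


# An approximated accented letter: "e" and an ASCII apostrophe share a column.
print(render_columns(overstrike("e", "'")))
# Boldface approximation: each letter is struck twice in the same column.
print(render_columns(bold("Hi")))
```

`render_columns` exists only to make the effect visible on a screen, where a terminal would normally show just the last character struck in each column.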

ASCII approximations

Figure: DOS PrintScreen approximations. An ASCII approximation may be ugly, but it still gives some representation of several symbols. Replacements of non-ASCII characters (other than the default "*") are highlighted in yellow.

The US-ASCII character set and other variants of ISO/IEC 646 contain 95 graphic characters. This is comparable with a (Latin-script) typewriter and insufficient for quality typography. However, the high availability and robustness of the ASCII character encoding prompted computer users to invent ASCII substitutes for various glyphs.

The following ASCII characters are used to approximate certain characters. Note that many Latin letters are homographic to letters of other scripts; however, those Latin letters are not listed below.
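As a minimal sketch of how such substitutions can be applied mechanically, the snippet below maps a handful of non-ASCII characters to ASCII stand-ins. The table `ASCII_APPROXIMATIONS` and the function `approximate_ascii` are hypothetical names, and the table contains only pairs mentioned elsewhere in this article (the multiplication and minus signs, and the characters merged into the typewriter apostrophe); it is not a complete or authoritative list. Characters without an entry fall back to "*", an assumption borrowed from the default shown in the figure above.

```python
# Hypothetical lookup table; only substitutions mentioned in this article.
ASCII_APPROXIMATIONS = {
    "\u00d7": "x",  # multiplication sign -> Latin letter x
    "\u2212": "-",  # minus sign -> hyphen-minus
    "\u2019": "'",  # apostrophe proper / closing single quotation mark
    "\u2018": "'",  # opening single quotation mark
    "\u2032": "'",  # prime
}


def approximate_ascii(text, default="*"):
    """Replace each non-ASCII character by its stand-in, or by `default`."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # already ASCII, keep as is
        else:
            out.append(ASCII_APPROXIMATIONS.get(ch, default))
    return "".join(out)


print(approximate_ascii("3 \u00d7 2 \u2212 1"))  # prints: 3 x 2 - 1
```

A dictionary lookup keeps the sketch short; the mapping is lossy, since several distinct characters collapse into the same ASCII character and cannot be recovered afterwards.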

Approximation of non-glyphs

There exist various approximations of typographic alignment. For example, justification may be emulated by inserting extra spaces between words, and flush-right alignment may be achieved by padding with leading spaces.
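Both approximations can be sketched for a monospaced line of a given width. The function names `flush_right` and `justify` are hypothetical, and the policy of giving any leftover spaces to the leftmost gaps is an assumption made for this example; other distributions are equally plausible.

```python
def flush_right(line, width):
    """Approximate flush-right alignment by padding with leading spaces."""
    return line.rjust(width)


def justify(line, width):
    """Approximate justification by widening the gaps between words."""
    words = line.split()
    if len(words) < 2:
        return line.ljust(width)       # nothing to stretch
    gaps = len(words) - 1
    total_spaces = width - sum(len(w) for w in words)
    if total_spaces < gaps:
        return " ".join(words)         # line too long; leave it unjustified
    base, extra = divmod(total_spaces, gaps)
    parts = []
    for i, word in enumerate(words[:-1]):
        # Assumption: the first `extra` gaps each receive one more space.
        parts.append(word + " " * (base + (1 if i < extra else 0)))
    parts.append(words[-1])
    return "".join(parts)


print(repr(flush_right("end", 6)))
print(repr(justify("to be or not", 16)))
```

The result only looks aligned in a monospaced font, which is why this counts as an approximation of alignment and not the real thing.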

There are various techniques for approximating tables (historically used on text-mode displays), such as box-drawing characters.
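A table approximated with box-drawing characters can be sketched as follows. The function `draw_table` is a hypothetical helper written for this illustration; it uses the light box-drawing characters from the Unicode range U+2500 to U+253C and assumes a monospaced display. The `ascii_only` option, which falls back to `+`, `-`, and `|`, is an addition of this sketch and not something described in the text.

```python
def draw_table(rows, ascii_only=False):
    """Render rows of cells as a ruled table in a monospaced font."""
    if ascii_only:
        h, v = "-", "|"
        top = mid = bot = ("+", "+", "+")
    else:
        h, v = "\u2500", "\u2502"             # horizontal and vertical lines
        top = ("\u250c", "\u252c", "\u2510")  # top corners and T-junction
        mid = ("\u251c", "\u253c", "\u2524")  # row separator pieces
        bot = ("\u2514", "\u2534", "\u2518")  # bottom corners and T-junction
    # Each column is as wide as its longest cell.
    widths = [max(len(str(cell)) for cell in col) for col in zip(*rows)]

    def rule(left, cross, right):
        return left + cross.join(h * (w + 2) for w in widths) + right

    def line(row):
        cells = (f" {str(cell).ljust(w)} " for cell, w in zip(row, widths))
        return v + v.join(cells) + v

    out = [rule(*top)]
    for i, row in enumerate(rows):
        if i:
            out.append(rule(*mid))  # separator between consecutive rows
        out.append(line(row))
    out.append(rule(*bot))
    return "\n".join(out)


print(draw_table([["Normative", "3 \u00d7 2 \u2212 1"],
                  ["Approximated", "3 x 2 - 1"]]))
```

The sketch assumes that every row has the same number of cells and that every character occupies exactly one column.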

References

  1. Phin, Christopher (2008-03-29). "Ten typographic mistakes everyone makes". Archived from the original on May 3, 2012. Retrieved August 17, 2015.