Underscore | _ ◌̲ |
---|---|
In Unicode | U+005F _ LOW LINE; U+0332 ◌̲ COMBINING LOW LINE |
Graphical variants | U+FF3F ＿ FULLWIDTH LOW LINE |
Different from | U+0331 ◌̱ COMBINING MACRON BELOW |
See also | U+2017 ‗ DOUBLE LOW LINE; U+2381 ⎁ CONTINUOUS UNDERLINE SYMBOL; U+2382 ⎂ DISCONTINUOUS UNDERLINE SYMBOL; U+FE33 ︳ PRESENTATION FORM FOR VERTICAL LOW LINE |
An underscore or underline is a line drawn under a segment of text. In proofreading, underscoring is a convention that says "set this text in italic type", traditionally used on manuscript or typescript as an instruction to the printer. Its use to add emphasis in modern finished documents is generally avoided. [1]
The (freestanding) underscore character, _, also called a low line, or low dash, originally appeared on the typewriter so that underscores could be typed. To produce an underscored word, the word was typed, the typewriter carriage was moved back to the beginning of the word, and the word was overtyped with the underscore character.
In modern usage, underscoring is achieved with a markup language, with the Unicode combining low line, or as a standard facility of word processing software. The free-standing underscore character is used to indicate word boundaries in situations where spaces are not allowed, such as in computer filenames, email addresses, and Internet URLs, for example Mr_John_Smith. It is also used as a proofreader's mark to indicate that text should be underscored or italicised when typeset: for instance, _thus_ is to be rendered as the word "thus" set either underlined or in italics.
The combining diacritic, ◌̱ (macron below), is similar to the combining low line but is shorter. The difference between "macron below" and "low line" is that the latter results in an unbroken underline when it is run together: compare a̱ḇc̱ and a̲b̲c̲ (only the latter should appear as "abc" with one continuous underline). [2] [a]
In a manuscript (or typescript) to be typeset, various forms of underlining (see below) were conventionally used to indicate that text should be set in special type such as italics, part of a procedure known as markup. In printed documents underlining is generally avoided; italics or small caps are often used instead, or (especially in headings) capitalization, bold type or greater body height (font size). [1] Underlining may still be seen in display work. [3]
A series of underscores (like __________ ) may be used to reserve a blank space in text that is later to be filled in by hand, such as on a paper form. It is also sometimes used to create a horizontal line; other symbols with similar glyphs, such as hyphens and dashes, are also used for this purpose.
In German, Slovene and some other Slavic languages, the underscore has recently gained prominence as the punctuation used to form gender-neutral suffixes in gendered nouns and other parts of speech. [4]
The underscore is also used in modern editions of Spanish vocal sheet music to indicate elision, instead of the breve below (U+032E ◌̮ COMBINING BREVE BELOW), which is less convenient to input on a computer.
In mathematical notations, underscores are sometimes used in the following contexts:
In web browsers, default settings typically distinguish hyperlinks by underlining them (and usually changing their color), but both users and websites can change the settings to make some or all hyperlinks appear differently (or even without distinction from normal text). [1]
As early output devices (teleprinters, CRTs and line printers) could not produce more than one character at a location, it was not possible to underscore text, so early encodings such as ITA2 and the first versions of ASCII had no underscore. IBM's EBCDIC character-coding system, introduced in 1964, added the underscore, which IBM referred to as the "break character". IBM's report on NPL (the early name of what is now called PL/I) leaves the character set undefined, but specifically mentions the break character, and gives RATE_OF_PAY as an example identifier. [7] By 1967 the underscore had spread to ASCII, [8] replacing the similarly shaped left-arrow character, ← (see also: PIP). C, developed at Bell Labs in the early 1970s, allowed the underscore in identifiers. [9]
The underscore predates the existence of lower-case letters in many systems, so it often had to be used to make multi-word identifiers, since camelCase (see below) was not available.
Inserting underscores between words is a very common way to make a "multi-word" identifier in languages that cannot handle spaces in identifiers. This convention is known as "snake case"; the other popular method is camelCase, where capital letters show where the words start.
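The two conventions can be converted into one another mechanically. The following is a minimal sketch in Python; the function names `snake_to_camel` and `camel_to_snake` are hypothetical, and the code assumes simple ASCII identifiers without runs of capitals (acronyms such as "URL" would be split letter by letter).

```python
import re


def snake_to_camel(name: str) -> str:
    """Convert snake_case to camelCase: drop each "_" and capitalize the next word."""
    head, *rest = name.split("_")
    return head + "".join(word.capitalize() for word in rest)


def camel_to_snake(name: str) -> str:
    """Convert camelCase to snake_case: insert "_" before each capital, then lower-case."""
    # The lookbehind (?<!^) avoids inserting an underscore at the very start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


print(snake_to_camel("rate_of_pay"))  # rateOfPay
print(camel_to_snake("rateOfPay"))    # rate_of_pay
```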
An underscore as the first character in an identifier is often used to indicate an internal implementation that is not considered part of the API and should not be called by code outside that implementation. In Dart, all private properties of classes must start with an underscore; this usage is also common in other languages such as C++ even though those provide keywords to indicate that members are private. It is extensively used to hide variables and functions used for implementations in header files. In fact, the use of a single underscore for this became so common that C compilers had to standardize on a double leading underscore (for instance __DATE__) for actual built-in variables to avoid conflicts with the ones in header files. PHP "reserves all function names starting with __ as magical." [10]
Python uses names that both start and end with double underscores (so-called "dunder methods", as in double underscore) for magic members used for purposes such as operator overloading and reflection. Names that start but do not end with a double underscore denote private member variables of classes; these are mangled in a manner that prevents them from colliding with members of derived classes unless the classes have the same name (__bar in class Foo will be mangled to _Foo__bar). By convention, members starting with a single underscore are considered private or protected, although this convention only has an inherent effect for modules, where import * statements by default import all names that do not start with an underscore, unless an export list is explicitly defined by the module.
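The mangling rule can be observed directly. The following sketch reuses the names Foo and __bar from the text; the `Child` class, the `_hint` attribute and the stored values are illustrative assumptions.

```python
class Foo:
    def __init__(self):
        self._hint = "internal by convention"  # single underscore: convention only
        self.__bar = "foo value"               # stored under the mangled name _Foo__bar

    def __len__(self):                         # "dunder" method that backs len()
        return 3


class Child(Foo):
    def __init__(self):
        super().__init__()
        self.__bar = "child value"             # stored as _Child__bar, so no collision


foo = Foo()
print(foo._Foo__bar)          # foo value
print(hasattr(foo, "__bar"))  # False: the unmangled name does not exist
print(len(foo))               # 3

child = Child()
print(sorted(n for n in vars(child) if n.endswith("__bar")))
# ['_Child__bar', '_Foo__bar']
```

Both classes keep their own copy of the attribute because each name is prefixed with the class in which it was written.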
A variable named with just an underscore often has special meaning. $_ or _ is the previous command or result in many interactive shells, such as those of Python, Ruby, and Perl. In Perl, @_ is a special array variable that holds the arguments to a function. In Clojure, it indicates an argument whose value will be ignored. [11]
In some languages with pattern matching, such as Prolog, Standard ML, Scala, OCaml, Haskell, Erlang, and the Wolfram Language, the pattern _ matches any value, but does not perform binding.
The ASCII underscore character can be inserted with the numeric character references &#95; or &#x5F; (or with the named entities &lowbar; or &UnderBar;).
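As a quick check, Python's standard `html.unescape` function can be used to confirm that several HTML character references decode to U+005F. The list below (two numeric references and the two HTML5 named references `&lowbar;` and `&UnderBar;`) is chosen for illustration and is an assumption, not a quotation from this article.

```python
import html

# Decimal, hexadecimal, and the two HTML5 named references for U+005F.
references = ["&#95;", "&#x5F;", "&lowbar;", "&UnderBar;"]

decoded = [html.unescape(ref) for ref in references]
print(decoded)  # ['_', '_', '_', '_']
```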
HTML has a presentational element <u> that was originally used to underline text; this usage was deprecated in HTML4 in favor of the CSS style {text-decoration: underline}. [12] In HTML5, the tag reappeared but its meaning was changed significantly: it now "represents a span of inline text which should be rendered in a way that indicates that it has a non-textual annotation". [12] This facility is intended, for example, to provide a red wavy (or wiggly) underline that flags spelling errors at input time but is not embedded in any stored file (unlike an emphasis mark, which would be). Other styles are also available: doubled, dotted, and dashed. [13]
The elements may also exist in other markup languages, such as MediaWiki. The Text Encoding Initiative (TEI) provides an extensive selection of related elements for marking editorial activity (insertion, deletion, correction, addition, etc.).
Unicode has a free-standing underscore _ at U+005F, inherited from ASCII, which is a legacy of the typewriter practice of underlining using backspace and overtype. Modern practice uses the combining diacritic U+0332◌̲COMBINING LOW LINE that results in an underline when run together: u̲n̲d̲e̲r̲l̲i̲n̲e̲. Unicode also has U+0333◌̳COMBINING DOUBLE LOW LINE. In addition, there are single line and double line versions of the combining macron below, a diacritic that applies to single letters only. [2]
Effect | Using combining diacritic | Using html span style | Using macron below |
---|---|---|---|
single underline | a̲b̲c̲d̲e̲f̲g̲h̲i̲j̲k̲l̲m̲n̲o̲p̲q̲r̲s̲t̲u̲v̲w̲x̲y̲z̲0̲1̲2̲3̲4̲5̲6̲7̲8̲9̲ | abcdefghijklmnopqrstuvwxyz0123456789 | a̱ḇc̱ḏe̱ |
double underline | a̲̲b̲̲c̲̲d̲̲e̲̲f̲̲g̲̲h̲̲i̲̲j̲̲k̲̲l̲̲m̲̲n̲̲o̲̲p̲̲q̲̲r̲̲s̲̲t̲̲u̲̲v̲̲w̲̲x̲̲y̲̲z̲̲0̲̲1̲̲2̲̲3̲̲4̲̲5̲̲6̲̲7̲̲8̲̲9̲̲ | abcdefghijklmnopqrstuvwxyz0123456789 | |
single underline | A̲B̲C̲D̲E̲F̲G̲H̲I̲J̲K̲L̲M̲N̲O̲P̲Q̲R̲S̲T̲U̲V̲W̲X̲Y̲Z̲ | ABCDEFGHIJKLMNOPQRSTUVWXYZ | A̱ḆC̱ḎE̱ |
double underline | A̲̲B̲̲C̲̲D̲̲E̲̲F̲̲G̲̲H̲̲I̲̲J̲̲K̲̲L̲̲M̲̲N̲̲O̲̲P̲̲Q̲̲R̲̲S̲̲T̲̲U̲̲V̲̲W̲̲X̲̲Y̲̲Z̲̲ | ABCDEFGHIJKLMNOPQRSTUVWXYZ | |
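The "combining diacritic" column can be produced by following every character with U+0332. The sketch below shows one way to do this in Python; the function name `underline` is hypothetical, and the code assumes input that does not already contain combining marks.

```python
import unicodedata

COMBINING_LOW_LINE = "\u0332"
COMBINING_DOUBLE_LOW_LINE = "\u0333"


def underline(text: str, mark: str = COMBINING_LOW_LINE) -> str:
    """Append a combining mark after every character of text."""
    return "".join(ch + mark for ch in text)


single = underline("abc")
print(len(single))                   # 6: each letter is followed by one mark
print(unicodedata.name(single[1]))   # COMBINING LOW LINE

double = underline("abc", COMBINING_DOUBLE_LOW_LINE)
print(unicodedata.name(double[1]))   # COMBINING DOUBLE LOW LINE
```

Whether the result displays as an unbroken line depends on the font and the rendering software.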
In plain-text applications, including plain-text e-mails where emphasis markup is not possible, the desired emphasis is often indicated by surrounding words with underscore characters. For example, "You must use _emulsion_ paint on the ceiling".
Some applications will automatically add emphasis to text manually bracketed by underscores, either by underlining or by italicizing it (e.g. _string_ may render as the word "string" either underlined or in italics).
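A rough sketch of how such an application might detect bracketed text follows. The regular expression, the function name `emphasize` and the choice of HTML `<em>` tags as output are all assumptions made for illustration; real applications use their own, usually more elaborate, rules.

```python
import re

# An opening "_" must not follow a word character and a closing "_" must not
# precede one, so identifiers such as rate_of_pay are left untouched.
EMPHASIS = re.compile(r"(?<!\w)_(?=\S)(.+?)(?<=\S)_(?!\w)")


def emphasize(text: str) -> str:
    """Wrap each _bracketed_ span in <em>...</em>."""
    return EMPHASIS.sub(r"<em>\1</em>", text)


print(emphasize("You must use _emulsion_ paint on the ceiling"))
# You must use <em>emulsion</em> paint on the ceiling
print(emphasize("rate_of_pay"))
# rate_of_pay
```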
Underline (typically red or wavy or both) is often used by spell checkers (and grammar checkers) to denote misspelled or otherwise incorrect text.
Depending on local conventions, the following kinds of underlines may be used inline on manuscripts to indicate the special typefaces to be used: [14] [15]
In Chinese, the underline is a little-used punctuation mark for proper names (simplified Chinese: 专名号; traditional Chinese: 專名號; pinyin: zhuānmínghào; literally "proper name mark", used for personal and geographic names). Its meaning is somewhat akin to capitalization in English, and it should never be used for emphasis, even though the influence of English computing makes the latter sometimes occur. A wavy underline (simplified Chinese: 书名号; traditional Chinese: 書名號; pinyin: shūmínghào; literally, "book title mark") serves a similar function, but marks names of literary works instead of proper names. [16]
In the case of two or more adjacent proper names, each individual proper name is separately underlined so there should be a slight gap between the underlining of each proper name.
Don't underline. Ever. It's ugly and it makes text harder to read. Underlining is another dreary typewriter habit... a workaround for shortcomings in typewriter technology.
Spacing Overscores and Underscores. U+203E OVERLINE is the above-the-line counterpart to U+005F low line. It is a spacing character, not to be confused with U+0305 COMBINING OVERLINE. As with all overscores and underscores, a sequence of these characters should connect in an unbroken line. The overscoring characters also must be distinguished from U+0304 COMBINING MACRON, which does not connect horizontally in this way.