^ | |
---|---|
Caret | |
In Unicode | U+005E ^ CIRCUMFLEX ACCENT (`&Hat;`) |
Different from | U+2038 ‸ CARET, U+02C6 ˆ MODIFIER LETTER CIRCUMFLEX ACCENT, U+028C ʌ LATIN SMALL LETTER TURNED V, U+2227 ∧ LOGICAL AND, U+039B Λ GREEK CAPITAL LETTER LAMDA |
Related | |
See also | U+FF3E ＾ FULLWIDTH CIRCUMFLEX ACCENT |
Caret is the name used familiarly for the character ^ provided on most QWERTY keyboards by typing ⇧ Shift+6. The symbol has a variety of uses in programming and mathematics. The name "caret" arose from its visual similarity to the original proofreader's caret, ‸, a mark used in proofreading to indicate where a punctuation mark, word, or phrase should be inserted into a document. The ASCII standard (X3.64.1977) calls it a "circumflex"; [3] the Unicode standard calls it a "circumflex accent", although it is no longer practicable for that purpose.
On typewriters designed for languages that routinely use diacritics (accent marks), there are two possible ways to type these: keys can be dedicated to precomposed characters (with the diacritic included); alternatively a dead key mechanism can be provided. With the latter, a mark is made when a dead key is typed but, unlike normal keys, the paper carriage does not move on and thus the next letter to be typed is printed under the accent. The ^ symbol was originally provided in typewriters and computer printers so that circumflex accents could be overprinted on letters (as in ô or ŵ).
The incorporation of the circumflex symbol into ASCII is a consequence of this prior existence on typewriters: this symbol did not exist independently as a type or hot-lead printing character. The original 1963 version of the ASCII standard used the code point 0x5E for an up-arrow ↑. However, the 1965 ISO/IEC 646 standard defined code point 0x5E as one of five available for national variation, [lower-alpha 1] with the circumflex ^ diacritic as the default and the up-arrow as one of the alternative uses. [4] In 1967, the second revision of ASCII followed suit. [5]
Overprinting to add an accent mark was not always supported well by printers, and was almost never possible on video terminals. Instead, precomposed characters were eventually created to show the accented letters. [lower-alpha 2] The freestanding circumflex (which had come to be called a caret) quickly became reused for many other purposes, such as in computer languages and mathematical notation. As the mark did not need to fit above a letter any more, it became larger in appearance such that it can no longer be used to overprint an accent in most fonts. [6]
In Unicode the symbol is encoded as U+005E ^ CIRCUMFLEX ACCENT; in HTML it may be used directly or inserted with the character reference `&Hat;`. The combining character for use as a diacritic is U+0302 ◌̂ COMBINING CIRCUMFLEX ACCENT, although precomposed characters (like U+00E2 â LATIN SMALL LETTER A WITH CIRCUMFLEX) are available for most European languages.
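The relationship between the combining and precomposed forms can be checked with a short Python sketch. It uses only the standard `unicodedata` module; the variable names are illustrative and not part of any standard.

```python
import unicodedata

# "a" followed by U+0302 COMBINING CIRCUMFLEX ACCENT (two code points)
decomposed = "a\u0302"
# U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX (one code point)
precomposed = "\u00e2"

# NFC normalization composes the pair into the precomposed letter;
# NFD normalization splits the precomposed letter back into the pair.
composed = unicodedata.normalize("NFC", decomposed)
split_again = unicodedata.normalize("NFD", precomposed)

print(composed == precomposed)      # True
print(split_again == decomposed)    # True
print(unicodedata.name("^"))        # CIRCUMFLEX ACCENT
print(unicodedata.name("\u0302"))   # COMBINING CIRCUMFLEX ACCENT
```

The freestanding U+005E takes no part in either normalization: it is an ordinary spacing character, not a combining mark.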
The symbol ^ has many uses in programming languages, where it is typically called a caret. It can signify exponentiation, the bitwise XOR operator, string concatenation [ citation needed ], and control characters in caret notation, among other uses. In regular expressions, the caret is used to match the beginning of a string or line; if it begins a character class, then the inverse of the class is to be matched.
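The two regular-expression roles described above can be illustrated with Python's standard `re` module. This is a minimal sketch: the sample strings are made up for the example, and other regex engines may differ in how they enable per-line anchoring.

```python
import re

text = "alpha\nbeta\ngamma"

# By default, ^ anchors only at the very start of the string.
start_only = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches immediately after each newline.
each_line = re.findall(r"^\w+", text, re.MULTILINE)

# As the first character inside [...], ^ inverts the character class:
# here it matches every character that is not a lowercase vowel.
non_vowels = re.findall(r"[^aeiou]", "caret")

print(start_only)   # ['alpha']
print(each_line)    # ['alpha', 'beta', 'gamma']
print(non_vowels)   # ['c', 'r', 't']
```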
ANSI C can transcribe the caret in the form of the trigraph `??'`, as the character was originally not available in all character sets and keyboards. C++ additionally supports tokens like `xor` (for `^`) and `xor_eq` (for `^=`) to avoid the character altogether. RFC 1345 recommends that the character be transcribed as the digraph `'>` when required. [7]
Pascal uses the caret for declaring and dereferencing pointers. In Smalltalk, the caret is the method return statement. In C++/CLI, .NET reference types are accessed through a handle using the `ClassName^` syntax. In Apple's C extensions for Mac OS X and iOS, carets are used to create blocks and to denote block types. Go uses the unary caret as a bitwise NOT operator (the binary caret is XOR).
Node.js uses the caret in package.json files to signify dependency resolution behavior being used for each particular dependency. In the case of Node.js, a caret allows any kind of update, unless it is seen as a "major" update as defined by semver. [8]
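The caret rule can be sketched in Python as follows. This is a simplified illustration, not the npm implementation: the function names `parse` and `caret_satisfies` are hypothetical, only plain `MAJOR.MINOR.PATCH` versions are handled, and pre-release tags and partial versions are ignored. It also assumes the usual node-semver refinement that, for versions below 1.0.0, the leftmost non-zero component is treated as the one that must not change.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def caret_satisfies(version: str, base: str) -> bool:
    """Return True if `version` falls inside the caret range ^base."""
    v, b = parse(version), parse(base)
    if v < b:
        return False  # older than the declared minimum
    # The exclusive upper bound bumps the leftmost non-zero component.
    if b[0] > 0:
        upper = (b[0] + 1, 0, 0)
    elif b[1] > 0:
        upper = (0, b[1] + 1, 0)
    else:
        upper = (0, 0, b[2] + 1)
    return v < upper


print(caret_satisfies("1.9.0", "1.2.3"))  # True: minor update allowed
print(caret_satisfies("2.0.0", "1.2.3"))  # False: major update excluded
print(caret_satisfies("0.3.0", "0.2.3"))  # False: 0.x minor bump excluded
```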
In mathematics, the caret can signify exponentiation (e.g. `3^5` for 3⁵) where the usual superscript is not readily usable (as on some graphing calculators). It is also used to indicate a superscript in TeX typesetting.
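Whether `3^5` means exponentiation depends on the language in use. Python, chosen here only as an illustration, is one of the languages in which the caret is bitwise XOR and exponentiation is written `**`; the variable names are illustrative.

```python
# Exponentiation uses ** in Python.
as_power = 3 ** 5

# The caret is bitwise XOR: 0b011 XOR 0b101 == 0b110.
as_xor = 3 ^ 5

print(as_power)  # 243
print(as_xor)    # 6
```

Copying a calculator-style expression such as `3^5` into such a language therefore runs without error but silently produces a different value.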
The use of the caret for exponentiation can be traced back to ALGOL 60,[ citation needed ] which expressed the exponentiation operator as an upward-pointing arrow, intended to evoke the superscript notation common in mathematics. The upward-pointing arrow is now used to signify hyperoperations in Knuth's up-arrow notation.
The caret is often seen in caret notation to show control characters: for instance, `^A` means the control character with value 1.
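The mapping can be sketched in Python. The helper name `caret_notation` is hypothetical; the sketch assumes the conventional rule that the letter shown is the control code with bit 0x40 flipped, which also yields `^@` for code 0 and `^?` for DEL (127).

```python
def caret_notation(code: int) -> str:
    """Return the caret notation for an ASCII control code (0-31 or 127)."""
    if not (0 <= code <= 31 or code == 127):
        raise ValueError(f"{code} is not an ASCII control code")
    # Flipping bit 0x40 maps 1 -> 'A' (0x41), 26 -> 'Z' (0x5A), 127 -> '?'.
    return "^" + chr(code ^ 0x40)


print(caret_notation(1))    # ^A
print(caret_notation(26))   # ^Z
print(caret_notation(27))   # ^[
print(caret_notation(127))  # ^?
```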
The Windows command-line interpreter (cmd.exe) uses the caret to escape reserved characters (most other shells use the backslash). For example, to pass a 'less-than' sign as an argument to a program, one would type `^<`.
In internet forums, on social networking sites such as Facebook, and in online chats, one or more carets may be placed beneath the text of another post as an upward-pointing arrow to that post; [9] a caret can also mean that the user who posted it agrees with the post above. Multiple carets may indicate which earlier post a comment replies or relates to (the number of carets corresponding to how far above that post sits), may "underscore" the relevant portion of the previous post, or may simply add emphasis.
A similar use has been adopted by programming language compilers, such as the Java compiler, to point out where a compilation error has occurred.[ citation needed ] The compiler prints out the faulty line of code and uses a single caret on the next line, padded by spaces, to give a visual indication of the error location.
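The layout described above is straightforward to reproduce. The following Python sketch is illustrative only: `point_at` is a hypothetical helper, the column is assumed to be zero-based, and real compilers typically add file names, line numbers, and tab handling.

```python
def point_at(line: str, column: int, message: str) -> str:
    """Render a source line with a caret under the zero-based column."""
    if not 0 <= column <= len(line):
        raise ValueError("column is outside the line")
    pointer = " " * column + "^"  # pad with spaces up to the error position
    return f"{message}\n{line}\n{pointer}"


print(point_at("int x = ;", 8, "error: illegal start of expression"))
# error: illegal start of expression
# int x = ;
#         ^
```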
A diacritic is a glyph added to a letter or to a basic glyph. The term derives from the Ancient Greek διακριτικός, from διακρίνω. The word diacritic is a noun, though it is sometimes used in an attributive sense, whereas diacritical is only an adjective. Some diacritics, such as the acute ⟨ó⟩, grave ⟨ò⟩, and circumflex ⟨ô⟩, are often called accents. Diacritics may appear above or below a letter or in some other position such as within the letter or between two letters.
QWERTY is a keyboard layout for Latin-script alphabets. The name comes from the order of the first six keys on the top letter row of the keyboard: QWERTY. The QWERTY design is based on a layout included in the Sholes and Glidden typewriter sold via E. Remington and Sons from 1874. QWERTY became popular with the success of the Remington No. 2 of 1878 and remains in ubiquitous use.
The circumflex is a diacritic in the Latin and Greek scripts that is also used in the written forms of many languages and in various romanization and transcription schemes. It received its English name from Latin: circumflexus "bent around"—a translation of the Greek: περισπωμένη.
The tilde is a grapheme ⟨˜⟩ or ⟨~⟩ with a number of uses. The name of the character came into English from Spanish tilde, which in turn came from the Latin titulus, meaning 'title' or 'superscription'. Its primary use is as a diacritic (accent) in combination with a base letter. Its freestanding form is used in modern texts mainly to indicate approximation.
A caron is a diacritic mark commonly placed over certain letters in the orthography of some languages to indicate a change of the related letter's pronunciation.
AZERTY is a specific layout for the characters of the Latin alphabet on typewriter keys and computer keyboards. The layout takes its name from the first six letters to appear on the first row of alphabetical keys; that is, A Z E R T Y. Similar to the QWERTZ layout, it is modelled on the English QWERTY layout. It is used in France and Belgium, although each of these countries has its own national variation on the layout. Luxembourg and Switzerland use the Swiss QWERTZ keyboard. Most residents of Quebec, the mainly French-speaking province of Canada, use a QWERTY keyboard that has been adapted to the French language, such as the Multilingual Standard keyboard CAN/CSA Z243.200-92, which is stipulated by the government of Quebec and the Government of Canada.
A dead key is a special kind of modifier key on a mechanical typewriter, or computer keyboard, that is typically used to attach a specific diacritic to a base letter. The dead key does not generate a (complete) character by itself, but modifies the character generated by the key struck immediately after. Thus, a dedicated key is not needed for each possible combination of a diacritic and a letter, but rather only one dead key for each diacritic is needed, in addition to the normal base letter keys.
In digital typography, combining characters are characters that are intended to modify other characters. The most common combining characters in the Latin script are the combining diacritical marks.
The backtick ` is a typographical mark used mainly in computing. It is also known as backquote, grave, or grave accent.
Backspace is the keyboard key that in typewriters originally pushed the carriage one position backwards, and in modern computer systems typically moves the display cursor one position backwards, deletes the character at that position, and shifts back any text after that position by one character.
The degree symbol or degree sign, °, is a glyph or symbol that is used, among other things, to represent degrees of arc, hours, degrees of temperature or alcohol proof. The symbol consists of a small superscript circle.
Caret notation is a notation for control characters in ASCII. The notation assigns `^A` to control-code 1, sequentially through the alphabet to `^Z` assigned to control-code 26 (0x1A). For the control-codes outside of the range 1–26, the notation extends to the adjacent, non-alphabetic ASCII characters.
Diacritical marks of two dots ¨, placed side-by-side over or under a letter, are used in several languages for several different purposes. The most familiar to English-language speakers are the diaeresis and the umlaut, though there are numerous others. For example, in Albanian, ë represents a schwa. Such diacritics are also sometimes used for stylistic reasons.
Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with combining diacritics, as well as some ligatures and distinct letters, used for example in the orthographies of various African languages and the Vietnamese alphabet. Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista). Latin Extended-F and -G contain characters for phonetic transcription.
Unicode equivalence is the specification by the Unicode character encoding standard that some sequences of code points represent essentially the same character. This feature was introduced in the standard to allow compatibility with preexisting standard character sets, which often included similar or identical characters.
Extended ASCII is a repertoire of character encodings that include the original set of 96 ASCII characters, plus up to 128 additional characters. There is no formal definition of "extended ASCII", and even use of the term is sometimes criticized, because it can be mistakenly interpreted to mean that the American National Standards Institute (ANSI) had updated its ANSI X3.4-1986 standard to include more characters, or that the term identifies a single unambiguous encoding, neither of which is the case.
The Vietnamese language is written with a Latin script with diacritics, which requires several accommodations when typing on phones or computers. Software-based systems are a form of writing Vietnamese on phones or computers with software that can be installed on the device or from third-party software such as UniKey. Telex is the oldest input method devised to encode the Vietnamese language with its tones. Other input methods include VNI and VIQR. The VNI input method is not to be confused with the VNI code page.
There are two conventional sets of ASCII substitutions for the letters in the Esperanto alphabet that have diacritics, as well as a number of graphic work-arounds.
There are a number of methods to input Esperanto letters and text on a computer, e.g. when using a word processor or email. Input methods depend on a computer's operating system. Specifically the characters ĵ, ĝ, ĉ, ĥ, ŭ, ŝ can be problematic.