Braille ASCII (or more formally The North American Braille ASCII Code, also known as SimBraille) is a subset of the ASCII character set which uses 64 of the printable ASCII characters to represent all possible dot combinations in six-dot braille. It was developed around 1969 and, despite originally being known as North American Braille ASCII, it is now used internationally.
Braille ASCII uses the 64 ASCII characters between 32 and 95 inclusive. All capital letters in ASCII correspond to their equivalent values in uncontracted English Braille. Note, however, that unlike standard print, there is only one braille symbol for each letter of the alphabet. Therefore, in Braille, all letters are lower-case by default, unless preceded by a capitalization sign (⠠, dot 6).
The numbers 1 through 9 and 0 correspond to the letters a through j, except that they are lowered or shifted lower in the Braille cell. For example, ⠉ (dots 1-4) represents c, and ⠒ (dots 2-5) is 3. The other symbols may or may not correspond to their Braille values. For example, ⠌ (dots 3-4) represents / in Braille ASCII, and this is the Braille slash, but ⠿ (dots 1-2-3-4-5-6) represents =, and this is not the equals sign in Braille.
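The "lowered" relationship between letters and digits can be checked mechanically. The sketch below is illustrative only: it assumes the 64-character key string given later in this article, indexed so that bit 0 of the index is dot 1 and bit 5 is dot 6 (the same order as Unicode braille patterns). The helper names `dots` and `lower_cell` are hypothetical.

```python
# The 64 Braille ASCII characters, indexed by dot pattern:
# bit 0 of the index is dot 1, bit 1 is dot 2, ..., bit 5 is dot 6.
KEY = " A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="


def dots(ch):
    """Return the raised dot numbers (1-6) for one Braille ASCII character."""
    pattern = KEY.index(ch)
    return [d for d in range(1, 7) if pattern & (1 << (d - 1))]


def lower_cell(ch):
    """Shift a cell that uses only dots 1, 2, 4, 5 down by one row.

    Dot 1 moves to dot 2, dot 2 to dot 3, dot 4 to dot 5 and dot 5 to dot 6,
    which is a one-bit left shift of the pattern.
    """
    pattern = KEY.index(ch)
    if pattern & 0b100100:  # dot 3 or dot 6 is raised, so there is no room to shift
        raise ValueError(f"{ch!r} already uses the bottom row")
    return KEY[pattern << 1]


if __name__ == "__main__":
    print(dots("C"), dots("3"))
    print("".join(lower_cell(c) for c in "ABCDEFGHIJ"))
```

Under these assumptions, lowering the cells for A through J yields the Braille ASCII characters 1 through 9 and then 0.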
Braille ASCII corresponds more closely to the Nemeth Braille Code for mathematics than to the English Literary Braille Code, because the Nemeth Braille Code is what it was originally based upon.
If Braille ASCII is viewed in a word processor, it will look like a jumbled mix of letters, numbers, and punctuation. However, there are several fonts available, many of them free, which allow the user to view and print Braille ASCII as simulated braille, i.e. a graphical representation of braille characters.
Braille ASCII was originally designed to be a means for storing and transmitting six-dot Braille in a digital format, and this continues to be its primary usage today. Because it uses standard characters available on computer keyboards, it can be easily typed and edited with a standard word processor. Many Braille embossers receive their input in Braille ASCII, and nearly all Braille translation software can import and export this format.
Most institutions which produce Braille materials distribute BRF files. A BRF file can represent uncontracted or contracted (i.e. grade 1 or grade 2) Unified English Braille, English Braille, and non-English languages. [1] BRF files contain plain Braille ASCII plus spaces and the ASCII control characters Carriage Return, Line Feed, and Form Feed, which together are sufficient to specify how the Braille is formatted. Previously BRF contained some additional specialized formatting instructions, but now BRF is formatted exactly like Web-Braille/BARD. [2] [3] BRF files can be embossed with a braille embosser or printed, read on a refreshable braille display, or imperfectly back-translated [4] into standard text, [5] [6] which can then be read by a screen reader or other similar program. Many find BRF files to be a more convenient way to receive brailled content, and the format is increasingly used for distribution. [7] If a SimBraille font [8] is downloaded and installed, a BRF file can be opened in WordPad, Apache OpenOffice, Microsoft Word, Apple Pages, etc., and when the font is set to SimBraille the Braille will appear correctly rendered as two-dimensional, non-tactile, visual six-dot braille characters.
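Because the layout of a BRF file is carried entirely by spaces, line breaks and form feeds, splitting one into pages and lines needs very little code. The following is a minimal sketch rather than a reference parser: the function name `split_brf` is hypothetical, and it assumes that a form feed ends a page and that lines end in CR LF, a bare LF, or a bare CR.

```python
def split_brf(text):
    """Split BRF text into pages, each page being a list of braille lines."""
    pages = []
    for page in text.split("\f"):  # a form feed ends a page
        # Normalise CR LF and bare CR to LF, then split into lines.
        lines = page.replace("\r\n", "\n").replace("\r", "\n").split("\n")
        if lines and lines[-1] == "":
            lines.pop()  # drop the empty piece left after a final line break
        pages.append(lines)
    if pages and pages[-1] == []:
        pages.pop()  # drop the empty page left after a trailing form feed
    return pages


if __name__ == "__main__":
    sample = ",HELLO\r\n,WORLD\r\n\f,PAGE #B\r\n\f"
    for number, lines in enumerate(split_brf(sample), start=1):
        print(number, lines)
```

When reading from disk, opening the file with `open(path, encoding="ascii", newline="")` keeps the original line endings intact so that the function sees them unchanged.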
Unicode includes a means for encoding eight-dot braille; however, Braille ASCII continues to be the preferred format for encoding six-dot braille.
The following table shows the arrangement of characters, with the hexadecimal value, corresponding ASCII character, binary notation matching the standard dot order, Braille Unicode glyph, and general meaning (the actual meaning may change depending on context). [9] [10]
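[Table: Braille ASCII characters, with columns for hexadecimal value, ASCII glyph, Braille dots, Braille glyph, and meaning.]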
The following ASCII string literal (where the content enclosed by quotes contains the escape sequences \" for a literal " and \\ for a literal \) is the "ASCII glyph" column of the above table sorted according to reverse lexicographical order of its "Braille dots" column. It may be used to encode the above table. (Note that the Unicode Braille characters for six-dot cells are U+2800 through U+283F, with their code points in reverse lexicographical order of the above table's "Braille dots" column.)
" A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="
Under the mapping derived from the above table, the "Braille glyph" column, taken in the same order as the above key, gives the following string of Unicode characters (note that the first character is not an ASCII space but U+2800):
"⠀⠁⠂⠃⠄⠅⠆⠇⠈⠉⠊⠋⠌⠍⠎⠏⠐⠑⠒⠓⠔⠕⠖⠗⠘⠙⠚⠛⠜⠝⠞⠟⠠⠡⠢⠣⠤⠥⠦⠧⠨⠩⠪⠫⠬⠭⠮⠯⠰⠱⠲⠳⠴⠵⠶⠷⠸⠹⠺⠻⠼⠽⠾⠿"
Only 64 characters are needed to represent all possible combinations of 6-dot braille (including space), so not all ASCII values are needed for Braille ASCII.
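The key string and its Unicode counterpart above can be paired position by position to convert between the two encodings. The sketch below shows one way to do this in Python, assuming the key is indexed from U+2800 upward. The names `to_unicode_braille`, `to_braille_ascii` and `table_rows` are hypothetical, and characters outside the key (such as line breaks) are simply passed through.

```python
# ASCII glyphs ordered so that position i corresponds to code point U+2800 + i.
KEY = " A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="

# Translation tables for str.translate (code point -> code point).
TO_UNICODE = {ord(ch): 0x2800 + i for i, ch in enumerate(KEY)}
TO_ASCII = {0x2800 + i: ord(ch) for i, ch in enumerate(KEY)}


def to_unicode_braille(text):
    """Replace each Braille ASCII character with its Unicode braille pattern."""
    return text.translate(TO_UNICODE)


def to_braille_ascii(text):
    """Replace each six-dot Unicode braille pattern with its Braille ASCII character."""
    return text.translate(TO_ASCII)


def table_rows():
    """Yield (hex value, ASCII glyph, dots 1-6 as binary, braille glyph) per character."""
    for ch in sorted(KEY):
        pattern = KEY.index(ch)
        # Dot 1 is written first, matching the standard dot order.
        bits = "".join("1" if pattern & (1 << d) else "0" for d in range(6))
        yield f"{ord(ch):02X}", ch, bits, chr(0x2800 + pattern)


if __name__ == "__main__":
    print(to_unicode_braille(",HELLO"))
    for row in table_rows():
        print(*row)
```

Using `str.translate` with two dictionaries keeps both directions symmetric; a round trip through `to_unicode_braille` and `to_braille_ascii` returns the original Braille ASCII text.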
The lower-case letters (a to z) are not normally used, but might be interpreted as having the same dot patterns as their upper-case equivalents. `, {, |, and } are not used and their Braille ASCII rendition is not defined.
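A program that accepts Braille ASCII has to decide how to treat characters outside the range 32 to 95. The sketch below shows one possible policy and is not a standard behaviour: it optionally folds lower-case letters to upper case, passes line and page breaks through unchanged, and rejects everything else. The function name `normalize` and its `fold_lowercase` parameter are hypothetical.

```python
def normalize(text, fold_lowercase=True):
    """Return text restricted to Braille ASCII (codes 32-95) plus CR, LF and FF.

    Lower-case letters are folded to upper case when fold_lowercase is true;
    any other character raises ValueError because its rendition is undefined.
    """
    out = []
    for ch in text:
        if ch in "\r\n\f" or 32 <= ord(ch) <= 95:
            out.append(ch)  # already valid, keep as-is
        elif fold_lowercase and "a" <= ch <= "z":
            out.append(ch.upper())  # same dot pattern as the capital letter
        else:
            raise ValueError(f"no Braille ASCII rendition for {ch!r}")
    return "".join(out)


if __name__ == "__main__":
    print(normalize(",hello"))
```

Raising an error for undefined characters, rather than silently dropping them, makes it visible when a file is not Braille ASCII at all.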
Braille ASCII is merely a subset of the ASCII table that can be used to represent all possible combinations of 6-dot braille. It is not to be confused with the Computer Braille Code, which can represent all ASCII values in braille.