Braille ASCII

Braille ASCII (or more formally The North American Braille ASCII Code, also known as SimBraille) is a subset of the ASCII character set which uses 64 of the printable ASCII characters to represent all possible dot combinations in six-dot braille. It was developed around 1969 and, despite originally being known as North American Braille ASCII, it is now used internationally.

Overview

Braille ASCII uses the 64 ASCII characters between 32 and 95 inclusive. All capital letters in ASCII correspond to their equivalent values in uncontracted English Braille. Note however that, unlike standard print, there is only one braille symbol for each letter of the alphabet. Therefore, in Braille, all letters are lower-case by default, unless preceded by a capitalization sign (dot 6).

The digits 1 through 9 and 0 use the same dot patterns as the letters a through j, except that each pattern is shifted one row lower in the Braille cell. For example, dots 1-4 represent c, and dots 2-5 represent 3. The other symbols may or may not correspond to their Braille values. For example, dots 3-4 represent / in Braille ASCII, and this is the Braille slash, but dots 1-2-3-4-5-6 represent =, and this is not the equals sign in Braille.
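The lowered-digit relationship can be checked mechanically. The following Python sketch is illustrative only: the `DOTS` dictionary copies the relevant dot patterns from the table of Braille ASCII values in this article, and `lower_cell` is a hypothetical helper name, not part of any standard.

```python
# Dot patterns as six characters, one per dot (dot 1 first, dot 6 last),
# copied from the Braille ASCII table for the letters A-J and the digits.
DOTS = {
    "A": "100000", "B": "110000", "C": "100100", "D": "100110", "E": "100010",
    "F": "110100", "G": "110110", "H": "110010", "I": "010100", "J": "010110",
    "1": "010000", "2": "011000", "3": "010010", "4": "010011", "5": "010001",
    "6": "011010", "7": "011011", "8": "011001", "9": "001010", "0": "001011",
}

def lower_cell(pattern: str) -> str:
    """Shift every dot down one row: dot 1->2, 2->3, 4->5, 5->6."""
    d1, d2, d3, d4, d5, d6 = pattern
    if d3 == "1" or d6 == "1":
        raise ValueError("bottom-row dots cannot be lowered")
    return "0" + d1 + d2 + "0" + d4 + d5

# Every letter a-j, once lowered, gives the pattern of the matching digit.
for letter, digit in zip("ABCDEFGHIJ", "1234567890"):
    assert lower_cell(DOTS[letter]) == DOTS[digit]
```

The loop passes silently, confirming that lowering c (dots 1-4) gives 3 (dots 2-5), and likewise for the other nine pairs.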

Braille ASCII corresponds more closely to the Nemeth Braille Code for mathematics than to the English Literary Braille Code, because it was originally based on the Nemeth Braille Code.

If Braille ASCII is viewed in a word processor, it will look like a jumbled mix of letters, numbers, and punctuation. However, there are several fonts available, many of them free, which allow the user to view and print Braille ASCII as simulated braille, i.e. a graphical representation of braille characters.

Uses

Braille ASCII was originally designed to be a means for storing and transmitting six-dot Braille in a digital format, and this continues to be its primary usage today. Because it uses standard characters available on computer keyboards, it can be easily typed and edited with a standard word processor. Many Braille embossers receive their input in Braille ASCII, and nearly all Braille translation software can import and export this format.

Most institutions which produce Braille materials distribute BRF files. BRF is a file format that can represent contracted or uncontracted (i.e. grade 2 or grade 1) Unified English Braille, English Braille and braille for non-English languages. [1] BRF files contain plain Braille ASCII plus spaces and the ASCII control characters Carriage Return, Line Feed, and Form Feed, which together are sufficient to specify how the Braille is formatted. Previously BRF contained some additional specialized formatting instructions, but now BRF is formatted exactly like Web-Braille/BARD. [2] [3]

BRF files can be embossed with a braille embosser or printed, read on a refreshable braille display, or imperfectly back-translated [4] into standard text [5] [6] which can then be read by a screen reader or other similar program. Many find BRF files to be a more convenient way to receive brailled content, and the format is increasingly used for distribution. [7] If a SimBraille font [8] is downloaded and installed, a BRF file can be opened in WordPad, Apache OpenOffice, Microsoft Word, Apple Pages, etc.; when the font is set to SimBraille, the Braille appears correctly rendered as two-dimensional, non-tactile, visual six-dot braille characters.
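Because the layout of a BRF file is carried only by line terminators and Form Feeds, splitting one into pages and lines takes very little code. The following Python sketch assumes the file has already been read as text; `split_brf` is a hypothetical name, and the handling of trailing terminators is one reasonable choice rather than a rule taken from any BRF specification.

```python
def split_brf(data: str) -> list[list[str]]:
    """Split BRF text into pages (at Form Feed) and lines (at CR, LF or CR LF)."""
    pages = []
    for chunk in data.split("\f"):
        # Normalize the three common line terminators to a single "\n".
        lines = chunk.replace("\r\n", "\n").replace("\r", "\n").split("\n")
        # A terminator right before the Form Feed (or the end of the file)
        # leaves one empty trailing entry; drop it.
        if lines and lines[-1] == "":
            lines.pop()
        pages.append(lines)
    # A file that ends with a Form Feed yields one empty final chunk.
    if pages and pages[-1] == []:
        pages.pop()
    return pages

sample = ",HELLO\r\nWORLD\r\n\f,PAGE #B\r\n"
for number, page in enumerate(split_brf(sample), start=1):
    print(number, page)
```

Blank lines inside a page are kept, since they are part of the Braille formatting.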

Unicode includes a means for encoding eight-dot braille; however, Braille ASCII continues to be the preferred format for encoding six-dot braille.

Braille ASCII values

The following table lists each character's hexadecimal ASCII value, the corresponding ASCII character, the dot pattern in binary notation (dots 1 through 6 from left to right, with 1 meaning a raised dot), the Braille glyph, and the general meaning (the actual meaning may change depending on context). [9] [10]

| ASCII hex | ASCII glyph | Braille dots | Unicode Braille glyph | Braille meaning |
|---|---|---|---|---|
| 20 | (space) | 000000 | ⠀ (U+2800) | (space) |
| 21 | ! | 011101 | ⠮ | the |
| 22 | " | 000010 | ⠐ | (contraction) |
| 23 | # | 001111 | ⠼ | (number prefix) |
| 24 | $ | 110101 | ⠫ | ed |
| 25 | % | 100101 | ⠩ | sh |
| 26 | & | 111101 | ⠯ | and |
| 27 | ' | 001000 | ⠄ | ' |
| 28 | ( | 111011 | ⠷ | of |
| 29 | ) | 011111 | ⠾ | with |
| 2A | * | 100001 | ⠡ | ch |
| 2B | + | 001101 | ⠬ | ing |
| 2C | , | 000001 | ⠠ | (uppercase prefix) |
| 2D | - | 001001 | ⠤ | - |
| 2E | . | 000101 | ⠨ | (italic prefix) |
| 2F | / | 001100 | ⠌ | st or / |
| 30 | 0 | 001011 | ⠴ | ” |
| 31 | 1 | 010000 | ⠂ | , |
| 32 | 2 | 011000 | ⠆ | ; |
| 33 | 3 | 010010 | ⠒ | : |
| 34 | 4 | 010011 | ⠲ | . |
| 35 | 5 | 010001 | ⠢ | en |
| 36 | 6 | 011010 | ⠖ | ! |
| 37 | 7 | 011011 | ⠶ | ( or ) |
| 38 | 8 | 011001 | ⠦ | “ or ? |
| 39 | 9 | 001010 | ⠔ | in |
| 3A | : | 100011 | ⠱ | wh |
| 3B | ; | 000011 | ⠰ | (letter prefix) |
| 3C | < | 110001 | ⠣ | gh |
| 3D | = | 111111 | ⠿ | for |
| 3E | > | 001110 | ⠜ | ar |
| 3F | ? | 100111 | ⠹ | th |
| 40 | @ | 000100 | ⠈ | (accent prefix) |
| 41 | A | 100000 | ⠁ | a |
| 42 | B | 110000 | ⠃ | b |
| 43 | C | 100100 | ⠉ | c |
| 44 | D | 100110 | ⠙ | d |
| 45 | E | 100010 | ⠑ | e |
| 46 | F | 110100 | ⠋ | f |
| 47 | G | 110110 | ⠛ | g |
| 48 | H | 110010 | ⠓ | h |
| 49 | I | 010100 | ⠊ | i |
| 4A | J | 010110 | ⠚ | j |
| 4B | K | 101000 | ⠅ | k |
| 4C | L | 111000 | ⠇ | l |
| 4D | M | 101100 | ⠍ | m |
| 4E | N | 101110 | ⠝ | n |
| 4F | O | 101010 | ⠕ | o |
| 50 | P | 111100 | ⠏ | p |
| 51 | Q | 111110 | ⠟ | q |
| 52 | R | 111010 | ⠗ | r |
| 53 | S | 011100 | ⠎ | s |
| 54 | T | 011110 | ⠞ | t |
| 55 | U | 101001 | ⠥ | u |
| 56 | V | 111001 | ⠧ | v |
| 57 | W | 010111 | ⠺ | w |
| 58 | X | 101101 | ⠭ | x |
| 59 | Y | 101111 | ⠽ | y |
| 5A | Z | 101011 | ⠵ | z |
| 5B | [ | 010101 | ⠪ | ow |
| 5C | \ | 110011 | ⠳ | ou |
| 5D | ] | 110111 | ⠻ | er |
| 5E | ^ | 000110 | ⠘ | (currency prefix) |
| 5F | _ | 000111 | ⠸ | (contraction) |
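Each "Braille dots" value in the table determines the matching Unicode Braille pattern: within the six-dot range, Unicode assigns dot 1 the lowest bit and dot 6 the highest, counted as an offset from U+2800. A minimal Python sketch of that arithmetic follows; the function name `dots_to_unicode` is hypothetical.

```python
def dots_to_unicode(pattern: str) -> str:
    """Convert a six-character dot string (dot 1 first) to a Unicode Braille character."""
    if len(pattern) != 6 or set(pattern) - {"0", "1"}:
        raise ValueError("expected six characters, each '0' or '1'")
    # Dot k (1-based) contributes 2**(k-1) to the offset from U+2800.
    offset = sum(1 << i for i, bit in enumerate(pattern) if bit == "1")
    return chr(0x2800 + offset)

print(dots_to_unicode("100000"))  # dot 1 only: the pattern for the letter a
print(dots_to_unicode("111111"))  # all six dots: the pattern written as = in Braille ASCII
```

Because the leftmost character of the dot string is the least significant bit, sorting the table by Unicode codepoint is the same as sorting the dot strings read from right to left.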

The following ASCII string literal (in which the escape sequences \" and \\ stand for a literal " and a literal \) lists the "ASCII glyph" column of the table above, sorted by the "Braille dots" column in reverse lexicographical order, that is, comparing dot 6 first and dot 1 last. It may be used to encode the table. (The six-dot Unicode Braille characters occupy U+2800 through U+283F, and their codepoints follow this same ordering of the dot patterns.)

" A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="

Under this mapping, the Braille glyphs for the characters of the key above, taken in the same order, form the following Unicode string literal (note that the first character is not an ASCII space but U+2800, the blank Braille pattern):

"⠀⠁⠂⠃⠄⠅⠆⠇⠈⠉⠊⠋⠌⠍⠎⠏⠐⠑⠒⠓⠔⠕⠖⠗⠘⠙⠚⠛⠜⠝⠞⠟⠠⠡⠢⠣⠤⠥⠦⠧⠨⠩⠪⠫⠬⠭⠮⠯⠰⠱⠲⠳⠴⠵⠶⠷⠸⠹⠺⠻⠼⠽⠾⠿"
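Since the position of each character in the key equals its offset from U+2800, the two string literals are enough to convert in both directions. The Python sketch below builds translation tables from the key; the names `KEY`, `ascii_to_unicode` and `unicode_to_ascii` are hypothetical, and characters outside the 64-character set (such as line terminators) are simply passed through unchanged, which is an assumption of this sketch.

```python
# The 64-character key from the text; index i corresponds to U+2800 + i.
KEY = " A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="
assert len(KEY) == 64

# str.translate accepts a dict mapping code points to code points.
TO_UNICODE = {ord(ch): 0x2800 + i for i, ch in enumerate(KEY)}
TO_ASCII = {0x2800 + i: ord(ch) for i, ch in enumerate(KEY)}

def ascii_to_unicode(text: str) -> str:
    """Replace each Braille ASCII character with its Unicode Braille pattern."""
    return text.translate(TO_UNICODE)

def unicode_to_ascii(text: str) -> str:
    """Replace each six-dot Unicode Braille pattern with its Braille ASCII character."""
    return text.translate(TO_ASCII)

sample = ",HELLO"                 # capitalization sign followed by "hello"
cells = ascii_to_unicode(sample)
print(cells)
assert unicode_to_ascii(cells) == sample  # the conversion round-trips
```

Note that an ASCII space becomes U+2800 rather than staying a space, matching the first entry of the key.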

Unused ASCII values

Only 64 characters are needed to represent all possible combinations of 6-dot braille (including space), so not all ASCII values are needed for Braille ASCII.

The lower-case letters (a to z) are not normally used, but might be interpreted as having the same dot patterns as their upper-case equivalents. `, {, |, and } are not used and their Braille ASCII rendition is not defined.
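A program that accepts Braille ASCII therefore has to decide what to do with characters outside the 64-character range. The Python sketch below takes one possible policy, which is an assumption rather than a requirement of the code: lower-case letters are folded to upper case, the line and page control characters used in BRF files are allowed through, and anything else is rejected. The name `normalize` is hypothetical.

```python
def normalize(text: str) -> str:
    """Fold a-z to A-Z and reject characters with no defined Braille ASCII meaning."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            ch = ch.upper()  # treat lower case as the same dot pattern
        # Accept the 64 Braille ASCII characters (0x20-0x5F) and CR, LF, FF.
        if ch in "\r\n\f" or 0x20 <= ord(ch) <= 0x5F:
            out.append(ch)
        else:
            raise ValueError(f"not a Braille ASCII character: {ch!r}")
    return "".join(out)

print(normalize(",hello"))
```

A stricter reader could instead reject lower-case input outright, since the text only says such letters might be interpreted as their upper-case equivalents.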

Braille ASCII is merely a subset of the ASCII table that can be used to represent all possible combinations of 6-dot braille. It is not to be confused with the Computer Braille Code, which can represent all ASCII values in braille.

References

  1. "World Braille Usage".
  2. "New BARD Overview". nlsbard.loc.gov.
  3. "NBP - What's a BRF". www.nbp.org.
  4. "Liblouis* - An open-source braille translator and back-translator". liblouis.org.
  5. "About Electronic Files - APH Louis - APH Louis". louis.aph.org.
  6. "What are DAISY and BRF? - Bookshare". www.bookshare.org.
  7. "IRS Tax Forms (in Braille and Text Formats) - Internal Revenue Service". www.irs.gov.
  8. "BRL: Braille Through Remote Learning". www.brl.org.
  9. "Representing and Displaying Braille". DotlessBraille.org. February 20, 2002. Retrieved August 9, 2009.
  10. Halleck, John (August 24, 2000). "braille-ascii.ads". Braille.Ascii. Archived from the original on June 13, 2010. Retrieved August 10, 2009.