ATASCII

The entire visible ATASCII character set, both normal and inverse glyphs, upscaled to 2x to better show details

The ATASCII character set, from ATARI Standard Code for Information Interchange, alternatively ATARI ASCII, is a character encoding used in the Atari 8-bit home computers. ATASCII is based on ASCII, but is not fully compatible with it.


The first computers in the Atari 8-bit series were the Atari 400 and 800, released in 1979; later models were released throughout the 1980s. The last computer to use the ATASCII character set was the Atari XEGS, which was released in 1987 and discontinued in 1992. The Atari ST family of computers uses a different encoding, the Atari ST character set.

Like most other variants of ASCII, ATASCII has its own distinct characters (arrows, blocks, box-drawing characters, playing card suits, etc.) in place of the C0 control codes in ASCII (characters 0–31), as well as replacing a few other ASCII code points.

Implementation

Atari 8-bit systems have three distinct sets of codes: interchange codes (ATASCII), internal codes (also called screen codes), and keyboard codes. [1] [2]

Keyboard codes are the values sent by the keyboard. Holding one of the two modifier keys (Shift and Control) changes the value produced by pressing another key. Because there are two modifier keys, each key can send up to four distinct keyboard codes; however, several keys (the exact keys depend on the model) do not send a control code if they are pressed while holding both Shift and Control. [1] When text is entered, the Atari keyboard handler converts these codes into ATASCII. [3]

ATASCII and internal codes contain the same character set, but indexed differently. ATASCII codes are used by BASIC, while internal codes are used to look up how to render the character on-screen. [1]
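The reindexing between the two orderings can be sketched in a few lines of Python. The block offsets used here follow the mapping commonly given in Atari references and are not spelled out in this article, so treat them as an assumption; the function names are hypothetical.

```python
def atascii_to_internal(code: int) -> int:
    """Convert one ATASCII byte to its internal (screen) code."""
    if not 0 <= code <= 0xFF:
        raise ValueError("code must fit in one byte")
    inverse = code & 0x80      # the inverse-video bit is carried over unchanged
    low = code & 0x7F
    if low < 0x20:             # assumed: ATASCII 0x00-0x1F -> internal 0x40-0x5F
        low += 0x40
    elif low < 0x60:           # ATASCII 0x20-0x5F -> internal 0x00-0x3F
        low -= 0x20
    # assumed: 0x60-0x7F has the same index in both orderings
    return inverse | low


def internal_to_atascii(code: int) -> int:
    """Inverse of atascii_to_internal."""
    if not 0 <= code <= 0xFF:
        raise ValueError("code must fit in one byte")
    inverse = code & 0x80
    low = code & 0x7F
    if low < 0x40:             # internal 0x00-0x3F -> ATASCII 0x20-0x5F
        low += 0x20
    elif low < 0x60:           # internal 0x40-0x5F -> ATASCII 0x00-0x1F
        low -= 0x40
    return inverse | low
```

For example, under these assumptions `atascii_to_internal(0x41)` (uppercase A) gives `0x21`, and the two functions round-trip every byte value.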

Atari 8-bit systems have several distinct graphics modes; these modes can be classified as pure text modes, pure graphics modes, or mixed modes. Modes 0, 1 and 2 represent pure text modes, while Modes 3 and above represent mixed or pure graphics modes (the exact number of distinct modes depending on the model). [4] Mode 0 displays characters at the default size, Mode 1 displays them twice as wide (but the same height), and Mode 2 displays them twice as wide and twice the height. [4] Mode 0 is the default graphics mode and supports 128 unique characters in one of two colors (regular or inverse video, depending on the upper bit); Modes 1 and 2 only support 64 unique characters, but support four different colors (as they use the upper two bits as color information instead). [2] The 64 characters available in Modes 1 and 2 are the first 64 characters in the internal code, which correspond to ATASCII codes 32 to 95 (0x20 to 0x5F). [3] This includes all uppercase letters and punctuation, but excludes lowercase letters and graphics characters.
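The bit layout described above can be illustrated with a short Python sketch. It only splits a screen byte into its fields; which colour each 2-bit value selects is not covered here, and the function names are hypothetical.

```python
def split_mode0_byte(value: int) -> tuple[bool, int]:
    """Mode 0: upper bit selects inverse video, lower 7 bits pick one of 128 characters."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    return bool(value & 0x80), value & 0x7F


def split_mode12_byte(value: int) -> tuple[int, int]:
    """Modes 1 and 2: upper two bits are colour information, lower 6 bits pick one of 64 characters."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    return value >> 6, value & 0x3F
```

The same byte therefore reads differently per mode: `0xE1` is an inverse character with index `0x61` in Mode 0, but colour value 3 with character index `0x21` in Modes 1 and 2.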

The Atari screen editor implements the text cursor by simply inverting the character at the cursor position (by XOR with 0x80). It does not flash.

Inverse video

ATASCII only has 128 unique graphic characters, with the upper 128 graphic characters (index 128 to 255) being inverse video variants of the lower 128 graphic characters (index 0 to 127). If the high-order bit is set on a character (i.e., if the byte value of the character is between 128 and 255), the character is generally rendered as the inverse video variant of its counterpart between 0 and 127, using a bitwise negation of the character's glyph. This is done by the ANTIC chip.

Due to this behavior, there is asymmetry in the selection of block-drawing characters. In normal video, there are lower triangles but no upper triangles, a left half block but no right half block, and a lower half block but no upper half block; these ostensibly missing characters can be displayed by using inverse video.
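A minimal Python sketch of this rule follows. It assumes a glyph is stored as eight row bytes, and the `FONT` table holds a single illustrative lower-half-block bitmap rather than actual ROM data; both the table and the function name are hypothetical.

```python
# Illustrative 8x8 bitmap for a lower half block, stored under code 0x15.
# This is sample data for the sketch, not a dump of the Atari character ROM.
FONT = {
    0x15: [0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0xFF],
}


def glyph_rows(code: int, font: dict[int, list[int]]) -> list[int]:
    """Return the eight row bytes for a character code, honouring the inverse bit."""
    rows = font[code & 0x7F]                    # inverse codes share the glyph of code - 128
    if code & 0x80:
        return [row ^ 0xFF for row in rows]     # bitwise negation of every row
    return list(rows)
```

With this sample data, `glyph_rows(0x95, FONT)` yields four filled rows followed by four empty ones, which is the upper half block that has no code of its own in normal video.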

Alternate character sets

The international character set included in the XL and XE models

Atari 8-bit computers, via the ANTIC coprocessor, supported indirection of the character set graphics, allowing a program to redefine the graphical glyphs that appear for each ATASCII character. [2] This can be used as a new font for text, to support an additional character set, or for tile graphics in a video game or other application. Cycling between multiple redefined character sets can be used to provide simple animation at very little CPU cost (in exchange for memory used to store the character set data). Altering a character set in RAM can also be used for animation.

In the XL and XE lines, the Atari OS ROM includes an "international character set" that replaces 29 of the graphical glyphs with Latin alphabetical characters containing diacritics, such as e-acute (é). The OS built into the Atari 1200XL, the only Atari 8-bit model with function keys, allowed users to switch between the standard and alternate character sets by pressing CTRL+F4. [5] Later XL and XE models required the user to update a register in RAM (e.g., via a POKE command in BASIC). [2]
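The indirection amounts to a base pointer plus an offset, which can be sketched in Python. The details below come from general Atari documentation rather than from this article and should be read as assumptions: each glyph occupies 8 bytes, the base is given as a page number (address divided by 256) held in the register at location 756, and page values 224 and 204 select the standard and international sets on XL/XE machines. The function name is hypothetical.

```python
GLYPH_BYTES = 8            # assumed: one byte per row, eight rows per glyph

STANDARD_PAGE = 224        # assumed page number of the standard set (address 0xE000)
INTERNATIONAL_PAGE = 204   # assumed page number of the international set (address 0xCC00)


def glyph_address(base_page: int, internal_code: int) -> int:
    """Address of the first row byte of a glyph, given the character set base page."""
    if not 0 <= base_page <= 0xFF:
        raise ValueError("base_page must fit in one byte")
    # The inverse-video bit does not select a different glyph, so mask it off.
    return base_page * 256 + (internal_code & 0x7F) * GLYPH_BYTES
```

Pointing `base_page` at a RAM copy of the character set is what allows a program to redefine glyphs; swapping the page value between frames gives the animation effect described above.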

In some regions, a different character set was included instead of the default international character set, in order to better accommodate the target market, including Polish, Arabic, and Hebrew. Atari 192XT and 256XT systems distributed in Eastern Europe by P.Z.Karen had a Polish character set in place of the international character set. [6] [7] The Atari 65XE Najm, which was distributed in the Middle East, has an Arabic character encoding as its default encoding and displays text right-to-left, while the international character set was replaced by the standard ATASCII encoding. [7] [8] [9] [10] Hebrew versions of the Atari 600XL and 800XL were distributed in Israel, which had a Hebrew character set in place of the international character set. The Hebrew character set had Hebrew letters instead of lowercase Latin letters, but preserved the uppercase Latin letters. When typing in Hebrew mode, typing Latin letters advances the cursor to the right, while typing Hebrew letters advances the cursor to the left. [7] [11]

Character set

Default graphic characters

The following table shows the default ATASCII character set. Control characters with a graphic representation are displayed using that representation. Each character is shown with a Unicode equivalent.

ATASCII [12] [13]
   0 1 2 3 4 5 6 7 8 9 A B C D E F
0x 🮇 🮂
1x
2x  SP   ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x a b c d e f g h i j k l m n o
7x p q r s t u v w x y z | 🢰
8x 🮅
9x 🮊 EOL
Ax !"#$%&'()*+,-./
Bx0123456789:;<=> 🯄
Cx@ABCDEFGHIJKLMNO
DxPQRSTUVWXYZ[\]^_
Exabcdefghijklmno
Fxpqrstuvwxyz-🢰

The box-drawing characters are arranged relative to their corresponding letter keys on the Atari keyboard, appearing 64 code points earlier than the corresponding uppercase letter. For example, ┌, ┬, and ┐ are the graphics characters found on the top left Q, W, and E keys, and appear 64 code points before those uppercase letters in ATASCII.
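The 64-code-point offset can be checked with a short Python sketch. The function name is hypothetical, and applying the rule to every letter A to Z (rather than only the box-drawing keys named above) is an assumption.

```python
def graphics_code_for_key(letter: str) -> int:
    """ATASCII code of the graphics character that shares a key with an uppercase letter."""
    if len(letter) != 1 or not "A" <= letter <= "Z":
        raise ValueError("expected a single uppercase letter A-Z")
    return ord(letter) - 64    # 64 code points before the letter itself
```

For the keys in the example, Q, W, and E map to 0x11, 0x17, and 0x05 respectively.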

International character set

The following table shows the lower half of the ATASCII international character set. The upper half consists of inverse video variants of the lower half, in exactly the same way as in the standard ATASCII character set.

ATASCII international character set [14]
   0 1 2 3 4 5 6 7 8 9 A B C D E F
0x á ù Ñ É ç ô ò ì £ ï ü ä Ö ú ó ö
1x Ü â û î é è ñ ê å à Å
2x  SP   ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x¡ a b c d e f g h i j k l m n o
7x p q r s t u v w x y z Ä | 🢰
  Differs from standard ATASCII

Control characters

ATASCII has 16 control characters, defined in four separate ranges (0x1B to 0x1F, 0x7D to 0x7F, 0x9B to 0x9F, and 0xFD to 0xFF). [15] This is a key difference from ASCII, which has 32 control characters defined in the single range 0 to 31 (0x00 to 0x1F).

All ATASCII control characters except End of Line (0x9B) have a graphic representation, which can be produced by escaping that character by pressing the Escape key before inputting that control character. [15] For example, typing "Escape" followed by "cursor right" will produce a right arrow. Uniquely, the End of Line control character always renders a newline, regardless of the presence of a preceding escape character. [15]

ATASCII control characters [3]
Hex  Decimal  Function          Keystroke
1B   27       Escape            ESC
1C   28       Cursor Up         CTRL+-
1D   29       Cursor Down       CTRL+=
1E   30       Cursor Left       CTRL++
1F   31       Cursor Right      CTRL+*
7D   125      Clear Screen      CTRL+< or ⇧ Shift+<
7E   126      Delete            ← Backspace
7F   127      Tab               Tab ↹
9B   155      End of Line       RETURN
9C   156      Delete Line       ⇧ Shift+← Backspace
9D   157      Insert Line       ⇧ Shift+>
9E   158      Clear Tab Stop    CTRL+Tab ↹
9F   159      Set Tab Stop      ⇧ Shift+Tab ↹
FD   253      Buzzer            CTRL+2
FE   254      Delete Character  CTRL+← Backspace
FF   255      Insert Character  CTRL+>

Inter-operation

The differences in character representation can cause problems during modem communication between Ataris and other computers. Cursor movement commands (and even carriage returns and line feeds) sent by computers that do not use ATASCII are meaningless to an Atari, and vice versa. Terminal programs therefore need to translate between ATASCII and standard ASCII.

Some Atari-based BBSs exploited this difference by asking the caller to press the Return key. If the BBS received 13 (ASCII CR), it used standard ASCII. If it received 155 (ATASCII End of Line), it switched to ATASCII, allowing full use of the ATASCII graphic set. Some Atari BBSs would also block features (or even block access completely) for non-Atari users. [citation needed]

Text files encoded in ATASCII also need conversion to be viewed on modern PCs and vice versa—utilities are available to facilitate this. [16]
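
A very small converter might look like the following Python sketch. It is not one of the utilities referred to above: it only handles End of Line, Tab, and the characters that the default table shows as identical to ASCII, and it replaces everything else (graphics characters, inverse video, other control codes) with a placeholder. The function name and the placeholder choice are assumptions.

```python
def atascii_text_to_str(data: bytes, placeholder: str = "?") -> str:
    """Convert ATASCII text to a Python string, keeping only ASCII-compatible characters."""
    out = []
    for b in data:
        if b == 0x9B:                       # ATASCII End of Line
            out.append("\n")
        elif b == 0x7F:                     # ATASCII Tab
            out.append("\t")
        elif 0x20 <= b <= 0x5F or 0x61 <= b <= 0x7A or b == 0x7C:
            out.append(chr(b))              # same code point as in ASCII
        else:
            out.append(placeholder)         # no plain-ASCII equivalent
    return "".join(out)
```

A fuller converter would map the graphics characters to their Unicode equivalents instead of discarding them.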

ATASCII animations

The control codes in ATASCII are transmissible to other computers such as BBSs, and crude animations are possible. These animations, also known as "break movies", often take the form of short cartoons, and were a popular feature of Atari BBSs in their heyday. [17]

Because each cursor control operation is represented by a single character (as opposed to the multi-byte sequences common in other schemes, such as ANSI or VT100), these animations are quite easy to make. They can be created by a short BASIC program that captures keyboard commands, echoes them to the screen and saves them to a file. [18] The Atari operating system also allowed commands to be typed and captured. Capturing a sequence correctly took some care, though it typically became easy after a few attempts. The simple capture programs had no editing features, so ATASCII movies frequently contained errors that were corrected by repositioning the cursor and printing over the mistake.
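
To show why single-byte cursor codes make playback simple, here is a rough Python sketch of a player that applies a captured byte stream to a text grid. It is heavily simplified and rests on stated assumptions: a 40×24 grid, cursor movement that wraps at the edges, no scrolling or margins, and only the cursor, Clear Screen, and End of Line codes are handled. The names are hypothetical.

```python
WIDTH, HEIGHT = 40, 24     # assumed screen size for this sketch


def play(data: bytes) -> list[str]:
    """Apply a captured ATASCII byte stream to a blank grid and return the final rows."""
    screen = [[" "] * WIDTH for _ in range(HEIGHT)]
    row = col = 0
    for b in data:
        if b == 0x1C:                       # Cursor Up
            row = (row - 1) % HEIGHT
        elif b == 0x1D:                     # Cursor Down
            row = (row + 1) % HEIGHT
        elif b == 0x1E:                     # Cursor Left
            col = (col - 1) % WIDTH
        elif b == 0x1F:                     # Cursor Right
            col = (col + 1) % WIDTH
        elif b == 0x7D:                     # Clear Screen
            screen = [[" "] * WIDTH for _ in range(HEIGHT)]
            row = col = 0
        elif b == 0x9B:                     # End of Line
            row, col = (row + 1) % HEIGHT, 0
        elif 0x20 <= b <= 0x5F:             # characters shared with ASCII
            screen[row][col] = chr(b)
            col += 1
            if col == WIDTH:                # assumed: wrap to the next row, no scrolling
                row, col = (row + 1) % HEIGHT, 0
        # all other bytes are ignored in this sketch
    return ["".join(line) for line in screen]
```

Playing `b"AB\x1e\x1eC"` leaves `CB` at the start of the first row: the two Cursor Left bytes move back over the text and `C` is printed over the `A`, which is the overprinting style of correction described above.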

References

  1. Card, Orson Scott (1984). "Reading the Keyboard Codes". Compute!'s Third Book of Atari. Compute! Books. ISBN 978-0-942386-18-9.
  2. Wilkinson, Bill (March 1986). "INSIGHT: Atari—Atari Character Codes". Compute!. Vol. 8, no. 70. Compute! Publications. pp. 112–113. ISSN 0194-357X.
  3. Chadwick, Ian (1985). "Appendix 10 - ATASCII And Internal Character Code Values". Mapping the Atari (Revised ed.). Compute! Books. pp. 180–181. ISBN 0874550041.
  4. Halfhill, Tom R. (1982). "The Basics of Atari Graphics". Compute!'s First Book of Atari Graphics. Compute! Books. ISBN 978-0-942386-08-0.
  5. The Atari 1200XL Home Computer Owner's Guide. Atari. 1982.
  6. "ATASCII". Atariki (in Polish). 5 May 2020.
  7. Current, Michael (29 May 2023). "Atari 8-Bit Computers Frequently Asked Questions List". comp.sys.atari.8bit newsgroup. Retrieved 29 November 2023.
  8. Nosty (13 July 2007). "Atari Allacha". Atari Online.pl (in Polish).
  9. Parent, Eric. "ATASCII Character Sets". Joyful Coder. Archived from the original on 16 March 2016.
  10. Savetz, Kevin (19 November 2003). "Exploring the "Star" Arabic Atari 65 XE". Atari 8-Bit Computer WebRing. Retrieved 29 November 2023.
  11. The Modern Atari 8bit computer (11 October 2017). "Hebrew ATARI XL Computer". YouTube. Retrieved 29 November 2023.
  12. Bettencourt, Rebecca G. "ATASCII to Unicode Mapping". Kreative Korp.
  13. Bettencourt, Rebecca (20 April 2018). "ATARI8IG.TXT". L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS (PDF).
  14. Bettencourt, Rebecca (20 April 2018). "ATARI8II.TXT". L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS (PDF).
  15. Atari 400/800: Atari Home Computer Operating System User's Manual. Atari. 1982. pp. 68–70, 183–184.
  16. "ATASCII". Just Solve the File Format Problem.
  17. "AtasciiTube". Break Into Chat.
  18. Ratcliff, Matthew (August 1985). "Atari 'Toons". Antic. Vol. 4, no. 4.