RISC OS character set

The Acorn RISC OS character set was used in the Acorn Archimedes series and subsequent computers from 1987 onwards. It is an extension of ISO/IEC 8859-1, similar to Windows-1252 (CP1252) in that many of the added characters are typographical punctuation marks.

Code page layout (standard)

At 0x83 is a box with a smaller box in its top left-hand corner, meaning "resize window". [1] At 0x84 is a 'bubble-writing' X, meaning "close window". [1] At 0x87 is an unusual character consisting of a subscript 8 followed by a superscript 7; [1] it was not proposed for inclusion in Unicode. [2] At 0x88, 0x89, 0x8A, and 0x8B are left, right, up, and down bubble arrows, used for window scrollbars. [1]
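The code points described above can be collected into a small lookup table. The following Python sketch is illustrative only: the names `WINDOW_GLYPHS` and `describe` are hypothetical, and the descriptions are simply those given in the paragraph above.

```python
# Hypothetical lookup table; descriptions are taken from the paragraph above.
WINDOW_GLYPHS = {
    0x83: "resize window",
    0x84: "close window",
    0x87: "subscript 8 followed by superscript 7",
    0x88: "scrollbar arrow, left",
    0x89: "scrollbar arrow, right",
    0x8A: "scrollbar arrow, up",
    0x8B: "scrollbar arrow, down",
}


def describe(byte: int) -> str:
    """Return the description of a special glyph, or a fallback string."""
    return WINDOW_GLYPHS.get(byte, "no special glyph described")


if __name__ == "__main__":
    for code in sorted(WINDOW_GLYPHS):
        print(f"0x{code:02X}: {describe(code)}")
```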

The following table shows the RISC OS character set.

RISC OS character set [1] [3] [4] [5]
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
1x DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US
2x  SP   ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~ DEL
8x Ŵ ŵ 🯀 Ŷ ŷ 87
9x Œ œ
Ax NBSP ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ SHY ® ¯
Bx ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
Cx À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
Dx Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
Ex à á â ã ä å æ ç è é ê ë ì í î ï
Fx ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ
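Because the set is an extension of ISO/IEC 8859-1, a decoder can treat Latin-1 as the base and override only the code points that differ. The following Python sketch illustrates that idea under stated assumptions: the names `RISCOS_OVERRIDES` and `decode_riscos` are hypothetical, and the override table is deliberately partial, containing only the close-window glyph at 0x84 shown in the table above (assumed here to be U+1FBC0). Any other byte in the range 0x80 to 0x9F is decoded as U+FFFD until a mapping is supplied for it.

```python
# Hypothetical, deliberately partial override table. Only 0x84 is filled in,
# using the close-window glyph shown in the table (assumed to be U+1FBC0).
RISCOS_OVERRIDES = {
    0x84: "\U0001FBC0",
}


def decode_riscos(data: bytes, overrides=RISCOS_OVERRIDES) -> str:
    """Decode bytes using Latin-1 as the base plus per-byte overrides."""
    out = []
    for b in data:
        if b in overrides:
            # Code point that differs from ISO/IEC 8859-1.
            out.append(overrides[b])
        elif 0x80 <= b <= 0x9F:
            # No mapping supplied for this byte: emit the replacement character.
            out.append("\uFFFD")
        else:
            # Identical to ISO/IEC 8859-1 (which includes ASCII).
            out.append(bytes([b]).decode("latin-1"))
    return "".join(out)


if __name__ == "__main__":
    print(decode_riscos(b"caf\xe9 \x84"))
```

Passing a fuller `overrides` dictionary extends the decoder without changing the function itself.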

Code page layout (Electronic Font Foundry 1.1)

The Homerton font, a clone of Helvetica, does not have these characters. Electronic Font Foundry (EFF), a third-party supplier of RISC OS outline fonts, used a different but similar variant of the RISC OS Latin-1 character set.

RISC OS EFF character set [6]
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
1x DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US
2x  SP   ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~ DEL
8x Ŵ Ŷ İ ŵ ŷ ı ƒ
9x Œ œ
Ax NBSP ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ SHY ® ¯
Bx ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
Cx À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
Dx Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
Ex à á â ã ä å æ ç è é ê ë ì í î ï
Fx ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ

Code page layout (TRC)

TRC character set [citation needed]
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
1x DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US
2x  SP   ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~ DEL
8x α ←→ İ ŵ ŷ ı ƒ
9x Œ œ
Ax NBSP ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ SHY ® ¯
Bx ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
Cx À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
Dx Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
Ex à á â ã ä å æ ç è é ê ë ì í î ï
Fx ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ

References

  1. Avis, Ed (2001-03-08). "Acorn RISC OS". Archived from the original on 2020-08-06. Retrieved 2020-03-30.
  2. "7. Characters not proposed", L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS (PDF), 2019-01-04
  3. "ReadMe.txt" (PDF), L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS, 2019-01-04
  4. "RISCOSV.TXT" (PDF), L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS, 2019-01-04
  5. "RISCOSI.TXT" (PDF), L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS, 2019-01-04
  6. "RISCEFF.TXT" (PDF), L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS, 2019-01-04
