MacArabic encoding

MacArabic encoding is an obsolete character encoding that was used in Apple Macintosh computers to represent Arabic (and English) text.

The encoding is identical to MacFarsi encoding except for the numerals.

MacArabic [1]
   0 1 2 3 4 5 6 7 8 9 A B C D E F
2x SP ! " # $ % & ' ( ) * + ,/٬ - ./٫ /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~ DEL
8x Ä NBSP Ç É Ñ Ö Ü á à â ä ں « ç é è
9x ê ë í … î ï ñ ó » ô ö ÷ ú ù û ü
Ax SP ! " # $ ٪ & ' ( ) * + ، - . /
Bx ٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩ : ؛ < = > ؟
Cx ❊ ء آ أ ؤ إ ئ ا ب ة ت ث ج ح خ د
Dx ذ ر ز س ش ص ض ط ظ ع غ [ \ ] ^ _
Ex ـ ف ق ك ل م ن ه و ى ي ً ٌ ٍ َ ُ
Fx ِ ّ ْ پ ٹ چ ە ڤ گ ڈ ڑ { | } ژ ے
Several characters occur twice, once to be treated with strong left-to-right direction and once with strong right-to-left direction; for example, the punctuation in row 2x reappears in row Ax.
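
As an illustration of how the table is read, the following Python sketch decodes a byte string using only part of the layout above. It is a minimal sketch rather than a complete codec: the names `build_partial_table` and `decode_macarabic` are hypothetical, rows 2x to 7x are simplified to plain ASCII, direction information is ignored, and any byte not covered is replaced with U+FFFD. It relies on one regularity visible in the table: the letters and marks at 0xC1–0xDA and 0xE0–0xF2 sit at the byte value plus 0x0560 in Unicode.

```python
def build_partial_table():
    """Build a partial byte -> character map from the table above (sketch only)."""
    # Rows 2x-7x: treated as plain ASCII here (a simplification).
    table = {b: chr(b) for b in range(0x20, 0x7F)}
    table[0xA0] = " "        # row Ax, column 0: space
    table[0xAC] = "\u060C"   # Arabic comma
    table[0xBB] = "\u061B"   # Arabic semicolon
    table[0xBF] = "\u061F"   # Arabic question mark
    # Row Bx: Arabic-Indic digits at 0xB0-0xB9.
    for i in range(10):
        table[0xB0 + i] = chr(0x0660 + i)
    # Letters and marks: 0xC1-0xDA and 0xE0-0xF2 map to byte + 0x0560.
    for b in list(range(0xC1, 0xDB)) + list(range(0xE0, 0xF3)):
        table[b] = chr(b + 0x0560)
    return table


TABLE = build_partial_table()


def decode_macarabic(data):
    """Decode bytes with the partial table; uncovered bytes become U+FFFD."""
    return "".join(TABLE.get(b, "\ufffd") for b in data)


# 0xD3 0xE4 0xC7 0xE5 are the letters seen, lam, alef, meem.
print(decode_macarabic(bytes([0xD3, 0xE4, 0xC7, 0xE5])))
```

A full decoder would need an entry for every one of the 256 byte values, taken from the cited ARABIC.TXT mapping, and would also have to carry the direction of each character.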

References

  1. "ARABIC.TXT". Apple Computer, Inc. Retrieved 25 April 2014.

See also