MacArabic encoding

Last updated

MacArabic encoding is an obsolete encoding for Arabic (and English) text that was used in Apple Macintosh computers to texts.

The encoding is identical to MacFarsi encoding, except the numerals.

MacArabic [1]
0123456789ABCDEF
2x  SP   ! " # $ % & ' ( ) * + ,/٬ - ./٫ /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~ DEL
8xÄ NBSP ÇÉÑÖÜáàâäں«çéè
9xêëíîïñó»ôö÷úùûü
Ax SP !"#$٪&'()*+،-./
Bx٠١٢٣٤٥٦٧٨٩:؛<=>؟
Cxءآأؤإئابةتثجحخد
Dxذرزسشصضطظعغ[\]^_
Exـفقكلمنهوىيًٌٍَُ
Fxِّْپٹچەڤگڈڑ{|}ژے
  Characters which should be treated with strong right-to-left direction.
  Characters which should be treated with strong left-to-right direction.

Related Research Articles

<span class="mw-page-title-main">Arabic alphabet</span> Alphabets for Arabic and other languages

The Arabic alphabet, or Arabic abjad, is the Arabic script as it is codified for writing Arabic. It is written from right to left in a cursive style and includes 28 letters. Most letters have contextual letterforms.

A bidirectional text contains two text directionalities, right-to-left (RTL) and left-to-right (LTR). It generally involves text containing different types of alphabets, but may also refer to boustrophedon, which is changing text direction in each row.

<span class="mw-page-title-main">Plain text</span> Term for computer data consisting only of unformatted characters of readable material

In computing, plain text is a loose term for data that represent only characters of readable material but not its graphical representation nor other objects. It may also include a limited number of "whitespace" characters that affect simple arrangement of text, such as spaces, line breaks, or tabulation characters. Plain text is different from formatted text, where style information is included; from structured text, where structural parts of the document such as paragraphs, sections, and the like are identified; and from binary files in which some portions must be interpreted as binary objects.

<span class="mw-page-title-main">Unicode</span> Character encoding standard

Unicode, formally The Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines as of the current version (15.0) 149,186 characters covering 161 modern and historic scripts, as well as symbols, thousands of emoji, and non-visual control and formatting codes.

The byte order mark (BOM) is a particular usage of the special Unicode character, U+FEFFZERO WIDTH NO-BREAK SPACE, whose appearance as a magic number at the start of a text stream can signal several things to a program reading the text:

<span class="mw-page-title-main">Control key</span> Key on computer keyboards

In computing, a Control keyCtrl is a modifier key which, when pressed in conjunction with another key, performs a special operation. Similarly to the Shift key, the Control key rarely performs any function when pressed by itself. The Control key is located on or near the bottom left side of most keyboards, with many featuring an additional one at the bottom right.

A text file is a kind of computer file that is structured as a sequence of lines of electronic text. A text file exists stored as data within a computer file system. In operating systems such as CP/M and MS-DOS, where the operating system does not keep track of the file size in bytes, the end of a text file is denoted by placing one or more special characters, known as an end-of-file (EOF) marker, as padding after the last line in a text file. On modern operating systems such as Microsoft Windows and Unix-like systems, text files do not contain any special EOF character, because file systems on those operating systems keep track of the file size in bytes. Most text files need to have end-of-line delimiters, which are done in a few different ways depending on operating system. Some operating systems with record-orientated file systems may not use new line delimiters and will primarily store text files with lines separated as fixed or variable length records.

ISO/IEC 8859-8, Information technology — 8-bit single-byte coded graphic character sets — Part 8: Latin/Hebrew alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings. ISO/IEC 8859-8:1999 from 1999 represents its second and current revision, preceded by the first edition ISO/IEC 8859-8:1988 in 1988. It is informally referred to as Latin/Hebrew. ISO/IEC 8859-8 covers all the Hebrew letters, but no Hebrew vowel signs. IBM assigned code page 916 to it. This character set was also adopted by Israeli Standard SI1311:2002, with some extensions.

<span class="mw-page-title-main">Virtual Network Computing</span> Graphical desktop-sharing system

Virtual Network Computing (VNC) is a graphical desktop-sharing system that uses the Remote Frame Buffer protocol (RFB) to remotely control another computer. It transmits the keyboard and mouse input from one computer to another, relaying the graphical-screen updates, over a network.

Mac OS Roman is a character encoding created by Apple Computer, Inc. for use by Macintosh computers. It is suitable for representing text in English and several other Western languages. Mac OS Roman encodes 256 characters, the first 128 of which are identical to ASCII, with the remaining characters including mathematical symbols, diacritics, and additional punctuation marks. Mac OS Roman is an extension of the original Macintosh character set, which encoded only 217 characters. Full support for Mac OS Roman first appeared in System 6.0.4, released in 1989, and the encoding is still supported in current versions of macOS, though the standard character encodings is now UTF-8. Apple modified Mac OS Roman in 1998 with the release of Mac OS 8.5 by replacing the currency sign with the euro sign, but otherwise the encoding has been unchanged since its release.

<span class="mw-page-title-main">ʻOkina</span> Letter of the Latin alphabet

The ʻokina, also called by several other names, is a unicameral consonant letter used within the Latin script to mark the phonemic glottal stop, in many Polynesian languages.

<span class="mw-page-title-main">Complex text layout</span> Neighbour-dependent grapheme positioning

Complex text layout (CTL) or complex text rendering is the typesetting of writing systems in which the shape or positioning of a grapheme depends on its relation to other graphemes. The term is used in the field of software internationalization, where each grapheme is a character.

ASMO 449 is a, now technologically obsolete, 7-bit coded character set to encode the Arabic language.

<span class="mw-page-title-main">Hebrew keyboard</span> Keyboard layout

A Hebrew keyboard comes in two different keyboard layouts. Most Hebrew keyboards are bilingual, with Latin characters, usually in a US Qwerty layout. Trilingual keyboard options also exist, with the third script being Arabic or Russian, due to the sizable Arabic- and Russian-speaking populations in Israel.

The left-to-right mark (LRM) is a control character used in computerized typesetting of text containing a mix of left-to-right scripts and right-to-left scripts. It is used to set the way adjacent characters are grouped with respect to text direction.

‏The right-to-left mark (RLM) is a non-printing character used in the computerized typesetting of bi-directional text containing a mix of left-to-right scripts and right-to-left scripts.

<span class="mw-page-title-main">Universal Character Set characters</span> Complete list of the characters available on most computers

The Unicode Consortium and the ISO/IEC JTC 1/SC 2/WG 2 jointly collaborate on the list of the characters in the Universal Coded Character Set. The Universal Coded Character Set, most commonly called the Universal Character Set, is an international standard to map characters, discrete symbols used in natural language, mathematics, music, and other domains, to unique machine-readable data values. By creating this mapping, the UCS enables computer software vendors to interoperate, and transmit—interchange—UCS-encoded text strings from one to another. Because it is a universal map, it can be used to represent multiple languages at the same time. This avoids the confusion of using multiple legacy character encodings, which can result in the same sequence of codes having multiple interpretations depending on the character encoding in use, resulting in mojibake if the wrong one is chosen.

Many Unicode characters are used to control the interpretation or display of text, but these characters themselves have no visual or spatial representation. For example, the null character is used in C-programming application environments to indicate the end of a string of characters. In this way, these programs only require a single starting memory address for a string, since the string ends once the program reads the null character.

MacFarsi encoding is an obsolete encoding for Farsi/Persian, Urdu texts that was used in Apple Macintosh computers to texts.

Mac OS Hebrew is used in Apple Macintosh computers to represent Hebrew texts.

References

  1. "ARABIC.TXT". Apple Computer, Inc. Retrieved 25 April 2014.

See also