MacFarsi encoding

MacFarsi encoding is an obsolete character encoding that was used on Apple Macintosh computers to represent Farsi/Persian, Urdu (and English) texts.

The encoding is identical to MacArabic encoding except for the numerals (bytes 0xB0 to 0xB9 in the table), which are in the Persian/Urdu style, also known as "Extended" or "Eastern" Arabic-Indic numerals. See Arabic script in Unicode for more details.
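The difference in numerals can be checked directly. The following is a minimal sketch that assumes Python's standard-library codecs `mac_farsi` and `mac_arabic` follow Apple's mapping tables; the helper `codepoints` is a hypothetical name introduced only for this illustration.

```python
# Bytes 0xB0-0xB9 hold the ten digits in both encodings.
DIGIT_BYTES = bytes(range(0xB0, 0xBA))

# Decode the same bytes with each codec (assumed to match Apple's tables).
farsi_digits = DIGIT_BYTES.decode("mac_farsi")
arabic_digits = DIGIT_BYTES.decode("mac_arabic")


def codepoints(text):
    """Format each character of text as a U+XXXX code point label."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Collect every byte value that the two codecs decode differently.
differing = [
    b for b in range(256)
    if bytes([b]).decode("mac_farsi") != bytes([b]).decode("mac_arabic")
]

print(codepoints(farsi_digits))   # Extended Arabic-Indic digits
print(codepoints(arabic_digits))  # Arabic-Indic digits
print([hex(b) for b in differing])
```

With these codecs, the MacFarsi digits fall in the Unicode range U+06F0 to U+06F9, while the MacArabic digits fall in U+0660 to U+0669.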

Mac OS Farsi
   0 1 2 3 4 5 6 7 8 9 A B C D E F
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~ DEL
8x Ä NBSP Ç É Ñ Ö Ü á à â ä ں « ç é è
9x ê ë í … î ï ñ ó » ô ö ÷ ú ù û ü
Ax SP ! " # $ ٪ & ' ( ) * + ، - . /
Bx ۰ ۱ ۲ ۳ ۴ ۵ ۶ ۷ ۸ ۹ : ؛ < = > ؟
Cx ❊ ء آ أ ؤ إ ئ ا ب ة ت ث ج ح خ د
Dx ذ ر ز س ش ص ض ط ظ ع غ [ \ ] ^ _
Ex ـ ف ق ك ل م ن ه و ى ي ً ٌ ٍ َ ُ
Fx ِ ّ ْ پ ٹ چ ە ڤ گ ڈ ڑ { | } ژ ے
  Characters which should be treated with strong right-to-left direction. [1]
  Characters which should be treated with strong left-to-right direction. [1]
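Each byte value in the table is the row digit followed by the column digit, so row Fx, column 3 is byte 0xF3. As a rough illustration of reading the table programmatically, the sketch below again assumes Python's standard-library `mac_farsi` codec matches Apple's mapping; `table_cell` is a hypothetical helper, not part of any library. Note that this codec returns plain Unicode characters and does not carry the directionality information described in the legend.

```python
def table_cell(row: int, column: int) -> str:
    """Return the character at the given row (high nibble) and column (low nibble)."""
    byte_value = (row << 4) | column
    return bytes([byte_value]).decode("mac_farsi")


# Row Fx, column 3: the Persian letter peh.
peh = table_cell(0xF, 0x3)

# Four bytes from rows Cx, Dx and Ex spell the Persian word "salam".
word = bytes([0xD3, 0xE4, 0xC7, 0xE5]).decode("mac_farsi")

print(f"U+{ord(peh):04X}")
print([f"U+{ord(ch):04X}" for ch in word])
```

Rows 2x to 7x are plain ASCII, so `table_cell(0x4, 0x1)` gives the Latin letter A.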

References

  1. "FARSI.TXT". Apple Computer, Inc. Retrieved 21 April 2020.
