ArabTeX

Last updated
The ArabTeX logo ArabTeX logo.svg
The ArabTeX logo

ArabTeX is a free software package providing support for the Arabic and Hebrew alphabets to TeX and LaTeX. Written by Klaus Lagally, it can take romanized ASCII or native script input to produce quality ligatures for Arabic, Persian, Urdu, Pashto, Sindhi, Western Punjabi (Lahnda), Maghribi, Uyghur, Kashmiri, Hebrew, Judeo-Arabic, Ladino and Yiddish. ArabTeX characters are placed within a TeX/LaTeX document using the command \RL{ ... } or the environment \begin{RLtext} ... \end{RLtext}. ArabTeX is released under the LaTeX Project Public License v1+. [1]

Contents

Example

Arabtex as-salam alaikum.png

\novocalize\RL{al-salAm `alaykum}

Bismillah.png

\documentclass[12pt]{article}\usepackage{arabtex}\begin{document}\setarab\fullvocalize\transtrue\arabtrue\begin{RLtext}  bismi al-ll_ahi al-rra.hm_ani al-rra.hImi العربية  \end{RLtext}\end{document}

Common commands

Character table

LetterTransliterationUnicode name
اAARABIC LETTER ALEF
أa'ARABIC LETTER ALEF WITH HAMZA ABOVE
بbARABIC LETTER BEH
تtARABIC LETTER TEH
ث_tARABIC LETTER THEH
جj / ^gARABIC LETTER JEEM
ح.hARABIC LETTER HAH
خx / _hARABIC LETTER KHAH
دdARABIC LETTER DAL
ذ_dARABIC LETTER THAL
رrARABIC LETTER REH
زzARABIC LETTER ZAIN
سsARABIC LETTER SEEN
ش^sARABIC LETTER SHEEN
ص.sARABIC LETTER SAD
ض.dARABIC LETTER DAD
ط.tARABIC LETTER TAH
ظ.zARABIC LETTER ZAH
ع`ARABIC LETTER AIN
غ.gARABIC LETTER GHAIN
فfARABIC LETTER FEH
قqARABIC LETTER QAF
كkARABIC LETTER KAF
لlARABIC LETTER LAM
مmARABIC LETTER MEEM
نnARABIC LETTER NOON
وw / UARABIC LETTER WAW
هhARABIC LETTER HEH
يy / IARABIC LETTER YEH
َaARABIC FATHA
ُu / oARABIC DAMMA
ِi / eARABIC KASRA
پpARABIC LETTER PEH
چ^cARABIC LETTER TCHEH
ژ^zARABIC LETTER JEH
گgARABIC LETTER GAF
ک.kARABIC LETTER KEHEH
یy / I * ARABIC LETTER FARSI YEH
ۀH-iARABIC LETTER HEH WITH YEH
آ'AARABIC LETTER ALEF WITH MADDA ABOVE
ةTARABIC LETTER TEH MARBUTA
ء'ARABIC LETTER HAMZA ABOVE
ئ'yARABIC LETTER YEH WITH HAMZA ABOVE
ؤu'ARABIC LETTER WAW WITH HAMZA ABOVE
ًaNARABIC FATHATAN
ّxxARABIC SHADDA
،,ARABIC COMMA
؛;ARABIC SEMICOLON
؟?ARABIC QUESTION MARK
٪%ARABIC PERCENT SIGN
SPACE
..FULL STOP
-ZERO WIDTH JOINER
\hspace{0ex}ZERO WIDTH NON-JOINER
^* Activated by \setfarsi

Note that one can also overcome the problem with <yah> containing dots using the \yahnodots command.

See also

Related Research Articles

<span class="mw-page-title-main">Arabic alphabet</span>

The Arabic alphabet, or Arabic abjad, is the Arabic script as specifically codified for writing the Arabic language. It is written from right-to-left in a cursive style, and includes 28 letters, of which most have contextual letterforms. The Arabic alphabet is considered an abjad, with only consonants required to be written; due to its optional use of diacritics to notate vowels, it is considered an impure abjad.

The Hebrew alphabet, known variously by scholars as the Ktav Ashuri, Jewish script, square script and block script, is traditionally an abjad script used in the writing of the Hebrew language and other Jewish languages, most notably Yiddish, Ladino, Judeo-Arabic, and Judeo-Persian. In modern Hebrew, vowels are increasingly introduced. It is also used informally in Israel to write Levantine Arabic, especially among Druze. It is an offshoot of the Imperial Aramaic alphabet, which flourished during the Achaemenid Empire and which itself derives from the Phoenician alphabet.

<span class="mw-page-title-main">LaTeX</span> Typesetting system

LaTeX is a software system for typesetting documents. LaTeX markup describes the content and layout of the document, as opposed to the formatted text found in WYSIWYG word processors like Microsoft Word, LibreOffice Writer and Apple Pages. The writer uses markup tagging conventions to define the general structure of a document, to stylise text throughout a document, and to add citations and cross-references. A TeX distribution such as TeX Live or MiKTeX is used to produce an output file suitable for printing or digital distribution.

TeX, stylized within the system as TeX, is a typesetting program which was designed and written by computer scientist and Stanford University professor Donald Knuth and first released in 1978. The term now refers to the system of extensions - which includes software programs called TeX engines, sets of TeX macros, and packages which provide extra typesetting functionality - built around the original TeX language. TeX is a popular means of typesetting complex mathematical formulae; it has been noted as one of the most sophisticated digital typographical systems.

OpenType is a format for scalable computer fonts. Derived from TrueType, it retains TrueType's basic structure but adds many intricate data structures for describing typographic behavior. OpenType is a registered trademark of Microsoft Corporation.

<span class="mw-page-title-main">Device independent file format</span> Typesetting file format

The device independent file format (DVI) is the output file format of the TeX typesetting program, designed by David R. Fuchs and implemented by Donald E. Knuth in 1982. Unlike the TeX markup files used to generate them, DVI files are not intended to be human-readable; they consist of binary data describing the visual layout of a document in a manner not reliant on any specific image format, display hardware or printer. DVI files are typically used as input to a second program which translates DVI files to graphical data. For example, most TeX software packages include a program for previewing DVI files on a user's computer display; this program is a driver. Drivers are also used to convert from DVI to popular page description languages and for printing.

<span class="mw-page-title-main">Arabic diacritics</span> Diacritics used in the Arabic script

The Arabic script has numerous diacritics, which include consonant pointing known as iʻjām (إِعْجَام), and supplementary diacritics known as tashkīl (تَشْكِيل). The latter include the vowel marks termed ḥarakāt.

<span class="mw-page-title-main">Romanian alphabet</span> Variant of the Latin alphabet

The Romanian alphabet is a variant of the Latin alphabet used for writing the Romanian language. It is a modification of the classical Latin alphabet and consists of 31 letters, five of which have been modified from their Latin originals for the phonetic requirements of the language.

When used as a diacritic mark, the term dot refers to the glyphs "combining dot above", and "combining dot below" which may be combined with some letters of the extended Latin alphabets in use in a variety of languages. Similar marks are used with other scripts.

<span class="mw-page-title-main">XeTeX</span> TeX typesetting engine

XeTeX is a TeX typesetting engine using Unicode and supporting modern font technologies such as OpenType, Graphite and Apple Advanced Typography (AAT). It was originally written by Jonathan Kew and is distributed under the X11 free software license.

Kaph is the eleventh letter of the Semitic abjads, including Phoenician kāp 𐤊, Hebrew kāp̄כ‎, Aramaic kāp 𐡊, Syriac kāp̄ ܟ, and Arabic kāfك‎.

Ayin is the sixteenth letter of the Semitic scripts, including Phoenician ʿayin 𐤏, Hebrew ʿayinע‎, Aramaic ʿē 𐡏, Syriac ʿē ܥ, and Arabic ʿaynع‎.

Aleph is the first letter of the Semitic abjads, including Phoenician ʾālep 𐤀, Hebrew ʾālefא‎, Aramaic ʾālap 𐡀, Syriac ʾālap̄ ܐ, Arabic ʾalifا‎, and North Arabian 𐪑. It also appears as South Arabian 𐩱 and Ge'ez ʾälef አ.

<span class="mw-page-title-main">Romanization of Arabic</span> Representation of Arabic in Latin script

The romanization of Arabic is the systematic rendering of written and spoken Arabic in the Latin script. Romanized Arabic is used for various purposes, among them transcription of names and titles, cataloging Arabic language works, language education when used instead of or alongside the Arabic script, and representation of the language in scientific publications by linguists. These formal systems, which often make use of diacritics and non-standard Latin characters and are used in academic settings or for the benefit of non-speakers, contrast with informal means of written communication used by speakers such as the Latin-based Arabic chat alphabet.

Diacritical marks of two dots¨, placed side-by-side over or under a letter, are used in several languages for several different purposes. The most familiar to English-language speakers are the diaeresis and the umlaut, though there are numerous others. For example, in Albanian, ë represents a schwa. Such diacritics are also sometimes used for stylistic reasons.

The computer program pdfTeX is an extension of Knuth's typesetting program TeX, and was originally written and developed into a publicly usable product by Hàn Thế Thành as a part of the work for his PhD thesis at the Faculty of Informatics, Masaryk University, Brno, Czech Republic. The idea of making this extension to TeX was conceived during the early 1990s, when Jiří Zlatuška and Phil Taylor discussed some developmental ideas with Donald Knuth at Stanford University. Knuth later met Hàn Thế Thành in Brno during his visit to the Faculty of Informatics to receive an honorary doctorate from Masaryk University.

<span class="mw-page-title-main">Varieties of Arabic</span> Family of dialects/variants of Arabic language

Varieties of Arabic are the linguistic systems that Arabic speakers speak natively. Arabic is a Semitic language within the Afroasiatic family that originated in the Arabian Peninsula. There are considerable variations from region to region, with degrees of mutual intelligibility that are often related to geographical distance and some that are mutually unintelligible. Many aspects of the variability attested to in these modern variants can be found in the ancient Arabic dialects in the peninsula. Likewise, many of the features that characterize the various modern variants can be attributed to the original settler dialects as well as local native languages and dialects. Some organizations, such as SIL International, consider these approximately 30 different varieties to be separate languages, while others, such as the Library of Congress, consider them all to be dialects of Arabic.

<span class="mw-page-title-main">Moroccan Arabic</span> Vernacular Arabic spoken in Morocco

Moroccan Arabic, also known as Darija, is the dialectal, vernacular form or forms of Arabic spoken in Morocco. It is part of the Maghrebi Arabic dialect continuum and as such is mutually intelligible to some extent with Algerian Arabic and to a lesser extent with Tunisian Arabic. It is spoken by 90.9% of the population of Morocco. While Modern Standard Arabic is used to varying degrees in formal situations such as religious sermons, books, newspapers, government communications, news broadcasts and political talk shows, Moroccan Arabic is the predominant spoken language of the country and has a strong presence in Moroccan television entertainment, cinema and commercial advertising. Moroccan Arabic has many regional dialects and accents as well, with its mainstream dialect being the one used in Casablanca, Rabat, Tangier, Marrakesh and Fez, and therefore it dominates the media and eclipses most of the other regional accents.

<span class="mw-page-title-main">Palestinian Arabic</span> Dialect of Arabic spoken in the State of Palestine

Palestinian Arabic is a dialect continuum of mutually intelligible varieties of Levantine Arabic spoken by Palestinians in Palestine, including the State of Palestine, Israel and in the Palestinian diaspora.

References