International Alphabet of Sanskrit Transliteration IAST | |
---|---|
Script type | romanisation |
Time period | 19th century–present |
Languages | Sanskrit and other Indic Languages |
The International Alphabet of Sanskrit Transliteration (IAST) is a transliteration scheme that allows the lossless romanisation of Indic scripts as employed by Sanskrit and related Indic languages. It is based on a scheme that emerged during the 19th century from suggestions by Charles Trevelyan, William Jones, Monier Monier-Williams and other scholars, and formalised by the Transliteration Committee of the Geneva Oriental Congress, in September 1894. [1] [2] IAST makes it possible for the reader to read the Indic text unambiguously, exactly as if it were in the original Indic script. It is this faithfulness to the original scripts that accounts for its continuing popularity amongst scholars.
Scholars commonly use IAST in publications that cite textual material in Sanskrit, Pāḷi and other classical Indian languages.
IAST is also used for major e-text repositories such as SARIT, Muktabodha, GRETIL, and sanskritdocuments.org.
The IAST scheme represents more than a century of scholarly usage in books and journals on classical Indian studies. By contrast, the ISO 15919 standard for transliterating Indic scripts emerged in 2001 from the standards and library worlds. For the most part, ISO 15919 follows the IAST scheme, departing from it only in minor ways (e.g., ḷ/l̥ and ṛ/r̥)—see comparison below.
The Indian National Library at Kolkata romanization, intended for the romanisation of all Indic scripts, is an extension of IAST.
The Tenth International Congress of Orientalists, held at Geneva in September 1894, adopted the following notations: [2] [3]
a | ā | i | ī | u | ū | ṛ | ṝ | ḷ | ḹ | e | ai | o | au | |
k | kh | g | gh | ṅ | ||||||||||
c | ch | j | jh | ñ | ||||||||||
ṭ | ṭh | ḍ | ḍh | ṇ | ||||||||||
t | th | d | dh | n | ||||||||||
p | ph | b | bh | m | ||||||||||
y | r | l | v | ś | ṣ | s | h | ḻ | ṁ | m̐ | ||||
visarga | ḥ | |||||||||||||
jihvāmūlīya | ẖ | |||||||||||||
upadhmānīya | ḫ |
For the short e and o, the Committee also recommended the use of ĕ and ŏ, respectively.
The IAST letters are listed with their Devanagari equivalents and phonetic values in IPA, valid for Sanskrit, Hindi and other modern languages that use Devanagari script, but some phonological changes have occurred:
Devanāgarī | Transcription (lowercase) | Transcription (capital) | IPA | Category |
---|---|---|---|---|
अ | a | A | ɐ | monophthongs |
आ | ā | Ā | aː | |
इ | i | I | ɪ | |
ई | ī | Ī | iː | |
उ | u | U | ʊ | |
ऊ | ū | Ū | uː | |
ऋ | ṛ | Ṛ | r̩ | syllabic liquids |
ॠ | ṝ | Ṝ | r̩ː | |
ऌ | ḷ | Ḷ | l̩ | |
ॡ | ḹ | Ḹ | l̩ː | |
ए | e | E | e | diphthongs |
ऐ | ai | Ai | ɐi̯ | |
ओ | o | O | o | |
औ | au | Au | ɐu̯ | |
ं | ṁ/ṃ | Ṁ/Ṃ | anusvara | |
ः | ḥ | Ḥ | visarga | |
ँ | m̐ | M̐ | chandrabindu [4] | |
ऽ | ' | avagraha |
velars | palatals | retroflexes | dentals | labials | Category |
---|---|---|---|---|---|
क k K | च c C | ट ṭ Ṭ | त t T | प p P | tenuis stops |
ख kh Kh | छ ch Ch | ठ ṭh Ṭh | थ th Th | फ ph Ph | aspirated stops |
ग g G | ज j J | ड ḍ Ḍ | द d D | ब b B | voiced stops |
घ gh Gh | झ jh Jh | ढ ḍh Ḍh | ध dh Dh | भ bh Bh | breathy-voiced stops |
ङ ṅ Ṅ | ञ ñ Ñ | ण ṇ Ṇ | न n N | म m M | nasal stops |
ह h H* | य y Y | र r R | ल l L | व v V | approximants |
श ś Ś | ष ṣ Ṣ | स s S | sibilants | ||
ळ ḻ Ḻ | approximants (Vedic) [2] |
*H is actually glottal, not velar.
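The tables above can be read as a lookup from Devanagari characters to IAST letters. The following is a minimal sketch of such a transliterator, not a complete tool: the function name `to_iast` and the dictionary names are illustrative, the anusvara is rendered as ṃ (one of the two forms listed above), and nukta letters, numerals, accents and the schwa deletion of modern languages are not handled.

```python
import unicodedata

# Independent vowels, from the vowel table above.
VOWELS = {"अ": "a", "आ": "ā", "इ": "i", "ई": "ī", "उ": "u", "ऊ": "ū",
          "ऋ": "ṛ", "ॠ": "ṝ", "ऌ": "ḷ", "ॡ": "ḹ",
          "ए": "e", "ऐ": "ai", "ओ": "o", "औ": "au"}

# Dependent vowel signs (matras); each replaces the inherent "a".
MATRAS = {"\u093e": "ā", "\u093f": "i", "\u0940": "ī", "\u0941": "u",
          "\u0942": "ū", "\u0943": "ṛ", "\u0944": "ṝ", "\u0962": "ḷ",
          "\u0963": "ḹ", "\u0947": "e", "\u0948": "ai", "\u094b": "o",
          "\u094c": "au"}

# Consonants, from the consonant table above.
CONSONANTS = {"क": "k", "ख": "kh", "ग": "g", "घ": "gh", "ङ": "ṅ",
              "च": "c", "छ": "ch", "ज": "j", "झ": "jh", "ञ": "ñ",
              "ट": "ṭ", "ठ": "ṭh", "ड": "ḍ", "ढ": "ḍh", "ण": "ṇ",
              "त": "t", "थ": "th", "द": "d", "ध": "dh", "न": "n",
              "प": "p", "फ": "ph", "ब": "b", "भ": "bh", "म": "m",
              "य": "y", "र": "r", "ल": "l", "व": "v",
              "श": "ś", "ष": "ṣ", "स": "s", "ह": "h", "ळ": "ḻ"}

# Anusvara, visarga, chandrabindu (m + U+0310) and avagraha.
SIGNS = {"\u0902": "ṃ", "\u0903": "ḥ", "\u0901": "m\u0310", "\u093d": "'"}

VIRAMA = "\u094d"  # suppresses the inherent vowel


def to_iast(text):
    """Transliterate Devanagari text to IAST, one code point at a time."""
    out = []
    pending_a = False  # True while a consonant still awaits its vowel
    for ch in text:
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")  # previous consonant kept its inherent vowel
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in MATRAS:
            out.append(MATRAS[ch])
            pending_a = False
        elif ch == VIRAMA:
            pending_a = False
        else:
            if pending_a:
                out.append("a")
                pending_a = False
            out.append(VOWELS.get(ch) or SIGNS.get(ch) or ch)
    if pending_a:
        out.append("a")
    # Normalise so that letters with diacritics come out precomposed.
    return unicodedata.normalize("NFC", "".join(out))


print(to_iast("देवनागरी"))
```

Under these assumptions, `to_iast("देवनागरी")` yields devanāgarī. The only state the loop needs is whether the last consonant still carries its inherent a, which is what makes the romanisation reversible.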
Some letters are modified with diacritics: Long vowels are marked with an overline (often called a macron). Vocalic (syllabic) consonants, retroflexes and ṣ (/ʂ~ɕ~ʃ/) have an underdot. Two letters have an overdot: ṁ and ṅ (/ŋ/). One has an acute accent: ś (/ʃ/). One letter has a line below: ḻ (/ɭ/) (Vedic).
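The diacritics described above can be inspected programmatically. As a small illustration, assuming only Python's standard `unicodedata` module, the helper below (the name `diacritics` is hypothetical) decomposes a letter and lists the Unicode names of its combining marks. Note that Unicode calls the overline COMBINING MACRON and the line below COMBINING MACRON BELOW.

```python
import unicodedata


def diacritics(letter):
    """Return the Unicode names of the combining marks in one letter."""
    decomposed = unicodedata.normalize("NFD", letter)
    return [unicodedata.name(c) for c in decomposed if unicodedata.combining(c)]


for letter in ["ā", "ṭ", "ṣ", "ṅ", "ś", "ḻ", "ṝ"]:
    print(letter, diacritics(letter))
```

A letter such as ṝ carries two marks, a dot below and a macron, which is why it decomposes into three code points.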
Unlike ASCII-only romanisations such as ITRANS or Harvard-Kyoto, the diacritics used for IAST allow capitalisation of proper names. The capital variants of letters never occurring word-initially (Ṁ Ṇ Ṅ Ñ Ṝ Ḹ) are useful only when writing in all-caps and in Pāṇini contexts for which the convention is to typeset the IT sounds as capital letters.
For the most part, IAST is a subset of ISO 15919 that merges the retroflex (underdotted) liquids with the vocalic ones (ringed below) and the short close-mid vowels with the long ones. The following exceptions are from the ISO standard accommodating an extended repertoire of symbols to allow transliteration of Devanāgarī and other Indic scripts, as used for languages other than Sanskrit.
Devanāgarī | IAST | ISO 15919 | Comment |
---|---|---|---|
ए / े | e | ē (e) | ISO e generally represents short ऎ / ॆ, but optionally represents long ए / े in the Devanagari, Bengali, Gurmukhi, Gujarati, and Odia scripts. |
ओ / ो | o | ō (o) | ISO o generally represents short ऒ / ॊ, but optionally represents long ओ / ो in the Devanagari, Bengali, Gurmukhi, Gujarati, and Odia scripts. |
ऎ / ॆ | ĕ | e | ISO e generally represents short ऎ / ॆ, but optionally represents long ए / े in the Devanagari, Bengali, Gurmukhi, Gujarati, and Odia scripts. The IAST ĕ represents short ऎ / ॆ. |
ऒ / ॊ | ŏ | o | ISO o generally represents short ऒ / ॊ, but optionally represents long ओ / ो in the Devanagari, Bengali, Gurmukhi, Gujarati, and Odia scripts. The IAST ŏ represents short ऒ / ॊ. |
अं / ं | ṃ | ṁ | ISO ṃ represents Gurmukhi tippi ੰ. The Tenth Geneva convention for IAST actually adopted ṁ. |
ऋ / ृ | ṛ | r̥ | ISO ṛ represents ड़ /ɽ/. |
ॠ / ॄ | ṝ | r̥̄ | for consistency with r̥. |
ऌ / ॢ | ḷ | l̥ | ISO ḷ represents ळ /𝼈/. |
ॡ / ॣ | ḹ | l̥̄ | for consistency with l̥. |
ळ | ḻ | ḷ | Used in Vedic Sanskrit only and not found in the Classical variant |
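The differences in the comparison table lend themselves to a mechanical conversion. The sketch below maps lowercase IAST to the strict ISO 15919 forms listed there. The function name `iast_to_iso15919` is illustrative, capital letters are left out for brevity, and the input is assumed to contain only IAST text (an English word containing e or o would also be rewritten). The conversion is done in a single pass so that ḻ becoming ḷ is not then mistaken for vocalic ḷ.

```python
import unicodedata

# IAST letter -> ISO 15919 letter, following the comparison table above.
IAST_TO_ISO = {
    "e": "\u0113",             # e -> ē (long close-mid vowel)
    "o": "\u014d",             # o -> ō
    "\u0115": "e",             # ĕ -> e (short close-mid vowel)
    "\u014f": "o",             # ŏ -> o
    "\u1e43": "\u1e41",        # ṃ -> ṁ (anusvara)
    "\u1e5b": "r\u0325",       # ṛ -> r̥ (r with ring below)
    "\u1e5d": "r\u0325\u0304", # ṝ -> r̥̄
    "\u1e37": "l\u0325",       # ḷ -> l̥
    "\u1e39": "l\u0325\u0304", # ḹ -> l̥̄
    "\u1e3b": "\u1e37",        # ḻ -> ḷ (Vedic retroflex l)
}

_TABLE = str.maketrans(IAST_TO_ISO)


def iast_to_iso15919(text):
    """Convert lowercase IAST text to ISO 15919 in one pass."""
    # NFC first, so every IAST letter is a single precomposed code point.
    return unicodedata.normalize("NFC", text).translate(_TABLE)


print(iast_to_iso15919("saṃskṛtam"))
print(iast_to_iso15919("ṛgveda"))
```

Because `str.translate` looks at each input character exactly once, the order of the dictionary entries does not matter.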
The most convenient method of inputting romanized Sanskrit is by setting up an alternative keyboard layout. This allows one to hold a modifier key to type letters with diacritical marks. For example, alt+a = ā. How this is set up varies by operating system.
Linux/Unix and BSD desktop environments allow one to set up custom keyboard layouts and switch them by clicking a flag icon in the menu bar.
macOS: One can use the pre-installed US International keyboard, or install Toshiya Unebe's Easy Unicode keyboard layout.
Microsoft Windows: Windows also allows one to change keyboard layouts and set up additional custom keyboard mappings for IAST. This Pali keyboard installer [5], made with Microsoft Keyboard Layout Creator (MSKLC), supports IAST (it works on Microsoft Windows up to at least version 10, and the Alt key on the right side of the keyboard can be used instead of the Ctrl+Alt combination).
Many systems provide a way to select Unicode characters visually. ISO/IEC 14755 refers to this as a screen-selection entry method.
Microsoft Windows has provided a Unicode version of the Character Map program (find it by hitting ⊞ Win+R, then type charmap, then hit ↵ Enter) since version NT 4.0 – appearing in the consumer edition since XP. This is limited to characters in the Basic Multilingual Plane (BMP). Characters are searchable by Unicode character name, and the table can be limited to a particular code block. More advanced third-party tools of the same type are also available (a notable freeware example is BabelMap).
macOS provides a "character palette" with much the same functionality, along with searching by related characters, glyph tables in a font, etc. It can be enabled in the input menu in the menu bar under System Preferences → International → Input Menu (or System Preferences → Language and Text → Input Sources) or can be viewed under Edit → Emoji & Symbols in many programs.
Equivalent tools – such as gucharmap (GNOME) or kcharselect (KDE) – exist on most Linux desktop environments.
Users of SCIM on Linux-based platforms can also install and use the sa-itrans-iast input handler, part of the m17n library, which provides complete support for the ISO 15919 standard for the romanization of Indic languages.
Alternatively, users can write IAST with Unicode characters from the Latin-1 Supplement, Latin Extended-A, Latin Extended Additional and Combining Diacritical Marks blocks.
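As a rough illustration of how these blocks fit together, the sketch below builds four IAST letters from a base letter plus a combining mark and normalises them to NFC using Python's standard `unicodedata` module. The helper name `compose` is hypothetical. Three of the results collapse into a precomposed character from one of the Latin blocks, while m̐ has no precomposed form and keeps its mark from the Combining Diacritical Marks block.

```python
import unicodedata

MACRON = "\u0304"       # COMBINING MACRON
DOT_BELOW = "\u0323"    # COMBINING DOT BELOW
TILDE = "\u0303"        # COMBINING TILDE
CANDRABINDU = "\u0310"  # COMBINING CANDRABINDU


def compose(base, *marks):
    """Join a base letter with combining marks and normalise to NFC."""
    return unicodedata.normalize("NFC", base + "".join(marks))


for base, mark in [("n", TILDE), ("a", MACRON), ("t", DOT_BELOW), ("m", CANDRABINDU)]:
    result = compose(base, mark)
    # Print the letter with the code points it consists of after NFC.
    print(result, [f"U+{ord(c):04X}" for c in result])
```

Here ñ lands in Latin-1 Supplement (U+00F1), ā in Latin Extended-A (U+0101) and ṭ in Latin Extended Additional (U+1E6D). Normalising text to NFC before comparing or searching avoids mismatches between precomposed and decomposed spellings of the same letter.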
Only certain fonts support all the Latin Unicode characters essential for the transliteration of Indic scripts according to the IAST and ISO 15919 standards.
For example, the Arial, Tahoma and Times New Roman font packages that come with Microsoft Office 2007 and later versions also support precomposed Unicode characters like ī.
Many other text fonts commonly used for book production may be lacking in support for one or more characters from this block. Accordingly, many academics working in the area of Sanskrit studies make use of free OpenType fonts such as FreeSerif or Gentium, both of which have complete support for the full repertoire of conjoined diacritics in the IAST character set. Released under the GNU FreeFont or SIL Open Font License, respectively, such fonts may be freely shared and do not require the person reading or editing a document to purchase proprietary software to make use of its associated fonts.
Devanagari is an Indic script used in the Indian subcontinent. Also simply called Nāgari, it is a left-to-right abugida, based on the ancient Brāhmi script. It is one of the official scripts of the Republic of India and Nepal. It was developed and in regular use by the 8th century CE and achieved its modern form by 1000 CE. The Devanāgari script, composed of 48 primary characters, including 14 vowels and 34 consonants, is the fourth most widely adopted writing system in the world, being used for over 120 languages.
The Sinhala script, also known as Sinhalese script, is a writing system used by the Sinhalese people and most Sri Lankans in Sri Lanka and elsewhere to write the Sinhala language as well as the liturgical languages Pali and Sanskrit. The Sinhalese Akṣara Mālāva, one of the Brahmic scripts, is a descendant of the Ancient Indian Brahmi script. It is also related to the Grantha script.
The Kannada script is an abugida of the Brahmic family, used to write Kannada, one of the Dravidian languages of South India, especially in the state of Karnataka. It is one of the official scripts of the Indian Republic. Kannada script is also widely used for writing Sanskrit texts in Karnataka. Several minor languages, such as Tulu, Konkani, Kodava, Beary and Sanketi, also use alphabets based on the Kannada script. The Kannada and Telugu scripts share very high mutual intelligibility with each other, and are often considered to be regional variants of a single script. Other scripts similar to Kannada script are Sinhala script and Old Peguan script (used in Burma).
The Brahmic scripts, also known as Indic scripts, are a family of abugida writing systems. They are used throughout the Indian subcontinent, Southeast Asia and parts of East Asia. They are descended from the Brahmi script of ancient India and are used by various languages in several language families in South, East and Southeast Asia: Indo-Aryan, Dravidian, Tibeto-Burman, Mongolic, Austroasiatic, Austronesian, and Tai. They were also the source of the dictionary order (gojūon) of Japanese kana.
The Thai script is the abugida used to write Thai, Southern Thai and many other languages spoken in Thailand. The Thai script itself has 44 consonant symbols, 16 vowel symbols that combine into at least 32 vowel forms, four tone diacritics, and other diacritics.
Malayalam script is a Brahmic script used commonly to write Malayalam, which is the principal language of Kerala, India, spoken by 45 million people in the world. It is a Dravidian language spoken in the Indian state of Kerala and the union territories of Lakshadweep and Puducherry by the Malayali people. It is one of the official scripts of the Indian Republic. Malayalam script is also widely used for writing Sanskrit texts in Kerala.
In linguistics, romanization is the conversion of text from a different writing system to the Roman (Latin) script, or a system for doing so. Methods of romanization include transliteration, for representing written text, and transcription, for representing the spoken word, and combinations of both. Transcription methods can be subdivided into phonemic transcription, which records the phonemes or units of semantic meaning in speech, and more strict phonetic transcription, which records speech sounds with precision.
Anusvara, also known as Bindu, is a symbol used in many Indic scripts to mark a type of nasal sound, typically transliterated ⟨ṃ⟩ or ⟨ṁ⟩ in standards like ISO 15919 and IAST. Depending on its location in the word and the language for which it is used, its exact pronunciation can vary. In the context of ancient Sanskrit, anusvara is the name of the particular nasal sound itself, regardless of written representation.
Devanagari is an Indic script used for many Indo-Aryan languages of North India and Nepal, including Hindi, Marathi and Nepali, which was the script used to write Classical Sanskrit. There are several somewhat similar methods of transliteration from Devanagari to the Roman script, including the influential and lossless IAST notation. Romanised Devanagari is also called Romanagari.
The National Library at Kolkata romanisation is a widely used transliteration scheme in dictionaries and grammars of Indic languages. This transliteration scheme is also known as (American) Library of Congress and is nearly identical to one of the possible ISO 15919 variants. The scheme is an extension of the IAST scheme that is used for transliteration of Sanskrit.
The Harvard-Kyoto Convention is a system for transliterating Sanskrit and other languages that use the Devanāgarī script into ASCII. It is predominantly used informally in e-mail, and for electronic texts.
ISO 15919 is an international standard for the romanization of Brahmic and Nastaliq scripts. Published in 2001, it is part of a series of international standards by the International Organization for Standardization.
The "Indian languages TRANSliteration" (ITRANS) is an ASCII transliteration scheme for Indic scripts, particularly for the Devanagari script.
There are several romanisation schemes for the Malayalam script, including ITRANS and ISO 15919.
Romanisation of Bengali is the representation of written Bengali language in the Latin script. Various romanisation systems for Bengali are used, most of which do not perfectly represent Bengali pronunciation. While different standards for romanisation have been proposed for Bengali, none has been adopted with the same degree of uniformity as Japanese or Sanskrit.
Indic Computing means "computing in Indic", i.e., Indian Scripts and Languages. It involves developing software in Indic Scripts/languages, Input methods, Localization of computer applications, web development, Database Management, Spell checkers, Speech to Text and Text to Speech applications and OCR in Indian languages.
Clip fonts or split fonts are non-Unicode fonts that assign glyphs of Brahmic scripts, such as Devanagari, at code positions intended for glyphs of the Latin script or to produce glyphs not found in Unicode by using its Private Use Area (PUA).
The Sanskrit Library Phonetic basic encoding scheme (SLP1) is an ASCII transliteration scheme for the Sanskrit language from and to the Devanagari script.
The Velthuis system of transliteration is an ASCII transliteration scheme for the Sanskrit language from and to the Devanagari script. It was developed in about 1983 by Frans Velthuis, a scholar living in Groningen, Netherlands, who created a popular, high-quality software package in LaTeX for typesetting Devanāgarī. The primary documentation for the scheme is the system's clearly written software manual. It is based on using the ISO 646 repertoire to represent mnemonically the accents used in standard scholarly transliteration.
Meitei input methods are the methods that allow users of computers to input texts in the Meitei script, systematically for Meitei language.