The Sanskrit Library Phonetic basic encoding scheme (SLP1) is an ASCII transliteration scheme for the Sanskrit language from and to the Devanagari script.
Unlike other transliteration schemes for Sanskrit, it can represent not only the basic Devanagari letters, but also phonetic segments, phonetic features and punctuation. SLP1 also describes how to encode classical and Vedic Sanskrit.
One of the main advantages of SLP1 is that each Devanagari letter used in Sanskrit maps to exactly one ASCII character, which makes conversion between ASCII and Devanagari straightforward. For example, the Harvard-Kyoto transliteration uses the single character "D" to represent "ड" and the two-character combination "Dh" to represent "ढ". SLP1, in contrast, always uses a single character: "q" for "ड" and "Q" for "ढ". The cost of such one-character mappings is readability: while convenient for the design of transliteration conversion functions, SLP1 text tends to be hard to read until it is converted back to either Devanagari or the widely used IAST romanization scheme.
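The practical difference can be sketched in a few lines of Python. This is only an illustration using the two letters mentioned above; the function names `hk_tokens` and `slp1_tokens` and the two tiny lookup tables are assumptions made for this example, not part of either scheme's specification. A Harvard-Kyoto reader has to look ahead for "Dh" before accepting "D", whereas an SLP1 reader can treat every character as a complete unit.

```python
# Illustrative subsets only: the two letters discussed in the text.
HK = {"D": "ड", "Dh": "ढ"}      # Harvard-Kyoto: one- and two-character codes
SLP1 = {"q": "ड", "Q": "ढ"}     # SLP1: always one character per letter


def hk_tokens(text):
    """Split Harvard-Kyoto text into codes, preferring the longest match."""
    tokens = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in HK:              # a two-character code such as "Dh"
            tokens.append(pair)
            i += 2
        else:                       # fall back to a single character
            tokens.append(text[i])
            i += 1
    return tokens


def slp1_tokens(text):
    """Split SLP1 text into codes: every character is its own code."""
    return list(text)


print([HK[t] for t in hk_tokens("DhD")])
print([SLP1[t] for t in slp1_tokens("Qq")])
```

Both calls yield the same two letters, but only the Harvard-Kyoto version needs lookahead to get there.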
The tables in the following sections are taken from Peter Scharf's May 2008 talk. [1]
SLP1 was formally introduced in the book Linguistic Issues in Encoding Sanskrit by Peter M. Scharf and Malcolm D. Hyman [2] as part of the Sanskrit Library project.
अ | आ | इ | ई | उ | ऊ | ए | ऐ | ओ | औ |
a | A | i | I | u | U | e | E | o | O |
The numeral "3" is suffixed to denote a prolonged vowel (pluta svara). For example, ओ३म् = o3m. Similarly, the numeral "1" is suffixed to denote a short "e" and "o", as in Dravidian: ऎ = e1, ऒ = o1. "1" and "3" are also used after a short and a long agitated kampa, respectively. Avagraha (ऽ) is represented by a single quote (').
ऋ | ॠ | ऌ | ॡ |
f | F | x | X |
अं | अः |
M | H |
Anunasika is represented by a tilde. For example, माँ = mA~. Jihvamuliya and upadhmaniya are encoded as "Z" and "V" respectively.
क | ख | ग | घ | ङ | Velar |
k | K | g | G | N | |
च | छ | ज | झ | ञ | Palatal |
c | C | j | J | Y | |
ट | ठ | ड | ढ | ण | Retroflex |
w | W | q | Q | R | |
त | थ | द | ध | न | Dental |
t | T | d | D | n | |
प | फ | ब | भ | म | Labial |
p | P | b | B | m | |
य | र | ल | व | | Semi-vowel |
y | r | l | v | | |
श | ष | स | ह | ळ | Fricative |
S | z | s | h | L | |
Udatta, anudatta and svarita are encoded as "/", "\" and "^" respectively.
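As a rough illustration of how the tables above can drive a converter, the following Python sketch turns SLP1 text into Devanagari. The letter values come from the tables in this article; the dependent vowel signs and the virama are the standard Unicode Devanagari characters, which the tables do not list. The function name `slp1_to_devanagari` is an assumption for this example, and the sketch deliberately ignores the numeral suffixes, accent marks, jihvamuliya and upadhmaniya, all of which pass through unchanged.

```python
# Consonants and independent vowels, in the order of the tables above.
CONSONANTS = dict(zip("kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzshL",
                      "कखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसहळ"))
VOWELS = dict(zip("aAiIuUfFxXeEoO", "अआइईउऊऋॠऌॡएऐओऔ"))

# Dependent vowel signs (matras), written as code points for clarity.
MATRAS = dict(zip(
    "AiIuUfFxXeEoO",
    "\u093e\u093f\u0940\u0941\u0942\u0943\u0944\u0962\u0963"
    "\u0947\u0948\u094b\u094c"))
MATRAS["a"] = ""            # short "a" is inherent in the consonant letter

# Anusvara, visarga, anunasika (candrabindu) and avagraha.
OTHERS = {"M": "\u0902", "H": "\u0903", "~": "\u0901", "'": "\u093d"}
VIRAMA = "\u094d"           # suppresses the inherent vowel


def slp1_to_devanagari(text):
    """Convert SLP1 text to Devanagari, one input character at a time."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in MATRAS:
                out.append(MATRAS[nxt])   # vowel attaches as a sign
                i += 1                    # the vowel has been consumed
            else:
                out.append(VIRAMA)        # no vowel follows
        elif ch in VOWELS:
            out.append(VOWELS[ch])        # vowel not preceded by a consonant
        else:
            out.append(OTHERS.get(ch, ch))  # unknown characters pass through
        i += 1
    return "".join(out)


print(slp1_to_devanagari("saMskftam"))
print(slp1_to_devanagari("mA~"))
```

Because every SLP1 letter is a single character, the only lookahead needed is the one that decides between a vowel sign and a virama after a consonant.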