Tibetan script

Time period: c. 650–present
Direction: left-to-right
Sister systems: Sharada, Siddhaṃ
ISO 15924: Tibt, 330, Tibetan
Unicode range: U+0F00–U+0FFF (Final Accepted Script Proposal of the First Usable Edition (3.0))

The Tibetan script is a segmental writing system (abugida) of Indic origin used to write certain Tibetic languages, including Tibetan, Dzongkha, Sikkimese, Ladakhi, Jirel and sometimes Balti. It has also been used for some non-Tibetic languages in close cultural contact with Tibet, such as Thakali. [4] The printed form is called uchen script, while the hand-written cursive form used in everyday writing is called umê script.


The script is closely linked to a broad ethnic Tibetan identity, spanning areas in India, Nepal, Bhutan and Tibet. [5] It is ancestral to the Limbu script, the Lepcha script [6] and the multilingual ʼPhags-pa script. [6]


The creation of the Tibetan alphabet is attributed to Thonmi Sambhota, a minister of Songtsen Gampo in the mid-7th century. Tradition holds that he was sent to India to study the art of writing and to find a writing system suitable for the Tibetan language. Upon his return, he introduced an alphabet based on the Nagari script that was used in Kashmir at the time. The script had 30 consonantal characters, of which 6 were created specifically to match Tibetan phonology. [7] The origin is still debated, however; other studies suggest that the Tibetan script was based on an adaptation of the Indian Brahmi and Gupta scripts from Khotan, taught to Thonmi Sambhota in Kashmir. [8]

Three orthographic standardisations were developed. The most important, an official orthography intended to facilitate the translation of Buddhist scriptures, emerged during the early 9th century. Standard orthography has not changed since then, while the spoken language has, for example by losing complex consonant clusters. As a result, in all modern Tibetan dialects, and in particular in the Standard Tibetan of Lhasa, there is a great divergence between current spelling (which still reflects 9th-century spoken Tibetan) and current pronunciation. This divergence is the basis of an argument in favour of spelling reform, to write Tibetan as it is pronounced; for example, writing Kagyu instead of Bka'-rgyud. In contrast, the pronunciation of the Balti, Ladakhi and Purigi languages adheres more closely to the original spelling. Purigi is the closest linguistic heir of Old Tibetan, and its pronunciation therefore adheres most closely to Old Tibetan.

Recent research suggests that one or more Tibetan scripts were in use before the introduction of the current script by Songtsen Gampo and Thonmi Sambhota. The Dunhuang (Tunhong) manuscripts are key evidence for this hypothesis. [9]


Basic alphabet

In the Tibetan script, the syllables are written from left to right. Syllables are separated by a tsek (་); since many Tibetan words are monosyllabic, this mark often functions almost as a space. Spaces are not used to divide words.
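Because the tsek delimits syllables rather than words, a first step in processing Tibetan text is often to split on it. The following is a minimal Python sketch of that idea; the function name `split_syllables` and the sample string are illustrative assumptions, and the sketch does not handle other punctuation.

```python
TSEK = "\u0f0b"  # U+0F0B, the tsek mark (་)

def split_syllables(text: str) -> list[str]:
    """Split Tibetan text into syllables at each tsek, dropping empty pieces."""
    return [piece for piece in text.split(TSEK) if piece]

# Two syllables written with code point escapes (bod, then skad),
# each followed by a tsek.
sample = "\u0f56\u0f7c\u0f51" + TSEK + "\u0f66\u0f90\u0f51" + TSEK

syllables = split_syllables(sample)
print(len(syllables), syllables)
```

Splitting on the tsek yields syllables, not words; finding word boundaries needs a dictionary or a trained segmenter, since spaces are not used to divide words.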

The Tibetan alphabet has thirty basic letters, sometimes known as "radicals", for consonants. [6] As in other Indic scripts, each consonant letter assumes an inherent vowel; in the Tibetan script it is /a/. The letter is also the base for dependent vowel marks.

Although some Tibetan dialects are tonal, the language had no tone at the time of the script's invention, and there are no dedicated symbols for tone. However, since tones developed from segmental features, they can usually be correctly predicted by the archaic spelling of Tibetan words.

Series    Letter  IPA     Letter  IPA      Letter  IPA      Letter  IPA
Guttural  ཀ       /ka/    ཁ       /kʰa/    ག       /ɡa*/    ང       /ŋa/
Palatal   ཅ       /tʃa/   ཆ       /tʃʰa/   ཇ       /dʒa*/   ཉ       /ɲa/
Dental    ཏ       /ta/    ཐ       /tʰa/    ད       /da*/    ན       /na/
Labial    པ       /pa/    ཕ       /pʰa/    བ       /ba*/    མ       /ma/
* – Only in loanwords

Consonant clusters

Tibetan map of the Kizil Caves, Tarim Basin. 13th century CE

A distinctive feature of the Tibetan script is that consonants can be written either as radicals or in other forms, such as subscript and superscript, forming consonant clusters.

To see how this works, consider the radical ཀ /ka/ and what happens when it becomes ཀྲ /kra/ or རྐ /rka/. In both cases, the symbol for /ka/ is used, but when /ra/ comes between the consonant and the vowel, it is added as a subscript. When /ra/ comes before the consonant and vowel, it is added as a superscript. [6] /ra/ changes form when it is above most other consonants, as in རྐ /rka/; an exception is the cluster རྙ /rɲa/. Similarly, the consonants /wa/, /ra/, and /ja/ change form when they are beneath other consonants; thus ཀྭ /kwa/; ཀྲ /kra/; ཀྱ /kja/.
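In Unicode text, these stacks are encoded as a base letter followed by a "subjoined" form of the letter written beneath it. The sketch below, in Python, builds the two clusters discussed above from individual code points and prints the official character names; the variable names are illustrative only.

```python
import unicodedata

KA = "\u0f40"            # TIBETAN LETTER KA
RA = "\u0f62"            # TIBETAN LETTER RA
SUBJOINED_KA = "\u0f90"  # TIBETAN SUBJOINED LETTER KA
SUBJOINED_RA = "\u0fb2"  # TIBETAN SUBJOINED LETTER RA

# /kra/: the radical ka comes first, with ra written beneath it.
kra = KA + SUBJOINED_RA

# /rka/: ra is written on top, so it comes first in the encoding
# and the radical ka follows in its subjoined form.
rka = RA + SUBJOINED_KA

for cluster in (kra, rka):
    print(cluster, [unicodedata.name(ch) for ch in cluster])
```

The encoding order follows the top-to-bottom order of the stack, so the radical is not always the first code point of a cluster.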

Besides being written as subscripts and superscripts, some consonants can also be placed in prescript, postscript, or post-postscript positions. For instance, the consonants /ʰka/, /ʰta/, /ʰpa/, /ma/ and /a/ can be used in the prescript position to the left of other radicals, while the postscript position, after a radical, can be held by the ten consonants /ʰka/, /na/, /ʰpa/, /ʰta/, /ma/, /a/, /ra/, /ŋa/, /sa/, and /la/. The third position, the post-postscript position, is reserved for the consonants /ʰta/ and /sa/. [6]

Head letters

The superscript position above a radical is reserved for the consonants /ra/, /la/, and /sa/.

  • When /ra/, /la/, and /sa/ are in superscript position with /ka/, /tʃa/, /ta/, /pa/ and /tsa/, there are no changes in the sound. They look and sound like:
    • རྐ /ka/, རྟ /ta/, རྤ /pa/, རྩ /tsa/
    • ལྐ /ka/, ལྕ /tʃa/, ལྟ /ta/, ལྤ /pa/
    • སྐ /ka/, སྕ /tʃa/, སྟ /ta/, སྤ /pa/, སྩ /tsa/
  • When /ra/, /la/, and /sa/ are in superscript position with /ʰka/, /ʰtʃa/, /ʰta/, /ʰpa/ and /ʰtsa/, the radicals lose their aspiration and become voiced. They look and sound like:
    • རྒ /ga/, རྗ /d͡ʒa/, རྡ /da/, རྦ /ba/, རྫ /dza/
    • ལྒ /ga/, ལྗ /d͡ʒa/, ལྡ /da/, ལྦ /ba/
    • སྒ /ga/, སྗ /d͡ʒa/, སྡ /da/, སྦ /ba/, སྫ /dza/
  • When /ra/, /la/, and /sa/ are in superscript position with /ŋa/, /ɲa/, /na/ and /ma/, the nasal takes a high tone. They look and sound like:
    • རྔ /ŋa/, རྙ /ɲa/, རྣ /na/, རྨ /ma/
    • ལྔ /ŋa/, ལྨ /ma/
    • སྔ /ŋa/, སྙ /ɲa/, སྣ /na/, སྨ /ma/
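The head-letter stacks listed above can be generated programmatically. The Python sketch below relies on an observation about the Unicode layout, verified inside the code for the letters used: each subjoined letter sits 0x50 code points above its base letter. The helper names `subjoined` and `head_stack` are hypothetical, and only three radicals are covered.

```python
import unicodedata

SUBJOIN_OFFSET = 0x50  # e.g. U+0F40 KA -> U+0F90 SUBJOINED KA

HEADS = {"ra": "\u0f62", "la": "\u0f63", "sa": "\u0f66"}
RADICALS = {"ka": "\u0f40", "ta": "\u0f4f", "pa": "\u0f54"}

def subjoined(letter: str) -> str:
    """Return the subjoined counterpart of a base consonant letter."""
    return chr(ord(letter) + SUBJOIN_OFFSET)

def head_stack(head: str, radical: str) -> str:
    """Encode a head letter followed by the radical in subjoined form."""
    return HEADS[head] + subjoined(RADICALS[radical])

# Check the offset against the official character names before relying on it.
for letter in RADICALS.values():
    expected = unicodedata.name(letter).replace("LETTER", "SUBJOINED LETTER")
    assert unicodedata.name(subjoined(letter)) == expected

for head in HEADS:
    print(head, [head_stack(head, radical) for radical in RADICALS])
```

The sketch only produces the character sequences; whether a given combination is a valid Tibetan spelling is a matter of orthography, not encoding.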

Sub-joined letters

The subscript position under a radical is for the consonants /ja/, /ra/, /la/, and /wa/.

Vowel marks

The vowels used in the alphabet are ཨ /a/, ཨི /i/, ཨུ /u/, ཨེ /e/, and ཨོ /o/. While the vowel /a/ is included in each consonant or radical, the other vowels are indicated by marks; thus ཀ /ka/, ཀི /ki/, ཀུ /ku/, ཀེ /ke/, ཀོ /ko/. The vowels ཨི /i/, ཨེ /e/, and ཨོ /o/ are placed above consonants as diacritics, while the vowel ཨུ /u/ is placed underneath consonants. [6] Old Tibetan included a reversed form of the mark for /i/, the gigu 'verso', of uncertain meaning. There is no distinction between long and short vowels in written Tibetan, except in loanwords, especially those transcribed from Sanskrit.
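The inherent-vowel rule can be illustrated with a short Python sketch: a bare consonant already carries /a/, and any other vowel is expressed by appending one combining mark. The dictionary and function names are illustrative assumptions, not part of any standard library for Tibetan.

```python
import unicodedata

# Combining vowel signs; /a/ is inherent and has no mark of its own.
VOWEL_SIGNS = {
    "i": "\u0f72",  # TIBETAN VOWEL SIGN I (written above)
    "u": "\u0f74",  # TIBETAN VOWEL SIGN U (written below)
    "e": "\u0f7a",  # TIBETAN VOWEL SIGN E (written above)
    "o": "\u0f7c",  # TIBETAN VOWEL SIGN O (written above)
}

KA = "\u0f40"  # TIBETAN LETTER KA

def with_vowel(consonant: str, vowel: str) -> str:
    """Attach a vowel to a consonant; 'a' leaves the consonant unchanged."""
    if vowel == "a":
        return consonant
    return consonant + VOWEL_SIGNS[vowel]

for vowel in "aiueo":
    syllable = with_vowel(KA, vowel)
    print(vowel, syllable, [unicodedata.name(ch) for ch in syllable])
```

In the encoded text the vowel sign always follows the consonant, regardless of whether it is drawn above or below it.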



Tibetan numerals     ༠ ༡ ༢ ༣ ༤ ༥ ༦ ༧ ༨ ༩
Devanagari numerals  ० १ २ ३ ४ ५ ६ ७ ८ ९
Arabic numerals      0 1 2 3 4 5 6 7 8 9
Tibetan fractions
Arabic fractions-
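Since the ten Tibetan digits occupy consecutive code points starting at U+0F20, conversion to and from Arabic numerals is a simple character mapping. The following Python sketch shows one way to do it for non-negative integers; the function names are hypothetical, and fractions are not handled.

```python
TIBETAN_ZERO = 0x0F20  # TIBETAN DIGIT ZERO; digits one to nine follow in order

# Translation table from ASCII digits to Tibetan digits.
TO_TIBETAN = {ord(str(d)): chr(TIBETAN_ZERO + d) for d in range(10)}

def to_tibetan_digits(number: int) -> str:
    """Render a non-negative integer using Tibetan digits."""
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    return str(number).translate(TO_TIBETAN)

def from_tibetan_digits(text: str) -> int:
    """Parse Tibetan digits; Python's int() accepts Unicode decimal digits."""
    return int(text)

print(to_tibetan_digits(650))
print(from_tibetan_digits("\u0f21\u0f29\u0f29\u0f26"))
```

Parsing needs no lookup table because Tibetan digits carry the Unicode decimal-digit property, which `int()` understands directly.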


Symbol  Name              Function
༄       yig mgo           marks beginning of text
༈       sbrul shad        separates sections of meaning equivalent to topics and sub-topics
༉       bskur yig mgo     list enumerator (Dzongkha)
་       tsek              morpheme delimiter
།       shad              full stop (marks end of a section of text)
༎       nyis shad         full stop (marks end of a whole topic)
༴       bsdus rtags
༺       gug rtags g.yon   left bracket
༻       gug rtags g.yas   right bracket
༼       ang khang g.yon   left bracket used for bracketing with a roof over
༽       ang khang g.yas   right bracket used for bracketing with a roof over

Extended use

A text in Tibetan script suspected to be Sanskrit in content. From the personal artifact collection of Donald Weir.

The Tibetan alphabet, when used to write other languages such as Balti and Sanskrit, often has additional and/or modified graphemes taken from the basic Tibetan alphabet to represent different sounds.

Extended alphabet

Letter  Used in   Romanization & IPA
ཫ       Balti     qa /qa/ (/q/)
ཬ       Balti     ɽa /ɽa/ (/ɽ/)
ཁ༹      Balti     xa /χa/ (/χ/)
ག༹      Balti     ɣa /ʁa/ (/ʁ/)
གྷ       Sanskrit  gha /ɡʱ/
ཛྷ       Sanskrit  jha /ɟʱ, d͡ʒʱ/
ཊ       Sanskrit  ṭa /ʈ/
ཋ       Sanskrit  ṭha /ʈʰ/
ཌ       Sanskrit  ḍa /ɖ/
ཌྷ       Sanskrit  ḍha /ɖʱ/
ཎ       Sanskrit  ṇa /ɳ/
དྷ       Sanskrit  dha /d̪ʱ/
བྷ       Sanskrit  bha /bʱ/
ཥ       Sanskrit  ṣa /ʂ/
ཀྵ       Sanskrit  kṣa /kʂ/

Extended vowel marks and modifiers

Vowel mark  Used in   Romanization & IPA
ཱ           Sanskrit  ā /ā/
ཱི          Sanskrit  ī /ī/
ཱུ          Sanskrit  ū /ū/
ཻ           Sanskrit  ai /ai/
ཽ           Sanskrit  au /au/
ྲྀ          Sanskrit  ṛ /ṛ/
ྲཱྀ         Sanskrit  ṝ /ṝ/
ླྀ          Sanskrit  ḷ /ḷ/
ླཱྀ         Sanskrit  ḹ /ḹ/
ཾ           Sanskrit  aṃ /ṃ/
ྃ           Sanskrit  aṃ /ṃ/
ཿ           Sanskrit  aḥ /ḥ/

Symbol  Name      Used in   Function
྄       srog med  Sanskrit  suppresses the inherent vowel sound
྅       paluta    Sanskrit  used for prolonging vowel sounds

Romanization and transliteration

Romanization and transliteration of the Tibetan script is the representation of the Tibetan script in the Latin script. Multiple romanization and transliteration systems have been created in recent years, but none fully represents the actual pronunciation. [10] While the Wylie transliteration system is widely used to romanize Standard Tibetan, others include the Library of Congress system and the IPA-based transliteration (Jacques 2012).

The table below gives, for each Tibetan letter, its form in several romanization and transliteration systems: Wylie transliteration (W), Tibetan pinyin (TP), Dzongkha phonetic (DP), ALA-LC Romanization (A) [11] and THL Simplified Phonetic Transcription (THL).

* – Only in loanwords

Input method and keyboard layout


Tibetan keyboard layout

The first version of Microsoft Windows to support the Tibetan keyboard layout was Windows Vista. The layout has been available in Linux since September 2007. In Ubuntu 12.04, Tibetan language support can be installed through Dash / Language Support / Install/Remove Languages, and the input method can be turned on from Dash / Keyboard Layout by adding the Tibetan keyboard layout. The layout is similar to the one in Microsoft Windows.

Mac OS X introduced Tibetan Unicode support in version 10.5. Three keyboard layouts are now available: Tibetan-Wylie, Tibetan QWERTY and Tibetan-Otani.


Dzongkha keyboard layout

The Dzongkha keyboard layout scheme is designed as a simple means for inputting Dzongkha text on computers. This keyboard layout was standardized by the Dzongkha Development Commission (DDC) and the Department of Information Technology (DIT) of the Royal Government of Bhutan in 2000.

It was updated in 2009 to accommodate additional characters added to the Unicode & ISO 10646 standards since the initial version. Since the arrangement of keys essentially follows the usual order of the Dzongkha and Tibetan alphabet, the layout can be quickly learned by anyone familiar with this alphabet. Subjoined (combining) consonants are entered using the Shift key.

The Dzongkha (dz) keyboard layout is included in Microsoft Windows, Android, and most distributions of Linux as part of XFree86.


Tibetan was originally one of the scripts in the first version of the Unicode Standard in 1991, in the Unicode block U+1000–U+104F. However, in 1993, in version 1.1, it was removed (the code points it took up would later be used for the Burmese script in version 3.0). The Tibetan script was re-added in July 1996 with the release of version 2.0.

The Unicode block for Tibetan is U+0F00–U+0FFF. It includes letters, digits and various punctuation marks and special symbols used in religious texts:

Tibetan [1] [2] [3]: see the official Unicode Consortium code chart (PDF).
1. As of Unicode version 13.0
2. Grey areas indicate non-assigned code points
3. Unicode code points U+0F77 and U+0F79 are deprecated in Unicode 5.2 and later
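The block range and the notion of non-assigned code points can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the helper names are assumptions, and the count of assigned code points depends on the Unicode version bundled with the Python interpreter, so no figure is given here.

```python
import unicodedata

TIBETAN_BLOCK = range(0x0F00, 0x1000)  # U+0F00 through U+0FFF inclusive

def in_tibetan_block(ch: str) -> bool:
    """True if the character's code point lies in the Tibetan block."""
    return ord(ch) in TIBETAN_BLOCK

def is_assigned(ch: str) -> bool:
    """True unless the code point is unassigned (general category 'Cn')."""
    return unicodedata.category(ch) != "Cn"

assigned = [chr(cp) for cp in TIBETAN_BLOCK if is_assigned(chr(cp))]
print("assigned code points in this Python build:", len(assigned))
print(in_tibetan_block("\u0f40"), in_tibetan_block("\u1000"))
```

U+1000, which belonged to Tibetan in the first version of the standard, falls outside the current block and now belongs to the Burmese script.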



  1. Daniels, P.T. (January 2008). "Writing systems of major and minor languages".
  2. Masica, Colin (1993). The Indo-Aryan languages. p. 143.
  3. https://www.worldcat.org/oclc/63062039
  4. http://himalaya.socanth.cam.ac.uk/collections/journals/kailash/pdf/kailash_09_01_02.pdf
  5. Chamberlain 2008
  6. Daniels, Peter T. and William Bright. The World's Writing Systems. New York: Oxford University Press, 1996.
  7. William Woodville Rockhill, Annual Report of the Board of Regents of the Smithsonian Institution, United States National Museum, p. 671, at Google Books.
  8. Berzin, Alexander. A Survey of Tibetan History - Reading Notes Taken by Alexander Berzin from Tsepon, W. D. Shakabpa, Tibet: A Political History. New Haven, Yale University Press, 1967: http://studybuddhism.com/web/en/archives/e-books/unpublished_manuscripts/survey_tibetan_history/chapter_1.html.
  9. Phuntsok, Thubten. བོད་ཀྱི་ལོ་རྒྱུས་སྤྱི་དོན་པདྨ་ར་གཱའི་ལྡེ་མིག "A General History of Tibet".
  10. See for instance
  11. ALA-LC Romanization of Tibetan script (PDF)
