Sinhala script

Sinhala script (Sinhalese)
සිංහල අක්ෂර මාලාව
Siṁhala Akṣara Mālāva
Script type
Time period
c. 300–present [1]
Direction left-to-right
Languages Sinhala, Pali, Sanskrit
Related scripts
Parent systems
Sister systems
ISO 15924
ISO 15924 Sinh (348), Sinhala
Unicode alias

Sinhala script (Sinhala: සිංහල අක්ෂර මාලාව; Siṁhala Akṣara Mālāva), also known as Sinhalese script, is a writing system used by the Sinhalese people and most Sri Lankans, in Sri Lanka and elsewhere, to write the Sinhala language as well as the liturgical languages Pali and Sanskrit. [2] The Sinhalese Akṣara Mālāva, one of the Brahmic scripts, is a descendant of the ancient Indian Brahmi script. [3]


Sinhala script is an abugida written from left to right. Sinhala letters are ordered into two sets. The core set of letters forms the śuddha siṃhala alphabet (Pure Sinhala, ශුද්ධ සිංහල), which is a subset of the miśra siṃhala alphabet (Mixed Sinhala, මිශ්‍ර සිංහල).


Sinhala script is a Brahmi derivative that was imported from northern India around the 3rd century BCE. [4] It developed in a complex manner, partly independently but also strongly influenced by South Indian scripts at various stages, [5] and manifestly by the early Grantha script. [2] Pottery from the 6th century BCE has been found in Anuradhapura, with lithic inscriptions dating from the 2nd century BCE written in Prakrit. [6]

Medieval Sinhalese, which emerged around 750 AD, is marked by very strong influence from the Grantha script. [1] Subsequently, Medieval (and modern) Sinhalese resemble the South Indian scripts. [5] By the 9th century CE, literature written in Sinhala script had emerged and the script began to be used in other contexts. For instance, the Buddhist literature of the Theravada-Buddhists of Sri Lanka, written in Pali, used Sinhala script.

Modern Sinhalese emerged in the 13th century and is marked by the composition of the grammar book Sidat Sangara. [1] In 1736 the Dutch were the first to print with Sinhala type on the island. The resulting type followed the features of the native Sinhala script as practiced on palm leaves. The Dutch-created type was monolinear and geometric in fashion, and early documents had no separation between words. During the second half of the 19th century, in the colonial period, a new style of Sinhala letterforms emerged in contrast to the monolinear and geometric form: it was high-contrast in appearance, with strokes of varied thickness. This high-contrast type gradually replaced the monolinear type as the preferred style and continues to be used in the present day. It is still preferred for text typesetting in printed newspapers, books and magazines in Sri Lanka. [7]

Today, the alphabet is used by over 16 million people to write Sinhala in very diverse contexts, such as newspapers, TV commercials, government announcements, graffiti, and schoolbooks.

Sinhala is the main language written in this script, but rare instances of Sri Lanka Malay are recorded.[ citation needed ]


The basic form of the letter k is ක "ka". For "ki", a small arch called ispilla is placed over the ක: කි. This replaces the inherent /a/ by /i/. It is also possible to have no vowel following a consonant. In order to produce such a pure consonant, a special marker, the hal kirīma has to be added: ක්. This marker suppresses the inherent vowel.

Sinhala script is an abugida written from left to right. It uses consonants as the basic unit for word construction as each consonant has an inherent vowel (/a/), which can be changed with a different vowel stroke. To represent different sounds it is necessary to add vowel strokes, or diacritics called පිලි Pili, that can be used before, after, above or below the base-consonant. Most of the Sinhala letters are curlicues; straight lines are almost completely absent from the alphabet, and it does not have joining characters. This is because Sinhala used to be written on dried palm leaves, which would split along the veins on writing straight lines. This was undesirable, and therefore, the round shapes were preferred. Upper and lower cases do not exist in Sinhala. [7]
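The composition described above can be tried directly with Unicode code points. The following is a minimal Python sketch, not part of the original description; the constant names are illustrative, and the code points are those of the Sinhala Unicode block (U+0D9A for ක, U+0DD2 for the ispilla, U+0DCA for the hal kirīma).

```python
import unicodedata

KA = "\u0D9A"          # ක, the consonant k carrying the inherent /a/
ISPILLA = "\u0DD2"     # vowel stroke for /i/, drawn over the consonant
HAL_KIRIMA = "\u0DCA"  # marker that suppresses the inherent vowel

# Each written form is the base consonant plus at most one combining mark.
forms = {
    "ka": KA,
    "ki": KA + ISPILLA,
    "k": KA + HAL_KIRIMA,
}

for reading, text in forms.items():
    points = " ".join(f"U+{ord(ch):04X}" for ch in text)
    categories = " ".join(unicodedata.category(ch) for ch in text)
    print(f"{reading:>2}  {text}  {points}  {categories}")
```

The base consonant is a letter (category Lo), while the vowel stroke and the hal kirīma are combining marks that a font positions around it.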

Sinhala letters are ordered into two sets. The core set of letters forms the śuddha siṃhala alphabet (Pure Sinhala, ශුද්ධ සිංහල), which is a subset of the miśra siṃhala alphabet (Mixed Sinhala, මිශ්‍ර සිංහල). This "pure" alphabet contains all the graphemes necessary to write Eḷu (classical Sinhala) as described in the classical grammar Sidatsan̆garā (1300 AD). [8] This is the reason why this set is also called Eḷu hōdiya ("Eḷu alphabet" එළු හෝඩිය). The definition of the two sets is thus a historic one. Out of pure coincidence, the phoneme inventory of present-day colloquial Sinhala is such that yet again the śuddha alphabet suffices as a good representation of the sounds. [8] All native phonemes of the Sinhala spoken today can be represented in śuddha, while in order to render special Sanskrit and Pali sounds, one can fall back on miśra siṃhala. This is most notably necessary for the graphemes for the Middle Indic phonemes that the Sinhala language lost during its history, such as aspirates. [8]

Most phonemes of Sinhala can be represented by a śuddha letter or by a miśra letter, but normally only one of them is considered correct. This one-to-many mapping of phonemes onto graphemes is a frequent source of misspellings. [9]

While a phoneme can be represented by more than one grapheme, each grapheme can be pronounced in only one way, with the exceptions of the inherent vowel sound, which can be either [a] (stressed) or [ə] (unstressed), and "ව" where the consonant is either [v] or [w] depending on the word. This means that the actual pronunciation of a word is almost always clear from its orthographic form. Stress is almost always predictable; only words with [v] or [w] (which are both allophones of "ව"), and a very few other words need to be learnt individually.

Some pronunciation exceptions in Sinhala:


The two shapes of the hal kirīma for p (left) and b (right).

In Sinhala the diacritics are called පිලි pili (vowel strokes). දිග diga means "long" because the vowel is sounded for longer and දෙක deka means "two" because the stroke is doubled when written.

Using the consonant 'k' + 'vowel' as an example:
Name (පිල්ල pilla) | Transliteration | Formation | Compound form | ISO 15919 | IPA
හල් කිරිම | Hal kirīma | ක් | ක් | k | [k]
(none) | Inherent /a/ (without any pili) | ක් + අ | ක | ka | [kʌ]
ඇලපිල්ල | Ælapilla | ක් + ආ | කා | kā | [kɑː]
ඇදය | Ædaya | ක් + ඇ | කැ | kæ | [kæ]
දිග ඇදය | Diga ædaya | ක් + ඈ | කෑ | kǣ | [kæː]
ඉස්පිල්ල | Ispilla | ක් + ඉ | කි | ki | [ki]
දිග ඉස්පිල්ල | Diga ispilla | ක් + ඊ | කී | kī | [kiː]
පාපිල්ල | Pāpilla | ක් + උ | කු | ku | [ku], [kɯ]
දිග පාපිල්ල | Diga pāpilla | ක් + ඌ | කූ | kū | [kuː]
ගැටය සහිත ඇලපිල්ල | Gæṭa sahita ælapilla | ක් + ර් + උ | කෘ | kru | [kru]
ගැටය සහිත ඇලපිලි දෙක | Gæṭa sahita ælapili deka | ක් + ර් + ඌ | කෲ | krū | [kruː]
ගයනුකිත්ත | Gayanukitta | Used in conjunction with kombuva for consonants.
දිග ගයනුකිත්ත | Diga gayanukitta | Not in contemporary use.
කොම්බුව | Kombuva | ක් + එ | කෙ | ke | [ke]
කොම්බුව සහ හල්කිරීම | Kombuva saha halkirīma | ක් + ඒ | කේ | kē | [keː]
කොම්බු දෙක | Kombu deka | ක් + ඓ | කෛ | kai | [kʌj]
කොම්බුව සහ ඇලපිල්ල | Kombuva saha ælapilla | ක් + ඔ | කො | ko | [ko]
කොම්බුව සහ හල්ඇලපිල්ල | Kombuva saha halælapilla | ක් + ඕ | කෝ | kō | [koː]
කොම්බුව සහ ගයනුකිත්ත | Kombuva saha gayanukitta | ක් + ඖ | කෞ | kau | [kʌʋ]
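The compound forms in the table can be generated mechanically by appending each vowel sign to the bare consonant. The sketch below assumes the vowel-sign code points of the Sinhala Unicode block; `VOWEL_SIGNS` and `vowel_series` are illustrative names, not an established API.

```python
import unicodedata

KA = "\u0D9A"  # ක

# Dependent vowel signs in table order: ā, æ, ǣ, i, ī, u, ū, syllabic r (short and
# long), e, ē, ai, o, ō, au.
VOWEL_SIGNS = [
    "\u0DCF", "\u0DD0", "\u0DD1", "\u0DD2", "\u0DD3", "\u0DD4", "\u0DD6",
    "\u0DD8", "\u0DF2", "\u0DD9", "\u0DDA", "\u0DDB", "\u0DDC", "\u0DDD", "\u0DDE",
]


def vowel_series(consonant):
    """Return the bare consonant followed by consonant + each vowel sign."""
    return [consonant] + [consonant + sign for sign in VOWEL_SIGNS]


for syllable in vowel_series(KA):
    # The last character is either the consonant itself or the attached sign.
    print(syllable, unicodedata.name(syllable[-1]))
```

Every compound form is stored as two code points, even where the sign is drawn on both sides of the consonant; the visual placement is left to the font.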

Non-vocalic diacritics

The anusvara (often called binduva, 'zero') is represented by one small circle ං (U+0D82), [10] and the visarga (technically part of the miśra alphabet) by two ඃ (U+0D83). The inherent vowel can be removed by a special virama diacritic, the hal kirīma ( ්), which has two shapes depending on which consonant it attaches to. Both are shown in the image on the right. The first is the more common one, while the second is used for letters ending at the top left corner.


Śuddha set

The śuddha graphemes are the mainstay of Sinhala script and are used on an everyday basis. Every sequence of sounds of present-day Sinhala can be represented by these graphemes. Additionally, the śuddha set comprises graphemes for retroflex ḷ and ṇ, which are no longer phonemic in modern Sinhala. These two letters were needed for the representation of Eḷu, but are now obsolete from a purely phonemic view. However, words which historically contain these two phonemes are still often written with the graphemes representing the retroflex sounds.

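Transliteration | a | ā | æ | ǣ | i | ī | u | ū | e | ē | o | ō
IPA | [a,ə] | [aː,a] | [æ] | [æː] | [i] | [iː] | [u] | [uː] | [e] | [eː] | [o] | [oː]

Transliteration | k | g | n̆g | c | j | ṭ | ḍ | ṇ | n̆ḍ | t | d | n̆d | p | b | m | m̆b | y | r | l | ḷ | v | s | h
IPA | [k] | [g] | [ᵑɡ] | [ʧ~ʨ] | [ʤ~ʥ] | [ʈ] | [ɖ] | [n] | [ⁿɖ] | [t] | [d] | [ⁿd] | [p] | [b] | [m] | [ᵐb] | [j] | [r] | [l] | [ɭ] | [ʋ] | [s] | [ɦ]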


independent | diacritic | independent | diacritic
U+0D85 a [a] | inherent a [a, ə] | U+0D86 ā [aː] | U+0DCF ā [aː]
U+0D87 æ/ä [æ] | U+0DD0 æ [æ] | U+0D88 ǣ [æː] | U+0DD1 ǣ [æː]
U+0D89 i [i] | U+0DD2 i [i] | U+0D8A ī [iː] | U+0DD3 ī [iː]
U+0D8B u [u] | U+0DD4 u [u] | U+0D8C ū [uː] | U+0DD6 ū [uː]
U+0D91 e [e] | U+0DD9 e [e] | U+0D92 ē [eː] | U+0DDA ē [eː]
U+0D94 o [o] | U+0DDC o [o] | U+0D95 ō [oː] | U+0DDD ō [oː]

Vowels come in two shapes: independent and diacritic. The independent shape is used when a vowel does not follow a consonant, e.g. at the beginning of a word. The diacritic shape is used when a vowel follows a consonant. Depending on the vowel, the diacritic can attach at several places (see diacritics section above)
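The choice between the two shapes can be expressed as a small lookup. This is a sketch only: the `VOWELS` table pairs the independent and diacritic code points listed above, and `write_vowel` is a hypothetical helper that treats "no preceding consonant" as the only case needing the independent shape.

```python
# vowel -> (independent letter, dependent sign); "" stands for the inherent /a/,
# which needs no written sign after a consonant.
VOWELS = {
    "a": ("\u0D85", ""),
    "ā": ("\u0D86", "\u0DCF"),
    "æ": ("\u0D87", "\u0DD0"),
    "ǣ": ("\u0D88", "\u0DD1"),
    "i": ("\u0D89", "\u0DD2"),
    "ī": ("\u0D8A", "\u0DD3"),
    "u": ("\u0D8B", "\u0DD4"),
    "ū": ("\u0D8C", "\u0DD6"),
    "e": ("\u0D91", "\u0DD9"),
    "ē": ("\u0D92", "\u0DDA"),
    "o": ("\u0D94", "\u0DDC"),
    "ō": ("\u0D95", "\u0DDD"),
}


def write_vowel(vowel, consonant=None):
    """Return the independent letter, or consonant + diacritic if a consonant precedes."""
    independent, sign = VOWELS[vowel]
    if consonant is None:
        return independent
    return consonant + sign


KA = "\u0D9A"
print(write_vowel("i"))      # independent shape, e.g. at the start of a word
print(write_vowel("i", KA))  # diacritic shape attached to ක
```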

While most diacritics are regular, the diacritic for u takes a different shape according to the consonant it attaches to. The most common one is the one used for the consonant ප (p): පු (pu) and පූ (pū). Some consonants ending at the lower right corner (ක (k), ග (g), ත (t), but not න (n) or හ (h)) use this diacritic: කු (ku) and කූ (kū). Combinations of ර (r) or ළ (ḷ) with u have idiosyncratic shapes, viz. රු (ru), රූ (rū), ළු (ḷu), ළූ (ḷū). [11]

Note that the diacritic used for රු (ru) and රූ (rū) is the one normally used for æ, and therefore there are idiosyncratic forms for ræ and rǣ, viz. රැ and රෑ (the difference may not be visible, depending on how Unicode is rendered in your browser).


Plosives | voiceless (Unicode, translit., IPA) | voiced (Unicode, translit., IPA)
velar | U+0D9A ka [ka] | U+0D9C ga [ɡa]
retroflex | U+0DA7 ṭa [ʈa] | U+0DA9 ḍa [ɖa]
dental | U+0DAD ta [t̪a] | U+0DAF da [d̪a]
labial | U+0DB4 pa [pa] | U+0DB6 ba [ba]

Other letters
fricatives | U+0DC3 sa [sa] | U+0DC4 ha [ha]
affricates | (ච) (U+0DA0) (ca) ([t͡ʃa]) | U+0DA2 ja [d͡ʒa]
nasals | U+0DB8 ma [ma] | U+0DB1 na [na]
liquid | U+0DBD la [la] | U+0DBB ra [ra]
glide | U+0DC0 va [ʋa] | U+0DBA ya [ja]

The śuddha alphabet comprises 8 plosives, 2 fricatives, 2 affricates, 2 nasals, 2 liquids and 2 glides. Additionally, there are the two graphemes for the retroflex sounds /ɭ/ and /ɳ/, which are not phonemic in modern Sinhala, but which still form part of the set. These are shaded in the table.

The voiceless affricate (ච [t͡ʃa]) is not included in the śuddha set by purists since it does not occur in the main text of the Sidatsan̆garā. The Sidatsan̆garā does use it in examples though, so this sound did exist in Eḷu. In any case, it is needed for the representation of modern Sinhala. [8]

The basic shapes of these consonants carry an inherent /a/ unless this is replaced by another vowel or removed by the hal kirīma.

Prenasalized consonants


The prenasalized consonants resemble their plain counterparts. m̆b is made up by the left half of m and the right half of b, while the other three are just like the grapheme for the plosive with a little stroke attached to their left. [12] Vowel diacritics attach in the same way as they would to the corresponding plain plosive.

Miśra set

The miśra alphabet is a superset of śuddha. It adds letters for aspirates, retroflexes and sibilants, which are not phonemic in today's Sinhala, but which are necessary to represent non-native words, like loanwords from Sanskrit, Pali or English. The use of the extra letters is mainly a question of prestige. From a purely phonemic point of view, there is no benefit in using them, and they can be replaced by a (sequence of) śuddha letters as follows: For the miśra aspirates, the replacement is the plain śuddha counterpart, for the miśra retroflex liquids the corresponding śuddha coronal liquid, [13] for the sibilants, s. [14] ඤ (ñ) and ඥ (gn) cannot be represented by śuddha graphemes but are found only in fewer than 10 words each. ෆ fa can be represented by ප pa with a Latin f inscribed in the cup.

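Transliteration | r̥ | r̥̄ | ai | au | l̥ | l̥̄
IPA | [ri,ru] | [riː,ruː] | [ɑj] | [ɑw] | [li] | [liː]

Transliteration | kh | gh | ṅ | ch | jh | ñ | ṭh | ḍh | th | dh | n | ph | bh | ś | ṣ | f
IPA | [k] | [g] | [ŋ] | [ʧ~ʨ] | [ʤ~ʥ] | [ɲ] | [ʈ] | [ɖ] | [t] | [d] | [n] | [p] | [b] | [ʃ] | [ʃ] | [f]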




Vocalic diacritics
(row) | independent | diacritic | independent | diacritic
diphthongs | U+0D93 ai [ai] | U+0DDB ai [ai] | U+0D96 au [au] | U+0DDE au [au]
syllabic r | U+0D8D [ur] | U+0DD8 [ru, ur] | U+0D8E [ruː] | U+0DF2 [ruː, uːr]
syllabic l | U+0D8F [li] | U+0DDF [li] | U+0D90 [liː] | U+0DF3 [liː]

There are six additional vocalic diacritics in the miśra alphabet. The two diphthongs are quite common, while the "syllabic" ṛ is much rarer, and the "syllabic" ḷ is all but obsolete. The latter are almost exclusively found in loanwords from Sanskrit. [15]

The miśra ṛ can also be written with śuddha r+u or u+r, which corresponds to the actual pronunciation. The miśra syllabic ḷ is obsolete, but can be rendered by śuddha l+i. [16] Miśra au is rendered as śuddha awu, miśra ai as śuddha ayi.

Note that the transliteration of both ළ් and ෟ is ḷ. This is not very problematic, as the second one is extremely scarce.




Extra miśra plosives
Other additional miśra graphemes
aspirate affricates | U+0DA1 cha [t͡ʃa] | U+0DA3 jha [d͡ʒa]
other | U+0D9E ṅa [ŋa] | U+0DC6 fa [fa, ɸa, pa]
other | U+0DA6 n̆ja [17] [nd͡ʒa] | fප (n/a) fa [fa, ɸa, pa]

Consonant conjuncts


Certain combinations of graphemes trigger special ligatures. Special signs exist for an ර (r) following a consonant (inverted arch underneath), a ර (r) preceding a consonant (loop above) and a ය (y) following a consonant (half a ය on the right). [13] [18] [19] Furthermore, very frequent combinations, such as ddh and kv, are often written in one stroke. If this is the case, the first consonant is not marked with a hal kirīma. [13] [15] [19] The image on the right shows the glyph for śrī, which is composed of the letter ś with a ligature indicating the r below and the vowel ī marked above.

Most other conjunct consonants are made with an explicit virama, called al-lakuna or hal kirīma, and the zero-width joiner. Some of the more common ones are shown in the following table; some may not display correctly, depending on the fonts and rendering support of your system. Although modern Sinhala sounds are not aspirated, aspiration is marked in the transcription where it was historically present, to highlight the differences in modern spelling.

All of the combinations are encoded with the al-lakuna (U+0DCA) first, followed by the zero-width joiner (U+200D), except for touching letters, which have the zero-width joiner (U+200D) first, followed by the al-lakuna (U+0DCA). Touching letters were used in ancient scriptures but are not used in modern Sinhala.

Vowels may be attached to any of the ligatures formed. They attach to the rightmost part of the glyph, except for vowels that use the kombuva: there the kombuva is written before the ligature or cluster, and the remainder of the vowel, if any, is attached to the rightmost part. In the table below, appending "o" (kombuva saha ælapilla, "kombuva with ælapilla") to the cluster "ky" /kja/ adds only a single code point, but adds two vowel strokes, one each to the left and right of the consonant cluster.

Sound | Without joiner | Code points | With joiner | Code points | Type
/kja/ | ක්ය | U+0D9A U+0DCA U+0DBA | ක්‍ය | U+0D9A U+0DCA U+200D U+0DBA | yansaya
/kjo/ | ක්යො | U+0D9A U+0DCA U+0DBA U+0DDC | ක්‍යො | U+0D9A U+0DCA U+200D U+0DBA U+0DDC | yansaya
/ɡja/ | ග්ය | U+0D9C U+0DCA U+0DBA | ග්‍ය | U+0D9C U+0DCA U+200D U+0DBA | yansaya
/kra/ | ක්ර | U+0D9A U+0DCA U+0DBB | ක්‍ර | U+0D9A U+0DCA U+200D U+0DBB | rakāransaya
/ɡra/ | ග්ර | U+0D9C U+0DCA U+0DBB | ග්‍ර | U+0D9C U+0DCA U+200D U+0DBB | rakāransaya
/rka/ | ර්ක | U+0DBB U+0DCA U+0D9A | ර්‍ක | U+0DBB U+0DCA U+200D U+0D9A | rēpaya
/rɡa/ | ර්ග | U+0DBB U+0DCA U+0D9C | ර්‍ග | U+0DBB U+0DCA U+200D U+0D9C | rēpaya
/kjra/ | ක්ය්ර | U+0D9A U+0DCA U+0DBA U+0DCA U+0DBB | ක්‍ය්‍ර | U+0D9A U+0DCA U+200D U+0DBA U+0DCA U+200D U+0DBB | yansaya + rakāransaya
/ɡjra/ | ග්ය්ර | U+0D9C U+0DCA U+0DBA U+0DCA U+0DBB | ග්‍ය්‍ර | U+0D9C U+0DCA U+200D U+0DBA U+0DCA U+200D U+0DBB | yansaya + rakāransaya
/rkja/ | ර්ක්ය | U+0DBB U+0DCA U+0D9A U+0DCA U+0DBA | ර්‍ක්‍ය | U+0DBB U+0DCA U+200D U+0D9A U+0DCA U+200D U+0DBA | rēpaya + yansaya
/rɡja/ | ර්ග්ය | U+0DBB U+0DCA U+0D9C U+0DCA U+0DBA | ර්‍ග්‍ය | U+0DBB U+0DCA U+200D U+0D9C U+0DCA U+200D U+0DBA | rēpaya + yansaya
/kva/ | ක්ව | U+0D9A U+0DCA U+0DC0 | ක්‍ව | U+0D9A U+0DCA U+200D U+0DC0 | conjunct
/kʃa/ | ක්ෂ | U+0D9A U+0DCA U+0DC2 | ක්‍ෂ | U+0D9A U+0DCA U+200D U+0DC2 | conjunct
/ɡdʰa/ | ග්ධ | U+0D9C U+0DCA U+0DB0 | ග්‍ධ | U+0D9C U+0DCA U+200D U+0DB0 | conjunct
/ʈʈʰa/ | ට්ඨ | U+0DA7 U+0DCA U+0DA8 | ට්‍ඨ | U+0DA7 U+0DCA U+200D U+0DA8 | conjunct
/t̪t̪ʰa/ | ත්ථ | U+0DAD U+0DCA U+0DAE | ත්‍ථ | U+0DAD U+0DCA U+200D U+0DAE | conjunct
/t̪va/ | ත්ව | U+0DAD U+0DCA U+0DC0 | ත්‍ව | U+0DAD U+0DCA U+200D U+0DC0 | conjunct
/d̪d̪ʰa/ | ද්ධ | U+0DAF U+0DCA U+0DB0 | ද්‍ධ | U+0DAF U+0DCA U+200D U+0DB0 | conjunct
/d̪va/ | ද්ව | U+0DAF U+0DCA U+0DC0 | ද්‍ව | U+0DAF U+0DCA U+200D U+0DC0 | conjunct
/nd̪a/ | න්ද | U+0DB1 U+0DCA U+0DAF | න්‍ද | U+0DB1 U+0DCA U+200D U+0DAF | conjunct
/nd̪ʰa/ | න්ධ | U+0DB1 U+0DCA U+0DB0 | න්‍ධ | U+0DB1 U+0DCA U+200D U+0DB0 | conjunct
/mma/ | ම්ම | U+0DB8 U+0DCA U+0DB8 | ම‍්ම | U+0DB8 U+200D U+0DCA U+0DB8 | touching
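The two encoding orders can be reproduced with a few lines of Python. This is a sketch under the ordering rule stated above; `conjunct`, `touching` and `code_points` are illustrative helper names, and whether a ligature is actually drawn depends on the font and rendering engine.

```python
AL_LAKUNA = "\u0DCA"  # hal kirīma / al-lakuna
ZWJ = "\u200D"        # zero-width joiner


def conjunct(*consonants):
    """Join consonants as a ligature request: al-lakuna first, then the joiner."""
    return (AL_LAKUNA + ZWJ).join(consonants)


def touching(first, second):
    """Join two consonants as touching letters: joiner first, then al-lakuna."""
    return first + ZWJ + AL_LAKUNA + second


def code_points(text):
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


KA, YA, RA, MA = "\u0D9A", "\u0DBA", "\u0DBB", "\u0DB8"
O_SIGN = "\u0DDC"  # kombuva saha ælapilla, a single code point

print(code_points(conjunct(KA, YA)))           # /kja/
print(code_points(conjunct(RA, KA, YA)))       # /rkja/
print(code_points(conjunct(KA, YA) + O_SIGN))  # /kjo/
print(code_points(touching(MA, MA)))           # /mma/
```

Appending the vowel sign for o adds exactly one code point to the cluster, even though it is drawn as two strokes on either side.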

Letter names

The Sinhala śuddha graphemes are named in a uniform way adding -yanna to the sound produced by the letter, including vocalic diacritics. [10] [20] The name for the letter අ is thus ayanna, for the letter ආ āyanna, for the letter ක kayanna, for the letter කා kāyanna, for the letter කෙ keyanna and so forth. For letters with hal kirīma, an epenthetic a is added for easier pronunciation: the name for the letter ක් is akyanna. Another naming convention is to use al- before a letter with suppressed vowel, thus alkayanna.
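The naming rule is regular enough to write down as string concatenation. The sketch below works on romanized sounds only and covers just the patterns given above; the function names and the `style` parameter are made up for illustration.

```python
def letter_name(sound):
    """Name of a letter from its romanized sound, e.g. 'ka' -> 'kayanna'."""
    return sound + "yanna"


def hal_letter_name(consonant, style="epenthetic"):
    """Name of a letter with hal kirīma, given the bare consonant, e.g. 'k'.

    style="epenthetic" prefixes an a (akyanna);
    style="al" prefixes al- to the ordinary letter name (alkayanna).
    """
    if style == "epenthetic":
        return "a" + consonant + "yanna"
    if style == "al":
        return "al" + letter_name(consonant + "a")
    raise ValueError(f"unknown style: {style}")


for sound in ("a", "ka", "ke"):
    print(sound, "->", letter_name(sound))
print("k ->", hal_letter_name("k"), "or", hal_letter_name("k", style="al"))
```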

Since the extra miśra letters are phonetically not distinguishable from the śuddha letters, proceeding in the same way would lead to confusion. Names of miśra letters are normally made up of the names of two śuddha letters pronounced as one word. The first one indicates the sound, the second one the shape. For example, the aspirated ඛ (kh) is called bayanu kayanna. kayanna indicates the sound, while bayanu indicates the shape: ඛ (kh) is similar in shape to බ (b) (bayunu = like bayanna). Another method is to qualify the miśra aspirates by mahāprāna (ඛ: mahāprāna kayanna) and the miśra retroflexes by mūrdhaja (ළ: mūrdhaja layanna).


Sinhala had special symbols to represent numerals, which were in use until the beginning of the 19th century. This system has now been superseded by the Hindu–Arabic numeral system. [21] [22]

Sinhala Illakkam (Sinhala Archaic Numbers)

Sinhala Illakkam were used for writing numbers prior to the fall of the Kandyan Kingdom in 1815. These digits did not have a zero; instead, the numbers had signs for 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 and 1000. These digits and numbers can be seen primarily in royal documents and artefacts.

Sinhala Lith Illakkam (Sinhala Astrological Numbers)

Prior to the fall of the Kandyan Kingdom, all calculations were carried out using Lith digits. After the fall of the Kandyan Kingdom, Sinhala Lith Illakkam were primarily used for writing horoscopes. However, there is evidence that they were also used for other purposes, such as writing page numbers. The tradition of writing degrees and minutes of zodiac signs in horoscopes continued into the 20th century using different versions of Lith digits. Unlike the Sinhala Illakkam, Sinhala Lith Illakkam included a 0.
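Because the Lith digits include a zero, they can be handled as ordinary decimal digits. The sketch below assumes the Unicode encoding of the Lith digits at U+0DE6 to U+0DEF (a detail taken from the Unicode Sinhala block rather than from the text above) and a Python version whose Unicode database includes them; `to_lith` is an illustrative name.

```python
LITH_ZERO = 0x0DE6  # SINHALA LITH DIGIT ZERO; digits 0-9 occupy U+0DE6..U+0DEF


def to_lith(number):
    """Write a non-negative integer positionally with Sinhala Lith digits."""
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    return "".join(chr(LITH_ZERO + int(digit)) for digit in str(number))


year = to_lith(1815)
print(year)
# Python's int() accepts any Unicode decimal digit, so the text parses back.
print(int(year))
```

The archaic Illakkam, which had no zero and used separate signs for tens, hundreds and thousands, cannot be converted this way.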


Neither the Sinhala numerals nor U+0DF4 ෴ Sinhala punctuation kunddaliya is in general use today. The kunddaliya was formerly used as a full stop. [23]


Sinhala transliteration (Sinhala: රෝම අකුරින් ලිවීම rōma akurin livīma, literally "Roman letter writing") can be done in analogy to Devanāgarī transliteration.

Layman's transliterations in Sri Lanka normally follow neither of these. Vowels are transliterated according to English spelling equivalences, which can yield a variety of spellings for a number of phonemes. /iː/ for instance can be ee, e, ea, i, etc. A transliteration pattern peculiar to Sinhala, and facilitated by the absence of phonemic aspirates, is the use of th for the voiceless dental plosive, and the use of t for the voiceless retroflex plosive. This is presumably because the retroflex plosive /ʈ/ is perceived the same as the English alveolar plosive /t/, and the Sinhala dental plosive /t̪/ is equated with the English voiceless dental fricative /θ/. [24] Dental and retroflex voiced plosives are always rendered as d, though, presumably because dh is not found as a representation of /ð/ in English orthography.
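The consonant pattern described above amounts to a many-to-one mapping, which a short sketch makes visible. The mapping table and the function name are illustrative only; the input is assumed to be a scholarly romanization without aspirate digraphs such as th or dh, and vowels are left untouched because lay spellings of vowels vary too much to tabulate.

```python
import unicodedata

# Scholarly consonant -> usual lay spelling.
LAY_CONSONANTS = {
    "t": "th",      # dental plosive, equated with English th
    "\u1E6D": "t",  # ṭ, retroflex plosive, equated with English t
    "d": "d",       # dental and retroflex voiced plosives both become d
    "\u1E0D": "d",  # ḍ
}

# Spellings mentioned above for /iː/; the choice among them is not predictable.
LAY_LONG_I = ["ee", "e", "ea", "i"]


def lay_spelling(word):
    """Replace scholarly plosive letters by their usual lay spellings."""
    word = unicodedata.normalize("NFC", word)  # make sure ṭ and ḍ are single characters
    return "".join(LAY_CONSONANTS.get(ch, ch) for ch in word)


print(lay_spelling("ta"), lay_spelling("\u1E6Da"))
print(lay_spelling("da"), lay_spelling("\u1E0Da"))
```

Since both d and ḍ come out as d, the lay spelling cannot be converted back to the original consonants without a dictionary.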

Use for the Pali language

Many of the oldest manuscripts in the Pali language are written in the Sinhala script. Miśra consonants are used to represent Pali phonemes that have no Sinhala counterpart. The following table lays out the Sinhala representations of Pali consonants with their standard academic Romanizations:

velar (ka) (kha) (ga) (gha) (ṅa)
palatal (ca) (cha) (ja) (jha) (ña)
retroflex (ṭa) (ṭha) (ḍa) (ḍha) (ṇa)
dental (ta) (tha) (da) (dha) (na)
labial (pa) (pha) (ba) (bha) (ma)
unordered (ya) (ra) (la) (va) (sa) (ha) (ḷa)

The vowels are a subset of those for writing Sinhala:

(on ක ka)

The niggahīta is represented with the sign ං. Consonant sequences may be combined in ligatures in a manner identical to that described above for Sinhala.

As an example, below is the first verse from the Dhammapada in Pali in Sinhala script, followed by Romanization:

මනොපුබ්‌බඞ්‌ගමා ධම්‌මා, මනොසෙට්‌ඨා මනොමයා;
මනසා චෙ පදුට්‌ඨෙන, භාසති වා කරොති වා;
තතො නං දුක්‌ඛමන්‌වෙති, චක්‌කංව වහතො පදං.

Manopubbaṅgamā dhammā, manoseṭṭhā manomayā;
manasā ce paduṭṭhena bhāsati vā karoti vā;
tato naṁ dukkhamanveti cakkaṁva vahato padaṁ.

Yamaka-vaggo 1
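The correspondence between the two versions of the verse can be sketched as a small romanizer. This is a simplified illustration, not a complete transliteration scheme: the tables cover only the Pali consonants and vowels, the code points are those of the Sinhala Unicode block, zero-width joiners and non-joiners are ignored, and the output is lower-case throughout. All names are illustrative.

```python
import unicodedata

CONSONANTS = {
    "\u0D9A": "k", "\u0D9B": "kh", "\u0D9C": "g", "\u0D9D": "gh", "\u0D9E": "ṅ",
    "\u0DA0": "c", "\u0DA1": "ch", "\u0DA2": "j", "\u0DA3": "jh", "\u0DA4": "ñ",
    "\u0DA7": "ṭ", "\u0DA8": "ṭh", "\u0DA9": "ḍ", "\u0DAA": "ḍh", "\u0DAB": "ṇ",
    "\u0DAD": "t", "\u0DAE": "th", "\u0DAF": "d", "\u0DB0": "dh", "\u0DB1": "n",
    "\u0DB4": "p", "\u0DB5": "ph", "\u0DB6": "b", "\u0DB7": "bh", "\u0DB8": "m",
    "\u0DBA": "y", "\u0DBB": "r", "\u0DBD": "l", "\u0DC0": "v", "\u0DC3": "s",
    "\u0DC4": "h", "\u0DC5": "ḷ",
}
VOWEL_SIGNS = {"\u0DCF": "ā", "\u0DD2": "i", "\u0DD3": "ī", "\u0DD4": "u",
               "\u0DD6": "ū", "\u0DD9": "e", "\u0DDC": "o"}
INDEPENDENT = {"\u0D85": "a", "\u0D86": "ā", "\u0D89": "i", "\u0D8A": "ī",
               "\u0D8B": "u", "\u0D8C": "ū", "\u0D91": "e", "\u0D94": "o"}
AL_LAKUNA = "\u0DCA"
NIGGAHITA = "\u0D82"
JOINERS = {"\u200C", "\u200D"}


def romanize(text):
    """Romanize Pali written in Sinhala script (simplified sketch)."""
    out = []
    pending_a = False  # True while the last consonant still carries its inherent a
    for ch in unicodedata.normalize("NFC", text):
        if ch in JOINERS:
            continue
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])  # replaces the inherent vowel
            pending_a = False
        elif ch == AL_LAKUNA:
            pending_a = False            # suppresses the inherent vowel
        else:
            if pending_a:
                out.append("a")
                pending_a = False
            if ch in INDEPENDENT:
                out.append(INDEPENDENT[ch])
            elif ch == NIGGAHITA:
                out.append("ṁ")
            else:
                out.append(ch)           # spaces, punctuation, anything unknown
    if pending_a:
        out.append("a")
    return unicodedata.normalize("NFC", "".join(out))


print(romanize("\u0DB0\u0DB8\u0DCA\u200C\u0DB8\u0DCF"))  # the word dhammā from the verse
```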

Relation to other scripts


Sinhala is one of the Brahmic scripts, and thus shares many similarities with other members of the family, such as the Kannada, Malayalam, Telugu and Tamil scripts and Devanāgarī. As a general example, /a/ is the inherent vowel in all these scripts. [2] Other similarities include the diacritic for ai, which resembles a doubled e in all scripts, and the diacritic for au, which is composed of a preceding e and a following ḷ.


Likewise, the combination of the diacritics for e and ā yields o in all these scripts.
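For Sinhala, this relationship is also recorded in the Unicode character data: the vowel sign for o has a canonical decomposition into the signs for e and ā. The following sketch checks this with Python's standard `unicodedata` module; it illustrates the Sinhala case only and says nothing about how the other scripts are encoded.

```python
import unicodedata

E_SIGN = "\u0DD9"   # kombuva (e)
AA_SIGN = "\u0DCF"  # ælapilla (ā)
O_SIGN = "\u0DDC"   # kombuva saha ælapilla (o)

decomposed = unicodedata.normalize("NFD", O_SIGN)
print([f"U+{ord(ch):04X}" for ch in decomposed])

# Normalizing the two-sign sequence to NFC gives back the single o sign, so
# ක + e + ā and ක + o are treated as the same text after normalization.
KA = "\u0D9A"
print(unicodedata.normalize("NFC", KA + E_SIGN + AA_SIGN) == KA + O_SIGN)
```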


Sinhala alphabet differs from other Indo-Aryan alphabets in that it contains a pair of vowel sounds (U+0DD0 and U+0DD1 in the proposed Unicode Standard) that are unique to it. These are the two vowel sounds that are similar to the two vowel sounds that occur at the beginning of the English words at (ඇ) and ant (ඈ). [25]

Another feature that distinguishes Sinhala from its sister Indo-Aryan languages is the presence of a set of five nasal sounds known as half-nasal or prenasalized stops.


Computer encoding

Sinhala language support in Linux. Firefox is shown in the background, with mlterm in the foreground with text having been entered into it by ibus-m17n.

Generally speaking, Sinhala support is less developed than support for Devanāgarī, for instance. A recurring problem is the rendering of diacritics which precede the consonant and diacritic signs which come in different shapes, like the one for u.

Sinhala support did not come built in with Microsoft Windows XP, unlike Tamil and Hindi, but was available through third-party means such as Keyman by SIL International. All versions of Windows from Vista onwards, including Windows 10, come with Sinhala support by default and do not require external fonts to be installed to read Sinhala script. Nirmala UI is the default Sinhala font in Windows 10. The latest versions of Windows 10 have added support for Sinhala Archaic Numbers, which were not supported by default in previous versions.

For macOS, Apple Inc. has provided Sinhala font support in macOS Catalina and later through Unicode integration. Keyboard support is available by third-party means such as Helakuru and Keyman. In Mac OS X, Sinhala font and keyboard support were provided by Nickshanks and Xenotypetech.

For Linux, the IBus and SCIM input methods allow the use of Sinhala script in applications, with support for a number of key maps and techniques such as traditional, phonetic and assisted techniques. [26] In addition, newer versions of the Android mobile operating system support both rendering and input of Sinhala script by default, and applications like Helakuru serve as dedicated keyboard integrators.


Sinhala script was added to the Unicode Standard in September 1999 with the release of version 3.0. This character allocation has been adopted in Sri Lanka as the Standard SLS1134.

The main Unicode block for Sinhala is U+0D80–U+0DFF. Another block, Sinhala Archaic Numbers, was added to Unicode in version 7.0.0 in June 2014. Its range is U+111E0–U+111FF.
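The two ranges can be used to classify characters programmatically. The sketch below hard-codes the block ranges given above; `sinhala_block` is an illustrative name, and the loop relies on Python's `unicodedata` database being recent enough (Unicode 7.0 or later) to know the archaic numbers.

```python
import unicodedata

SINHALA = range(0x0D80, 0x0E00)            # U+0D80..U+0DFF
ARCHAIC_NUMBERS = range(0x111E0, 0x11200)  # U+111E0..U+111FF


def sinhala_block(ch):
    """Return the name of the Sinhala block containing ch, or None."""
    cp = ord(ch)
    if cp in SINHALA:
        return "Sinhala"
    if cp in ARCHAIC_NUMBERS:
        return "Sinhala Archaic Numbers"
    return None


print(sinhala_block("\u0D9A"), sinhala_block("\U000111E1"), sinhala_block("a"))

# List the assigned archaic numbers with their numeric values; unassigned
# code points in the block have no name and are skipped.
for cp in ARCHAIC_NUMBERS:
    ch = chr(cp)
    name = unicodedata.name(ch, None)
    if name is not None:
        print(f"U+{cp:05X}  {unicodedata.numeric(ch, None)}  {name}")
```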


See also

Related Research Articles

Arabic alphabet Alphabet for Arabic and other languages

The Arabic alphabet, or Arabic abjad, is the Arabic script as it is codified for writing Arabic. It is written from right to left in a cursive style and includes 28 letters. Most letters have contextual letterforms.

A diacritic is a glyph added to a letter or to a basic glyph. The term derives from the Ancient Greek διακριτικός, from διακρίνω. The word diacritic is a noun, though it is sometimes used in an attributive sense, whereas diacritical is only an adjective. Some diacritics, such as the acute ( ´ ) and grave ( ` ), are often called accents. Diacritics may appear above or below a letter or in some other position such as within the letter or between two letters.

The Hebrew alphabet, known variously by scholars as the Ktav Ashuri, Jewish script, square script and block script, is an abjad script used in the writing of the Hebrew language and other Jewish languages, most notably Yiddish, Ladino, Judeo-Arabic, and Judeo-Persian. It is an offshoot of the Imperial Aramaic alphabet, which flourished during the Achaemenid Empire and which itself derives from the Phoenician alphabet.

International Phonetic Alphabet Alphabetic system of phonetic notation

The International Phonetic Alphabet (IPA) is an alphabetic system of phonetic notation based primarily on the Latin script. It was devised by the International Phonetic Association in the late 19th century as a standardized representation of speech sounds in written form. The IPA is used by lexicographers, foreign language students and teachers, linguists, speech–language pathologists, singers, actors, constructed language creators and translators.

Kannada script

The Kannada script is an abugida of the Brahmic family, used primarily to write the Kannada language, one of the Dravidian languages of South India especially in the state of Karnataka. Kannada script is widely used for writing Sanskrit texts in Karnataka. Several minor languages, such as Tulu, Konkani, Kodava, Havyaka, Sanketi and Beary, also use alphabets based on the Kannada script. The Kannada and Telugu scripts share high mutual intellegibility with each other, and are often considered to be regional variants of single script. Other scripts similar to Kannada script are Sinhala script, and Old Peguan script (used in Burma).

The Thai script is the abugida used to write Thai, Southern Thai and many other languages spoken in Thailand. The Thai alphabet itself has 44 consonant symbols, 16 vowel symbols that combine into at least 32 vowel forms and four tone diacritics to create characters mostly representing syllables.

Malayalam script

Malayalam script is a Brahmic script used commonly to write the Malayalam language, which is the principal language of Kerala, India, spoken by 45 million people in the world. Malayalam script is also widely used for writing Sanskrit texts in Kerala. Malayalam script bears high similarity with Tigalari script, which was used for writing the Tulu language, spoken in coastal Karnataka and the northernmost Kasargod district of Kerala. Like many other Indic scripts, it is an alphasyllabary (abugida), a writing system that is partially “alphabetic” and partially syllable-based. The modern Malayalam alphabet has 15 vowel letters, 42 consonant letters, and a few other symbols. The Malayalam script is a Vatteluttu alphabet extended with symbols from the Grantha alphabet to represent Indo-Aryan loanwords. The script is also used to write several minority languages such as Paniya, Betta Kurumba, and Ravula. The Malayalam language itself was historically written in several different scripts. Malayalam script is the only script in India that can be used to write any other language of India as it contain letters to denote both of the Voiced retroflex approximant (/ɻ/) and the letters unique to Sanskrit.

Anusvara is a symbol used in many Indic scripts to mark a type of nasal sound, typically transliterated ⟨ṃ⟩. Depending on its location in the word and the language for which it is used, its exact pronunciation can vary. In the context of ancient Sanskrit, anusvara is the name of the particular nasal sound itself, regardless of written representation.

Finnish orthography is based on the Latin script, and uses an alphabet derived from the Swedish alphabet, officially comprising 29 letters but also has two additional letters found in some loanwords. The Finnish orthography strives to represent all morphemes phonologically and, roughly speaking, the sound value of each letter tends to correspond with its value in the International Phonetic Alphabet (IPA) – although some discrepancies do exist.

The Burmese alphabet is an abugida used for writing Burmese. It is ultimately a Brahmic script adapted from either the Kadamba or Pallava alphabet of South India and more immediately an adaptation of Pyu script or Old Mon script. The Burmese alphabet is also used for the liturgical languages of Pali and Sanskrit.

Devanāgarī is an Indian script used for languages including Hindi, Marathi, Nepali and Sanskrit. There are several somewhat similar methods of transliteration from Devanāgarī to the Roman script, including the influential and lossless IAST notation.

Sinhala language Indo-Aryan language native to Sri Lanka

Sinhala, also known as Sinhalese, is an Indo-Aryan language primarily spoken by the Sinhalese people of Sri Lanka, who make up the largest ethnic group on the island, numbering about 16 million. Sinhala is also spoken as the first language by other ethnic groups in Sri Lanka, totaling about 4 million people as of 2001. It is written using the Sinhala script, which is one of the Brahmic scripts; a descendant of the ancient Indian Brahmi script closely related to the Kadamba script.

Digraph (orthography)

A digraph or digram is a pair of characters used in the orthography of a language to write either a single phoneme, or a sequence of phonemes that does not correspond to the normal values of the two characters combined.

Tamil script

The Tamil script is an abugida script that is used by Tamils and Tamil speakers in India, Sri Lanka, Malaysia, Singapore, Indonesia and elsewhere to write the Tamil language. Certain minority languages such as Saurashtra, Badaga, Irula and Paniya are also written in the Tamil script.

The Greek alphabet has been used to write the Greek language since the late ninth or early eighth century BC. It is derived from the earlier Phoenician alphabet, and was the first alphabetic script in history to have distinct letters for vowels as well as consonants. In Archaic and early Classical times, the Greek alphabet existed in many local variants, but, by the end of the fourth century BC, the Euclidean alphabet, with twenty-four letters, ordered from alpha to omega, had become standard and it is this version that is still used to write Greek today.

The Assamese alphabet is a writing system of the Assamese language and is a part of the Bengali-Assamese script. This script was also used in Assam and nearby regions for Sanskrit as well as other languages such as Bodo, Khasi, Mising, and Jaintia. It evolved from the Kamarupi script. The current form of the script has seen continuous development from the 5th-century Umachal/Nagajari-Khanikargaon rock inscriptions written in an eastern variety of the Gupta script, adopting significant traits from the Siddhaṃ script in the 7th century. By the 17th century three styles of Assamese alphabets could be identified, which converged to the standard script following the typesetting required for printing. The present standard is identical to the Bengali alphabet except for two letters, ৰ (ro) and ৱ (vo); in addition, the letter ক্ষ (khya) has evolved into an individual consonant with its own phonetic quality, whereas in the Bengali alphabet it is a conjunct of two letters.

Unicode supports several phonetic scripts and notations through the existing writing systems and the addition of extra blocks with phonetic characters. These phonetic extras are derived from an existing script, usually Latin, Greek or Cyrillic. In Unicode there is no "IPA script". Apart from IPA, extensions to the IPA and obsolete and nonstandard IPA symbols, these blocks also contain characters from the Uralic Phonetic Alphabet and the Americanist Phonetic Alphabet.
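
The absence of a separate "IPA script" can be illustrated with a minimal Python sketch. It assumes only the standard-library `unicodedata` module; the helper `script_prefix` is a hypothetical name introduced here, and reading the first word of a character name is a rough heuristic, not the formal Unicode Script property.

```python
import unicodedata


def script_prefix(ch: str) -> str:
    """Return the first word of a character's Unicode name.

    Heuristic only: for the symbols below the first word happens to
    name the script the character was derived from.
    """
    return unicodedata.name(ch).split()[0]


# A few symbols used in IPA transcription (illustrative selection).
ipa_symbols = ["ə", "ʃ", "ɻ", "β", "θ"]

for ch in ipa_symbols:
    # Print the code point, the symbol, and its full Unicode name.
    print(f"U+{ord(ch):04X} {ch} {unicodedata.name(ch)}")
```

Each name begins with LATIN or GREEK, which reflects that these symbols are encoded as extensions of existing scripts.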

Urdu alphabet Perso-Arabic-based alphabet for Urdu of 39 letters

The Urdu alphabet is the right-to-left alphabet used for the Urdu language. It is a modification of the Persian alphabet, which is itself a derivative of the Arabic alphabet. The Urdu alphabet has up to 39 or 40 distinct letters with no distinct letter cases and is typically written in the calligraphic Nastaʿlīq script, whereas Arabic is more commonly written in the Naskh style.

Bengali alphabet Abugida script used in writing Bengali

The Bengali alphabet or Bangla alphabet is the alphabet used to write the Bengali language and is a part of the Bengali-Assamese script, and has historically been used to write Sanskrit within Bengal. It is one of the most widely adopted writing systems in the world.

The Pashto alphabet is based on the Perso-Arabic script, with additional glyphs added to accommodate phonemes used in Pashto.


  1. Diringer, David (1948). The Alphabet: A Key to the History of Mankind. p. 389.
  2. Daniels (1996), p. 408.
  3. Masica, Colin P. (1993). The Indo-Aryan Languages. p. 143.
  4. Daniels (1996), p. 379.
  5. Cardona, George; Jain, Dhanesh (2003). The Indo-Aryan Languages. p. 109.
  6. Ray, Himanshu Prabha (2003-08-14). The Archaeology of Seafaring in Ancient South Asia. Cambridge University Press. ISBN 9780521011099.
  7. "The Sinhala Script". Dalton Maag. Retrieved 26 August 2018.
  8. Gair and Paolillo 1997.
  9. Matzel (1983), pp. 15, 17, 18.
  10. Karunatillake (2004), p. xxxii.
  11. Jayawardena-Moser (2004), p. 11.
  12. Fairbanks et al. (1968), p. 126.
  13. Karunatillake (2004), p. xxxi.
  14. Daniels (1996), p. 410.
  15. Matzel (1983), p. 8.
  16. Matzel (1983), p. 14.
  17. This letter is not used anywhere, neither in modern nor ancient Sinhala. Its usefulness is unclear, but it forms part of the standard alphabet.
  18. Fairbanks et al. (1968), p. 109.
  19. Jayawardena-Moser (2004), p. 12.
  20. Fairbanks et al. (1968), p. 366.
  21. "Online edition of Sunday Observer – Business". Sunday Observer. Archived from the original on 7 February 2009. Retrieved 21 September 2008.
  22. "Unicode Mail List Archive: Re: Sinhala numerals". Unicode Consortium. Retrieved 21 September 2008.
  23. Roland Russwurm. "Old Sinhala Numbers and Digits". Sinhala Online. Retrieved 23 September 2008.
  24. Matzel (1983), p. 16.
  26. A screenshot showing some of the options

Further reading

Online resources