National Library at Kolkata romanisation


The National Library at Kolkata romanisation [1] is a widely used transliteration scheme in dictionaries and grammars of Indic languages. It is also known as the (American) Library of Congress scheme and is nearly identical to one of the possible ISO 15919 variants. The scheme extends IAST, which is used for the transliteration of Sanskrit.


Scheme table

The table below mostly uses Devanagari but it also includes letters from Bengali (purple) and Tamil (green) to illustrate the transliteration of non-Devanagari characters.

अं अः
a ā i ī u ū l e ē ai ê o ō au aṃ aḥ
ka kha ga gha ṅa ca cha ja jha ña
ṭa ṭha ḍa ḍha ṇa ta tha da dha na
pa pha ba bha ma ẏa za ḷa ṟa ṉa
ya ra la va śa ṣa sa ha
क़ ख़ ग़ ज़ ड़ ढ़ फ़
qa kha gha za d̂a d̂ha fa
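
The way the table is applied to running text can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mapping covers the plain Devanagari consonants, six vowels, anusvara and visarga, and the names `CONSONANTS`, `VOWEL_SIGNS` and `romanise` are hypothetical, not part of any standard tool. The Devanagari letters themselves are supplied here as an assumption, since the table above lists mainly the romanised forms.

```python
# Illustrative subset of the scheme; not a complete implementation.
CONSONANTS = {
    "क": "k", "ख": "kh", "ग": "g", "घ": "gh", "ङ": "ṅ",
    "च": "c", "छ": "ch", "ज": "j", "झ": "jh", "ञ": "ñ",
    "ट": "ṭ", "ठ": "ṭh", "ड": "ḍ", "ढ": "ḍh", "ण": "ṇ",
    "त": "t", "थ": "th", "द": "d", "ध": "dh", "न": "n",
    "प": "p", "फ": "ph", "ब": "b", "भ": "bh", "म": "m",
    "य": "y", "र": "r", "ल": "l", "व": "v",
    "श": "ś", "ष": "ṣ", "स": "s", "ह": "h",
}
INDEPENDENT_VOWELS = {"अ": "a", "आ": "ā", "इ": "i", "ई": "ī", "उ": "u", "ऊ": "ū"}
VOWEL_SIGNS = {"ा": "ā", "ि": "i", "ी": "ī", "ु": "u", "ू": "ū"}
OTHER_SIGNS = {"ं": "ṃ", "ः": "ḥ"}  # anusvara, visarga
VIRAMA = "्"  # suppresses the inherent vowel of the preceding consonant


def romanise(text):
    """Romanise Devanagari text letter by letter (hypothetical helper)."""
    out = []
    pending_a = False  # True while the last consonant still carries inherent 'a'
    for ch in text:
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in VOWEL_SIGNS:
            # A dependent vowel sign replaces the inherent 'a'.
            out.append(VOWEL_SIGNS[ch])
            pending_a = False
        elif ch == VIRAMA:
            pending_a = False
        else:
            if pending_a:
                out.append("a")
                pending_a = False
            # Unknown characters are passed through unchanged.
            out.append(INDEPENDENT_VOWELS.get(ch) or OTHER_SIGNS.get(ch) or ch)
    if pending_a:
        out.append("a")
    return "".join(out)


print(romanise("धर्म"))
print(romanise("नदी"))
```

Because the sketch transliterates letter by letter, it always writes the inherent vowel, including in positions where a modern language such as Hindi would not pronounce it.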

Computer input by selection from a screen

[Figure: applet for character selection (Gucharmap screenshot)]

Many systems provide a way to select Unicode characters visually. ISO/IEC 14755 refers to this as a screen-selection entry method.

Microsoft Windows has provided a Unicode version of the Character Map program since version NT 4.0, and it has appeared in the consumer edition since XP. To open it, press ⊞ Win+R, type charmap, then press ↵ Enter. It is limited to characters in the Basic Multilingual Plane (BMP). Characters are searchable by Unicode character name, and the table can be limited to a particular code block. More advanced third-party tools of the same type are also available; a notable freeware example is BabelMap.
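
The two ideas in this paragraph, searching by Unicode character name and the BMP limit, can be illustrated with Python's standard `unicodedata` module. This is a minimal sketch, not how Character Map itself works; the helper names `describe` and `find_by_name` are hypothetical.

```python
import unicodedata


def describe(ch):
    """Return code point, Unicode name and BMP membership for one character."""
    cp = ord(ch)
    return {
        "char": ch,
        "code_point": f"U+{cp:04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        # The Basic Multilingual Plane covers U+0000 to U+FFFF.
        "in_bmp": cp <= 0xFFFF,
    }


def find_by_name(name):
    """Look a character up by its exact Unicode name; None if there is no match."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None


print(describe("\u0101"))  # a with macron, used in the scheme table
print(find_by_name("LATIN SMALL LETTER T WITH DOT BELOW"))
```

Every Latin letter used by the romanisation lies inside the BMP, so a BMP-only selection tool is sufficient for this scheme.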

macOS provides a "character palette" with much the same functionality, along with searching by related characters, glyph tables in a font, etc. It can be enabled in the input menu in the menu bar under System Preferences → International → Input Menu (or System Preferences → Language and Text → Input Sources) or can be viewed under Edit → Emoji & Symbols in many programs.

Equivalent tools – such as gucharmap (GNOME) or kcharselect (KDE) – exist on most Linux desktop environments.

Font support

Only certain fonts support all Latin Unicode characters for the transliteration of Indic scripts according to the ISO 15919 standard. For example, Tahoma supports almost all the characters needed. Arial and Times New Roman font packages that come with Microsoft Office 2007 and later also support most Latin Extended Additional characters like ḍ, ḥ, ḷ, ḻ, ṁ, ṅ, ṇ, ṛ, ṣ and ṭ. The open-source fonts Libertinus Serif and Libertinus Sans (forked from the Linux Libertine project) also have full support.
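
As a rough illustration of why coverage of the Latin Extended Additional block matters, the sketch below checks that each of the ten letters listed above falls in that block (U+1E00 to U+1EFF). It inspects code points only and does not query any font; the names `REQUIRED`, `in_latin_extended_additional` and `report` are hypothetical.

```python
import unicodedata

# The ten letters named in the paragraph above, written as escapes so the
# precomposed (NFC) forms are unambiguous: ḍ ḥ ḷ ḻ ṁ ṅ ṇ ṛ ṣ ṭ
REQUIRED = "\u1e0d\u1e25\u1e37\u1e3b\u1e41\u1e45\u1e47\u1e5b\u1e63\u1e6d"


def in_latin_extended_additional(ch):
    """True if the character lies in the Latin Extended Additional block."""
    return 0x1E00 <= ord(ch) <= 0x1EFF


def report(chars):
    """List (character, code point, in-block flag) for each character."""
    chars = unicodedata.normalize("NFC", chars)  # compose base + combining marks
    return [(ch, f"U+{ord(ch):04X}", in_latin_extended_additional(ch)) for ch in chars]


for row in report(REQUIRED):
    print(row)
```

Other letters of the scheme, such as ā (U+0101) and ś (U+015B), sit in Latin Extended-A instead, so a font needs glyphs from more than one block to render a full transliteration.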


Related Research Articles

Devanagari: writing script for many North Indian and Nepalese languages

Devanāgarī or Devanagari, also called Nāgarī, is a left-to-right abugida, based on the ancient Brāhmī script, used in the northern Indian subcontinent. It is one of the official scripts of the Republic of India and Nepal. It was developed and in regular use by the 7th century CE and achieved its modern form by 1000 CE. The Devanāgarī script, composed of 48 primary characters, including 14 vowels and 34 consonants, is the fourth most widely adopted writing system in the world, being used for over 120 languages.

Brahmic scripts: family of abugida writing systems

The Brahmic scripts, also known as Indic scripts, are a family of abugida writing systems. They are used throughout the Indian subcontinent, Southeast Asia and parts of East Asia. They are descended from the Brahmi script of ancient India and are used by various languages in several language families in South, East and Southeast Asia: Indo-Aryan, Dravidian, Tibeto-Burman, Mongolic, Austroasiatic, Austronesian, and Tai. They were also the source of the dictionary order (gojūon) of Japanese kana.

OpenType is a format for scalable computer fonts. Derived from TrueType, it retains TrueType's basic structure but adds many intricate data structures for describing typographic behavior. OpenType is a registered trademark of Microsoft Corporation.

Devanagari is an Indic script used for many Indo-Aryan languages of North India and Nepal, including Hindi, Marathi and Nepali, and was also the script used to write Classical Sanskrit. There are several somewhat similar methods of transliteration from Devanagari to the Roman script, including the influential and lossless IAST notation. Romanized Devanagari is also called Romanagari.

Indian Standard Code for Information Interchange (ISCII) is a coding scheme for representing various writing systems of India. It encodes the main Indic scripts and a Roman transliteration. The supported scripts are: Bengali–Assamese, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu. ISCII does not encode the writing systems of India that are based on Persian, but its writing system switching codes nonetheless provide for Kashmiri, Sindhi, Urdu, Persian, Pashto and Arabic. The Persian-based writing systems were subsequently encoded in the PASCII encoding.

The International Alphabet of Sanskrit Transliteration (IAST) is a transliteration scheme that allows the lossless romanisation of Indic scripts as employed by Sanskrit and related Indic languages. It is based on a scheme that emerged during the 19th century from suggestions by Charles Trevelyan, William Jones, Monier Monier-Williams and other scholars, and formalised by the Transliteration Committee of the Geneva Oriental Congress, in September 1894. IAST makes it possible for the reader to read the Indic text unambiguously, exactly as if it were in the original Indic script. It is this faithfulness to the original scripts that accounts for its continuing popularity amongst scholars.

The Harvard-Kyoto Convention is a system for transliterating Sanskrit and other languages that use the Devanāgarī script into ASCII. It is predominantly used informally in e-mail, and for electronic texts.

ISO 15919 is one of a series of international standards for romanization by the International Organization for Standardization. It was published in 2001 and uses diacritics to map the much larger set of consonants and vowels in Brahmic and Nastaliq scripts to the Latin script.

Open-source Unicode typefaces

There are Unicode typefaces which are open-source and designed to contain glyphs of all Unicode characters, or at least a broad selection of Unicode scripts. There are also numerous projects aimed at providing only a certain script, such as the Arabeyes Arabic font. The advantage of targeting only some scripts with a font is that certain Unicode characters should be rendered differently depending on which language they are used in, and that a font that only includes the characters a certain user needs will be much smaller in file size than one with many glyphs. Unicode fonts in modern formats such as OpenType can in theory cover multiple languages by including multiple glyphs per character, though very few actually cover more than one language's forms of the unified Han characters.

The "Indian languages TRANSliteration" (ITRANS) is an ASCII transliteration scheme for Indic scripts, particularly for the Devanagari script.

There are several romanisation schemes for the Malayalam script, including ITRANS and ISO 15919.

A Unicode font is a computer font that maps glyphs to code points defined in the Unicode Standard. The vast majority of modern computer fonts use Unicode mappings, even those fonts which only include glyphs for a single writing system, or even only support the basic Latin alphabet. Fonts which support a wide range of Unicode scripts and Unicode symbols are sometimes referred to as "pan-Unicode fonts", although as the maximum number of glyphs that can be defined in a TrueType font is restricted to 65,535, it is not possible for a single font to provide individual glyphs for all defined Unicode characters. This article lists some widely used Unicode fonts that support a comparatively large number and broad range of Unicode characters.

Unicode supports several phonetic scripts and notations through its existing scripts and the addition of extra blocks with phonetic characters. These phonetic characters are derived from an existing script, usually Latin, Greek or Cyrillic. Apart from the International Phonetic Alphabet (IPA), extensions to the IPA and obsolete and nonstandard IPA symbols, these blocks also contain characters from the Uralic Phonetic Alphabet and the Americanist Phonetic Alphabet.

InScript is the decreed standard keyboard layout for Indian scripts using a standard 104- or 105-key layout. This keyboard layout was standardised by the Government of India for inputting text in languages of India written in Brahmic scripts, as well as the Santali language, written in the non-Brahmic Ol Chiki script. It was developed by the Indian Government and supported by several public and private organisations. This is the standard keyboard for 12 Indian scripts including Devanagari, Bengali, Gujarati, Gurmukhi, Kannada, Malayalam, Odia, Tamil and Telugu, among others. The InScript layout is built into most of the major operating systems, including Windows and most Linux and Mac OS systems. It is also available in some mobile phones and in Apple's iOS 5 and higher. It is available in Android 4.0 and higher but was removed from the latest Google Keyboard application (Gboard) and Google Indic Keyboard. It is also available for Windows Mobile 5.x and 6.x from third parties.

Indic Computing means "computing in Indic", i.e., Indian Scripts and Languages. It involves developing software in Indic Scripts/languages, Input methods, Localization of computer applications, web development, Database Management, Spell checkers, Speech to Text and Text to Speech applications and OCR in Indian languages.

Bengali input methods are the different systems developed to type the characters of the Bengali script, for the Bengali language and others, using a typewriter or a computer keyboard.

Tamil keyboard: keyboard layout

The Tamil keyboard is used in computers and mobile devices to input text in the Tamil script.

The Velthuis system of transliteration is an ASCII transliteration scheme for the Sanskrit language from and to the Devanagari script. It was developed in about 1983 by Frans Velthuis, a scholar living in Groningen, Netherlands, who created a popular, high-quality software package in LaTeX for typesetting Devanāgarī. The primary documentation for the scheme is the system's clearly written software manual. It is based on using the ISO 646 repertoire to represent mnemonically the accents used in standard scholarly transliteration. It does not use diacritics as IAST does. It may optionally use capital letters in a manner similar but not identical to the Harvard-Kyoto or ITRANS schemes (manual, para 4.1).

Meitei input methods

Meitei input methods are the methods that allow computer users to enter text in the Meitei script for the Meitei language.

Bharati script: proposed common script for Indian languages

Bharati Script is a constructed script, and an abugida created by a research team led by V. Srinivasa Chakravarthy at IIT Madras. It is designed to serve as a common script or link script for Indian languages.

References

  1. See pp. 24-26 for a table comparing Indic languages, and pp. 33-34 for the Devanagari alphabet listing. "Annex-F: Roman Script Transliteration" (PDF). Indian Standard: Indian Script Code for Information Interchange — ISCII. Bureau of Indian Standards. 1 April 1999. p. 32. Retrieved 20 November 2006.