InScript (short for Indic Script) is the official standard keyboard layout for Indian scripts on a standard 104- or 105-key keyboard. The layout was standardised by the Government of India for inputting text in the languages of India written in Brahmic scripts, as well as the Santali language, which is written in the non-Brahmic Ol Chiki script. [1] It was developed by the Indian Government and is supported by several public and private organisations. It is the standard keyboard for 12 Indian scripts, including Devanagari, Bengali, Gujarati, Gurmukhi, Kannada, Malayalam, Odia, Tamil and Telugu. The InScript layout is built into most major operating systems, including Windows (2000 and later) and most Linux and Mac OS systems. It is also available on some mobile phones and (in the case of Tamil and Hindi) in Apple's iOS 5 [citation needed] and higher. It is available in Android 4.0 (Ice Cream Sandwich) and higher, but has been removed from the latest Google Keyboard application (Gboard) and Google Indic Keyboard. It is also available for Windows Mobile 5.x and 6.x from third parties.
The InScript keyboard has a common layout for all the Indian scripts. Most Indic scripts have the same phonetic character order, so a person who knows InScript typing in one script can type in any other Indic script from dictation, even without knowledge of that script.
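The shared phonetic order can be illustrated with Unicode, whose Indic blocks follow the ISCII arrangement and therefore place corresponding letters at the same offset within each script's block. The following Python sketch is only an illustration of that parallel ordering, not the InScript key table itself; the helper name `transpose` and the `BLOCK_START` table are hypothetical names introduced here, and the result can contain unassigned code points because not every script has a letter at every offset.

```python
# Start of each script's Unicode block (values from the Unicode code charts).
BLOCK_START = {
    "Devanagari": 0x0900,
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Odia": 0x0B00,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Kannada": 0x0C80,
    "Malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80  # each of these blocks spans 128 code points


def transpose(text, source, target):
    """Move each `source`-script character to the same offset in `target`.

    Hypothetical helper: characters outside the source block are copied
    unchanged, and no check is made that the target code point is assigned.
    """
    src, dst = BLOCK_START[source], BLOCK_START[target]
    out = []
    for ch in text:
        offset = ord(ch) - src
        out.append(chr(dst + offset) if 0 <= offset < BLOCK_SIZE else ch)
    return "".join(out)


ka = "\u0915"  # DEVANAGARI LETTER KA
for script in ("Bengali", "Gujarati", "Tamil", "Telugu"):
    print(script, transpose(ka, "Devanagari", script))
```

Because the letter KA sits at offset 0x15 in every block listed, one key position can stand for the same sound in each script, which is the property the common layout relies on.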
The first InScript keyboard was standardised in 1986 under the auspices of the DOE (Department of Electronics at the Ministry of Communications & Information Technology). [2] It was revised in 1988 by a DOE committee, with modifications to accommodate nuqta-extended keys and to add certain matras. The last revision to the BIS document was made in 1992, and the document has not been revised since. This was partly because very few new characters were added to the ISCII code-set, and any that were added were handled by generating the character with the nuqta; the BIS document specifically mentions such characters. The InScript keyboards were therefore felt to be self-sufficient. With the advent of Unicode, a few new characters were added to each code-page, characters for which the BIS document had made no provision. In addition, Unicode introduced the concepts of the zero-width joiner (ZWJ) and zero-width non-joiner (ZWNJ), as well as that of normalisation.
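The effect of normalisation on nuqta characters can be sketched in Python with the standard `unicodedata` module. The example assumes NFC as the canonical form purely for illustration; the source does not say which form, if any, was adopted for InScript, and the helper name `canonical` is hypothetical.

```python
import unicodedata

precomposed = "\u0958"     # DEVANAGARI LETTER QA, a single code point
sequence = "\u0915\u093C"  # DEVANAGARI LETTER KA + DEVANAGARI SIGN NUKTA


def canonical(text):
    """Return one canonical form of `text` (NFC is an assumption here)."""
    return unicodedata.normalize("NFC", text)


# The two spellings render alike but are different strings as typed.
print(precomposed == sequence)

# After normalisation both reduce to the same code point sequence.
print(canonical(precomposed) == canonical(sequence))
print([f"U+{ord(ch):04X}" for ch in canonical(precomposed)])
```

Two keyboards that emit different spellings of the same letter would produce text that fails naive comparison and search, which is why input and storage have to be considered together.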
These new features had marked repercussions on both input and storage. An urgent need was felt for a revision in which each new character introduced in Unicode would be accommodated on the keyboard, and in which a uniform manner of entering and storing data would be devised. With this requirement in mind, CDAC GIST involved all the major players in the initiative: IBM, Microsoft and Red Hat Linux. In 2008, a joint meeting was organised between CDAC GIST and senior representatives of these multinationals to devise a common and uniform strategy for input and, equally important, for storage. This would enable the creation of one single keyboard and, more importantly, one single storage representation, essential for all high-end NLP. A task-force was created with two major briefs:
Devanagari is an Indic script used in the Indian subcontinent. Also simply called Nāgari, it is a left-to-right abugida, based on the ancient Brāhmi script. It is one of the official scripts of the Republic of India and Nepal. It was developed and in regular use by the 8th century CE and achieved its modern form by 1000 CE. The Devanāgari script, composed of 48 primary characters, including 14 vowels and 34 consonants, is the fourth most widely adopted writing system in the world, being used for over 120 languages.
The Brahmic scripts, also known as Indic scripts, are a family of abugida writing systems. They are used throughout the Indian subcontinent, Southeast Asia and parts of East Asia. They are descended from the Brahmi script of ancient India and are used by various languages in several language families in South, East and Southeast Asia: Indo-Aryan, Dravidian, Tibeto-Burman, Mongolic, Austroasiatic, Austronesian, and Tai. They were also the source of the dictionary order (gojūon) of Japanese kana.
Devanagari is an Indic script used for many Indo-Aryan languages of North India and Nepal, including Hindi, Marathi and Nepali, which was the script used to write Classical Sanskrit. There are several somewhat similar methods of transliteration from Devanagari to the Roman script, including the influential and lossless IAST notation. Romanised Devanagari is also called Romanagari.
The National Library at Kolkata romanisation is a widely used transliteration scheme in dictionaries and grammars of Indic languages. This transliteration scheme is also known as (American) Library of Congress and is nearly identical to one of the possible ISO 15919 variants. The scheme is an extension of the IAST scheme that is used for transliteration of Sanskrit.
The Tamil script is an abugida script that is used by Tamils and Tamil speakers in India, Sri Lanka, Malaysia, Singapore, and elsewhere to write the Tamil language. It is one of the official scripts of the Indian Republic. Certain minority languages such as Saurashtra, Badaga, Irula and Paniya are also written in the Tamil script.
Indian Standard Code for Information Interchange (ISCII) is a coding scheme for representing various writing systems of India. It encodes the main Indic scripts and a Roman transliteration. The supported scripts are: Bengali–Assamese, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu. ISCII does not encode the writing systems of India that are based on Persian, but its writing system switching codes nonetheless provide for Kashmiri, Sindhi, Urdu, Persian, Pashto and Arabic. The Persian-based writing systems were subsequently encoded in the PASCII encoding.
The International Alphabet of Sanskrit Transliteration (IAST) is a transliteration scheme that allows the lossless romanisation of Indic scripts as employed by Sanskrit and related Indic languages. It is based on a scheme that emerged during the 19th century from suggestions by Charles Trevelyan, William Jones, Monier Monier-Williams and other scholars, and formalised by the Transliteration Committee of the Geneva Oriental Congress, in September 1894. IAST makes it possible for the reader to read the Indic text unambiguously, exactly as if it were in the original Indic script. It is this faithfulness to the original scripts that accounts for its continuing popularity amongst scholars.
ISO 15919 is an international standard for the romanization of Brahmic and Nastaliq scripts. Published in 2001, it is part of a series of international standards by the International Organization for Standardization.
The "Indian languages TRANSliteration" (ITRANS) is an ASCII transliteration scheme for Indic scripts, particularly for the Devanagari script.
The zero-width joiner is a non-printing character used in the computerized typesetting of writing systems in which the shape or positioning of a grapheme depends on its relation to other graphemes, such as the Arabic script or any Indic script. Sometimes the Roman script is to be counted as complex, e.g. when using a Fraktur typeface. When placed between two characters that would otherwise not be connected, a ZWJ causes them to be printed in their connected forms.
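As a rough illustration in Python, the sketch below builds three spellings of the Devanagari sequence KA + VIRAMA + SSA. Under the Unicode Standard's conventions for Devanagari, the plain sequence normally forms a conjunct, a zero-width non-joiner (ZWNJ) after the virama requests a visible virama, and a ZWJ after the virama requests a half-form of the first consonant. The code only shows the stored code points; the actual shapes depend on the font and rendering engine, and the names `variants` and `describe` are hypothetical.

```python
import unicodedata

KA = "\u0915"      # DEVANAGARI LETTER KA
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
SSA = "\u0937"     # DEVANAGARI LETTER SSA
ZWJ = "\u200D"     # ZERO WIDTH JOINER
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER

variants = {
    "plain": KA + VIRAMA + SSA,
    "with ZWNJ": KA + VIRAMA + ZWNJ + SSA,
    "with ZWJ": KA + VIRAMA + ZWJ + SSA,
}


def describe(text):
    """List the code points and character names that make up `text`."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for label, text in variants.items():
    print(label, text)
    for line in describe(text):
        print("   ", line)
```

The joiners are invisible format characters, so the three strings differ in storage even where they look similar on screen, which matters for searching and comparing text.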
The nuqta is a diacritic mark that was introduced in Devanagari and some other Indic scripts to represent sounds not present in the original scripts. It takes the form of a dot placed below a character. The idea is inspired by the Arabic script; for example, some letters in Urdu share the same basic shape but differ in the placement of dot(s) or nuqta(s) in the Perso-Arabic script: the letter ع ayn, with the addition of a nuqta on top, becomes the letter غ g͟hayn.
Indic Computing means "computing in Indic", i.e., Indian Scripts and Languages. It involves developing software in Indic Scripts/languages, Input methods, Localization of computer applications, web development, Database Management, Spell checkers, Speech to Text and Text to Speech applications and OCR in Indian languages.
Siddhaṃ, also known in its later evolved form as Siddhamātṛkā, is a medieval Brahmic abugida, derived from the Gupta script and ancestral to the Nāgarī, Eastern Nagari, Tirhuta, Odia and Nepalese scripts.
Bengali input methods are the different systems developed to type the characters of the Bengali script, for the Bengali language and others, using a typewriter or a computer keyboard.
Clip fonts or split fonts are non-Unicode fonts that assign glyphs of Brahmic scripts, such as Devanagari, at code positions intended for glyphs of the Latin script or to produce glyphs not found in Unicode by using its Private Use Area (PUA).
The Tamil keyboard is used in computers and mobile devices to input text in the Tamil script.
Tamil All Character Encoding (TACE16) is a scheme for encoding the Tamil script in the Private Use Area of Unicode, implementing a syllabary-based character model differing from the modified-ISCII model used by Unicode's existing Tamil implementation.
Swarachakra is a free text input application for Indic scripts developed by the IDIN group at the Industrial Design Center (IDC), Indian Institute of Technology Bombay. Swarachakra's alphabetical keyboard layout performed better than the InScript layout. Currently, it is available for Android devices in twelve languages.
Indic OCR refers to the process of converting text images written in Indic scripts into e-text using Optical character recognition (OCR) techniques. Broadly, it can also refer to the OCR systems of Brahmic scripts for languages of South Asia and Southeast Asia, not just the scripts of the Indian subcontinent, which are all written in an abugida-based writing system.
Meitei input methods are the methods that allow computer users to input text in the Meitei script for the Meitei language.