Pushpak Bhattacharyya

Born: 1962 (age 60–61), India
Nationality: Indian
Citizenship: India
Alma mater: IIT Kharagpur, IIT Kanpur, IIT Bombay
Known for: Director at IIT Patna, Machine translation, Word-sense disambiguation, Sentiment analysis, Psycholinguistics, IndoWordNet, Information Retrieval
Scientific career
Fields: Computer science, Artificial Intelligence, Natural Language Processing
Institutions: IIT Patna, IIT Bombay

Pushpak Bhattacharyya is a computer scientist and a professor in the Computer Science and Engineering Department at IIT Bombay. He served as the director of the Indian Institute of Technology Patna from 2015 to 2021. [1] He is a past president of the Association for Computational Linguistics (2016–17) [2] and a former Vijay and Sita Vashee Chair Professor. [3] He currently heads the Center For Indian Language Technology (CFILT) lab, the natural language processing research group at IIT Bombay.

Education

He completed his undergraduate studies at IIT Kharagpur (B.Tech.) and his master's degree at IIT Kanpur (M.Tech.). He received his Ph.D. from IIT Bombay in 1994.

Research

His research areas are Natural Language Processing, Artificial Intelligence, Machine Learning, Psycholinguistics, Eye Tracking, Information Retrieval, and Indian-language WordNets (IndoWordNet). A significant contribution of his research is multilingual lexical knowledge bases such as IndoWordNet and Projection. He is the author of the textbook 'Machine Translation'. [4] He has led government and industry projects of international and national importance [5] and has received faculty grants [5] from IBM, Microsoft, Yahoo and the United Nations.


He has published more than 350 research papers covering all major areas of NLP in top journals and conferences, and has guided more than 300 students in their PhD, master's and undergraduate research. Automatic sarcasm detection, multilingual computation, Indian-language neural machine translation and IndoWordNet are among his pioneering research directions. He has co-authored three monographs describing cutting-edge research in NLP and ML: 'Investigations in Computational Sarcasm' [6] (Springer, with Dr. Aditya Joshi), 'Cognitively Inspired Natural Language Processing- An Investigation Based on Eye Tracking' [7] (Springer, with Dr. Abhijit Mishra) and 'Machine Translation and Transliteration of Low Resource Related Languages' [8] (CRC Press, Taylor and Francis Group, with Dr. Anoop Kunchukuttan). He has executed sponsored and consultancy projects for various ministries, leading companies and startups, with funding running into tens of crores.


Awards


References

  1. "Prof. Pushpak Bhattacharyya, Director IIT Patna". Indian Institute of Technology Patna. Retrieved 21 November 2018.
  2. "Executive Committee". Retrieved 21 November 2018.
  3. "Chair Professors 2015". Retrieved 21 November 2018.
  4. Bhattacharyya, Pushpak (2015). Machine Translation (1st ed.). ISBN 978-1-4398-9718-8.
  5. "Sponsored Research Projects". Retrieved 21 November 2018.
  6. Investigations in Computational Sarcasm (1st ed.). ISBN 978-981-10-8395-2.
  7. Cognitively Inspired Natural Language Processing- An Investigation Based on Eye Tracking (1st ed.). ISBN 978-981-13-1515-2.
  8. Machine Translation and Transliteration of Low Resource Related Languages (1st ed.). ISBN 9781003096771.
  9. "Prof. Pushpak Bhattacharyya has assumed charge as Director, IIT Patna w.e.f June 03, 2015". Indian Institute of Technology Patna. Retrieved 21 November 2018.
  10. "Pushpak Bhattacharyya". inae.in. Retrieved 21 November 2018.