Julia Hirschberg

Nationality American
Alma mater University of Pennsylvania
University of Michigan
Known for Natural Language Processing
Awards American Academy of Arts and Sciences (2018)
IEEE Fellow (2017)
National Academy of Engineering (2017)
ACM Fellow (2015)
ACL Fellow (2011)
AAAI Fellow (1994)
International Speech Communication Association Fellow (2011)
Honorary Doctorate (Hedersdoktor) KTH (2007)
Columbia Engineering School Alumni Association Distinguished Faculty Teaching award (2009)
IEEE James L. Flanagan Speech and Audio Processing Award (2011)
ISCA Medal for Scientific Achievement (2011)
Scientific career
Fields Computer Science
Thesis A Theory of Scalar Implicature (1985)
Website www.cs.columbia.edu/~julia/

Julia Hirschberg is an American computer scientist noted for her research on computational linguistics and natural language processing. [1]

Hirschberg was elected a member of the National Academy of Engineering in 2017 for contributions to the use of prosody in text-to-speech and spoken dialogue systems, and to audio browsing and retrieval.

She is currently the Percy K. and Vida L. W. Hudson Professor of Computer Science at Columbia University.

Biography

Julia Linn Bell Hirschberg received her first Ph.D., in History (16th-century Mexico), from the University of Michigan in 1976. She served on the History faculty of Smith College from 1974 to 1982. She subsequently shifted to Computer Science, receiving an M.S. in Computer and Information Science from the University of Pennsylvania in 1982 and a Ph.D. in Computer and Information Science from the same university in 1985.

Upon graduating from the University of Pennsylvania in 1985, Hirschberg joined AT&T Bell Labs as a Member of Technical Staff in the Linguistics Research Department, where she worked on improving prosody assignment for text-to-speech synthesis (TTS) in the Bell Labs TTS system. She was promoted to Department Head in 1994, when she created a new Human Computer Interface Research Lab. She and her department remained at Bell Labs until 1996, when they moved to AT&T Labs Research as part of a corporate reorganization. In 2002, she joined the Columbia University faculty as a Professor in the Department of Computer Science. She served as Chair of the Computer Science Department from 2012 to 2018.

Research

Hirschberg's research has included prosody, discourse structure, spoken dialogue systems, speech search, and, more recently, the analysis of deceptive speech. [2] Hirschberg was among the first to combine Natural Language Processing (NLP) approaches to discourse and dialogue with speech research. She pioneered techniques in text analysis for prosody assignment in text-to-speech synthesis at Bell Laboratories in the 1980s and 1990s, developing corpus-based statistical models based upon syntactic and discourse information that are in general use today in TTS systems. [3] [4] With Janet Pierrehumbert, she developed a theoretical model of intonational meaning. [5] She was a leader in the development of the ToBI conventions for intonational description, which have been extended to numerous languages and which today are the most widely used standard for intonational annotation. [6]

Together with Gregory Ward, Hirschberg has pioneered experimental work on intonational sources of language meaning and how these interact with pragmatic phenomena, particularly on the meaning of accented (intonationally prominent) items and the meaning of intonational contours. [7] [8] She has also innovated in numerous other areas involving prosody and meaning, including the role of grammatical function and surface position in pitch accent location, [9] the use of prosody in disambiguating cue phrases (discourse markers) with Diane Litman, [10] the role of prosody in disambiguation in English, Italian, and Spanish with Cinzia Avesani and Pilar Prieto, [11] and the automatic identification of speech recognition errors using prosodic information. [12] At AT&T Labs she worked with Fernando Pereira, Steve Whittaker, and others on speech search [13] and on developing new interfaces for speech navigation. [14] At Columbia, she and her students have continued and extended research on spoken dialogue systems (automatically detecting speech recognition errors [15] and inappropriate system queries, [16] modeling turn-taking behavior, [17] dialogue entrainment, [18] and modeling and generating clarification dialogues [19]); on the automatic classification of trust, charisma, [20] deception, [21] and emotion [22] from speech; on speech summarization; [23] and on prosody translation, hedging behavior in text and speech, [24] text-to-speech synthesis, and speech search in low-resource languages. [25] She also holds several patents in TTS and in speech search. Corpora she and collaborators have collected include the Boston Directions Corpus, the Columbia SRI Colorado Deception Corpus, and the Columbia Games Corpus.

She has served on numerous technical boards and editorial committees. She is now on the Computing Research Association's (CRA) Board of Directors and serves as co-chair of the CRA Committee on the Status of Women in Computing Research (CRA-W), [26] of which she has been a member since 2010. She is also noted for her leadership in broadening participation in computing.

Awards

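Hirschberg's notable awards include election to the American Academy of Arts and Sciences (2018) and the National Academy of Engineering (2017); fellowship of the IEEE (2017), ACM (2015), ACL (2011), International Speech Communication Association (2011), and AAAI (1994); the IEEE James L. Flanagan Speech and Audio Processing Award (2011); the ISCA Medal for Scientific Achievement (2011); the Columbia Engineering School Alumni Association Distinguished Faculty Teaching Award (2009); and an honorary doctorate (Hedersdoktor) from KTH (2007).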

References

  1. "Julia Hirschberg - Google Scholar Citations". scholar.google.com. Retrieved 2018-08-25.
  2. Anne Eisenberg (2011-12-03). "Software That Listens for Lies". New York Times. Retrieved 2014-06-24.
  3. J. Hirschberg (1991). "Using Text Analysis to Predict Intonational Boundaries". Eurospeech 1991: 1275–1278.
  4. J. Hirschberg (1993). "Pitch Accent in Context: Predicting Intonational Prominence from Text". Artificial Intelligence. 63 (1–2): 1–2. CiteSeerX 10.1.1.50.1694. doi:10.1016/0004-3702(93)90020-c.
  5. Pierrehumbert & Hirschberg (1990). "The Meaning of Intonational Contours in the Interpretation of Discourse". Intentions and Plans in Communication and Discourse: 271–311. doi:10.7551/mitpress/3839.003.0016. ISBN 978-0-262-27054-0.
  6. Beckman, M. E.; Hirschberg, J. & Shattuck-Hufnagel, S. (2004). "The original ToBI system and the evolution of the ToBI framework". Prosodic Typology: The Phonology of Intonation and Phrasing: 9–54.
  7. J. Hirschberg; G. Ward (1992). "The Influence of Pitch Range, Duration, Amplitude, and Spectral Features on the Interpretation of L*+H L H%". Journal of Phonetics. doi:10.1016/S0095-4470(19)30625-4.
  8. J. Hirschberg; G. Ward (1991). "Accent and Bound Anaphora". Cognitive Linguistics. 2 (2): 101–122. doi:10.1515/cogl.1991.2.2.101. S2CID 144771310.
  9. J. Terken; J. Hirschberg (1994). "Deaccentuation and Persistence of Grammatical Function and Surface Position". Language and Speech. 37 (2): 125–145. doi:10.1177/002383099403700202. S2CID 145696152.
  10. J. Hirschberg; D. Litman (1993). "Empirical Studies on the Disambiguation of Cue Phrases". Computational Linguistics.
  11. J. Hirschberg; C. Avesani (2000). "Prosodic Disambiguation in English and Italian". Intonation. Text, Speech and Language Technology. Vol. 15. pp. 87–95. doi:10.1007/978-94-011-4317-2_4. ISBN 978-0-7923-6723-9.
  12. Julia Hirschberg; Diane Litman; Marc Swerts (2004). "Prosodic and Other Cues to Speech Recognition Failures". Speech Communication. 43 (1–2): 155–175. doi:10.1016/j.specom.2004.01.006.
  13. J. Choi; D. Hindle; J. Hirschberg; F. Pereira; A. Singhal; S. Whittaker (1999). "Spoken Content-Based Audio Navigation (SCAN)". ICPhS-99.
  14. S. Whittaker; J. Hirschberg; J. Choi; D. Hindle; F. Pereira; A. Singhal (1999). "SCAN: Designing and evaluating user interfaces to support retrieval from speech archives". Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval. pp. 26–33. doi:10.1145/312624.312639. ISBN 978-1581130966. S2CID 15089338.
  15. E. Pincus; S. Stoyanchev; J. Hirschberg (2013). "Exploring Features for Localized Detection of Speech Recognition Errors". SigDIAL.
  16. Alex Liu; Rose Sloan; Mei-Vern Then; Svetlana Stoyanchev; Julia Hirschberg; Elizabeth Shriberg (2014). "Detecting Inappropriate Clarification Requests in Spoken Dialogue Systems". SigDIAL.
  17. Agustín Gravano; Julia Hirschberg (2011). "Turn-taking cues in task-oriented dialogue". Computer Speech and Language. 25 (3): 601–634. doi:10.1016/j.csl.2010.10.003. hdl:11336/68351.
  18. Z. Xia; R. Levitan; J. Hirschberg (2014). "Prosodic Entrainment in Mandarin and English: A Cross-Linguistic Comparison". Speech Prosody. doi:10.21437/SpeechProsody.2014-1. S2CID 15063969.
  19. S. Stoyanchev; A. Liu; J. Hirschberg (2013). "Modeling Human Clarification Strategies". SigDIAL.
  20. F. Biadsy; A. Rosenberg; R. Carlson; J. Hirschberg; E. Strangert (2008). "A Cross-cultural Comparison of American, Palestinian, and Swedish Perception of Charismatic Speech". Speech Prosody.
  21. J. Hirschberg; S. Benus; J. M. Brenier; F. Enos; S. Friedman; S. Gilman; C. Girand; M. Graciarena; A. Kathol; L. Michaelis; B. Pellom; E. Shriberg; A. Stolcke (2005). "Distinguishing deceptive from non-deceptive speech". Interspeech: 1833–1836. doi:10.21437/Interspeech.2005-580. S2CID 6415344.
  22. J. Liscombe; J. Hirschberg; J. J. Venditti (2005). "Detecting certainness in spoken tutorial dialogues". Interspeech: 1837–1840. doi:10.21437/Interspeech.2005-581. S2CID 12221841.
  23. S. Maskey; J. Hirschberg (2005). "Comparing Lexical, Acoustic/Prosodic, Discourse and Structural Features for Speech Summarization". Interspeech. doi:10.21437/Interspeech.2005-66.
  24. A. Prokofieva; J. Hirschberg (2014). "Hedging and Speaker Commitment". LREC.
  25. V. Soto; L. Mangu; A. Rosenberg; J. Hirschberg (2014). "A Comparison of Multiple Methods for Rescoring Keyword Search Lists for Low Resource Languages". Interspeech.
  26. "Women in Computing: ACM SIGGRAPH Meets ACM-W". ACM SIGGRAPH. 2017-07-28. Retrieved 2018-02-03.
  27. "NAE Website - National Academy of Engineering Elects 84 Members and 22 Foreign Members". www.nae.edu. Retrieved 2017-02-09.
  28. Association for Computing Machinery. "ACM Fellows Named for Computing Innovations that Are Advancing Technology in the Digital Age". ACM. Archived from the original on 2015-12-09. Retrieved 2015-12-08.
  29. Columbia University. "Julia Hirschberg Elected to American Philosophical Society". Columbia University. Retrieved 2015-03-26.
  30. Association for Laboratory Phonology. "About the Association for Laboratory Phonology". Association for Laboratory Phonology. Retrieved 2015-03-26.
  31. Columbia University. "CS Professors Named Founding Fellows of Association for Computational Linguistics". Columbia University. Archived from the original on 2012-05-09. Retrieved 2015-03-17.
  32. International Speech Communication Association (2010-11-05). "Professor Julia Hirschberg awarded with the ISCA Scientific Achievement Medal 2011". ISCA. Retrieved 2014-06-24.
  33. IEEE. "IEEE James L. Flanagan Speech and Audio Processing Award Recipients". IEEE. Archived from the original on April 7, 2010. Retrieved 2014-06-24.
  34. KTH (2007-09-07). "Hans Blix new Honorary Doctor at KTH". KTH. Retrieved 2014-06-24.
  35. "Elected AAAI Fellows". AAAI. Retrieved 2024-01-04.