Sinewave synthesis, or sine wave speech, is a technique for synthesizing speech by replacing the formants (main bands of energy) with pure tone whistles. The first sinewave synthesis program (SWS) for the automatic creation of stimuli for perceptual experiments was developed by Philip Rubin at Haskins Laboratories in the 1970s. This program was subsequently used by Robert Remez, Philip Rubin, David Pisoni, and other colleagues to show that listeners can perceive continuous speech without traditional speech cues, i.e., pitch, stress, and intonation. This work paved the way for a view of speech as a dynamic pattern of trajectories through articulatory-acoustic space. [1]
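A minimal sketch of the technique, in Python with NumPy, replaces each formant with one time-varying sinusoid. The formant tracks below are illustrative placeholders, not measurements from real speech, and the function name is invented for this example; a real SWS system derives its tracks from formant analysis of an utterance.

```python
import numpy as np

def sinewave_speech(formant_tracks, amp_tracks, sr=16000):
    """Sum one time-varying sinusoid ("whistle") per formant track.

    formant_tracks: list of arrays of instantaneous frequency (Hz)
    amp_tracks: list of matching arrays of amplitudes (0..1)
    """
    out = np.zeros(len(formant_tracks[0]))
    for freqs, amps in zip(formant_tracks, amp_tracks):
        # Integrate frequency to get phase, so frequency can vary smoothly.
        phase = 2 * np.pi * np.cumsum(freqs) / sr
        out += amps * np.sin(phase)
    return out / len(formant_tracks)

# Illustrative three-"formant" glide, 0.5 s at 16 kHz.
n = 8000
f1 = np.linspace(300, 700, n)    # hypothetical F1 track
f2 = np.linspace(2200, 1000, n)  # hypothetical F2 track
f3 = np.full(n, 2500.0)          # hypothetical F3 track
a = np.full(n, 1.0)
signal = sinewave_speech([f1, f2, f3], [a, a, a])
```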
Additive synthesis is a sound synthesis technique that creates timbre by adding sine waves together.
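As a brief illustration, the sketch below builds a timbre by summing harmonics of a 220 Hz fundamental with 1/n amplitudes, giving a sawtooth-like spectrum; all parameter values are arbitrary choices for the example.

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr           # one second of sample times
f0 = 220.0                       # fundamental frequency (Hz)
# Sum the first eight harmonics, each at amplitude 1/n.
tone = sum((1.0 / n) * np.sin(2 * np.pi * n * f0 * t) for n in range(1, 9))
tone /= np.max(np.abs(tone))     # normalize to avoid clipping
```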
Perception is the organization, identification, and interpretation of sensory information in order to represent and understand the presented information or environment.
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech computer or speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech.
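A minimal software example of text-to-speech, assuming the third-party pyttsx3 package is installed (pip install pyttsx3); this simply drives an off-the-shelf offline synthesizer and is not a description of any particular system discussed here.

```python
import pyttsx3

engine = pyttsx3.init()           # select the platform's speech engine
engine.setProperty('rate', 150)   # speaking rate in words per minute
engine.say("Sinewave synthesis replaces formants with pure tones.")
engine.runAndWait()               # block until the utterance finishes
```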
In psychophysics, the branch of experimental psychology concerned with sense, sensation, and perception, a just-noticeable difference (JND) is the amount by which something must be changed for a difference to be noticeable, i.e., detectable at least half the time. This limen is also known as the difference limen, difference threshold, or least perceptible difference.
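A standard formalization of the JND, not stated explicitly above, is Weber's law, under which the just-noticeable increment grows in proportion to the baseline intensity:

```latex
\frac{\Delta I}{I} = k
```

where \(I\) is the baseline stimulus intensity, \(\Delta I\) is the JND, and \(k\) is the Weber fraction for the sensory dimension in question. For example, with \(k = 0.1\), a change from 100 g to 110 g is just detectable, and at a 200 g baseline roughly 20 g of change is needed.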
Auditory phonetics is a branch of phonetics concerned with the hearing of speech sounds and with speech perception. It thus entails the study of the relationships between speech stimuli and a listener’s responses to such stimuli as mediated by mechanisms of the peripheral and central auditory systems, including certain cortical areas of the brain.
Speech perception is the process by which the sounds of language are heard, interpreted, and understood. The study of speech perception is closely linked to the fields of phonology and phonetics in linguistics and to cognitive psychology and perception in psychology. Research in speech perception seeks to understand how human listeners recognize speech sounds and use this information to understand spoken language. Speech perception research has applications in building computer systems that can recognize speech, in improving speech recognition for hearing- and language-impaired listeners, and in foreign-language teaching.
Haskins Laboratories, Inc. is an independent 501(c) non-profit corporation, founded in 1935 and located in New Haven, Connecticut, since 1970. Upon moving to New Haven, Haskins entered into formal affiliation agreements with both Yale University and the University of Connecticut; it remains fully independent, administratively and financially, of both Yale and UConn. Haskins is a multidisciplinary and international community of researchers that conducts basic research on spoken and written language. A guiding perspective of their research is to view speech and language as emerging from biological processes, including those of adaptation, response to stimuli, and conspecific interaction. The Laboratories has a long history of technological and theoretical innovation, from creating rule systems for speech synthesis and building an early working prototype of a reading machine for the blind to developing the landmark concept of phonemic awareness as the critical preparation for learning to read an alphabetic writing system.
Philip E. Rubin is an American cognitive scientist, technologist, and science administrator known for raising the visibility of behavioral and cognitive science and neuroscience at a national level. During his research career he was best known for his pioneering development of articulatory synthesis and sinewave synthesis, and their use in studying complex temporal events, including understanding the biological bases of speech and language. He is the Chief Executive Officer emeritus and a member of the Board of Directors of Haskins Laboratories in New Haven, Connecticut, where he was also a senior scientist. In addition, he is a Professor Adjunct in the Department of Surgery, Otolaryngology at the Yale University School of Medicine, a Research Affiliate in the Department of Psychology at Yale University, a Fellow at Yale's Trumbull College, and a Trustee of the University of Connecticut. He is also President-Elect of the Federation of Associations in Behavioral and Brain Sciences (FABBS).
Carol Ann Fowler is an American experimental psychologist. She was President and Director of Research at Haskins Laboratories in New Haven, Connecticut, from 1992 to 2008. She is also a Professor of Psychology at the University of Connecticut and Adjunct Professor of Linguistics and Psychology at Yale University. She received her undergraduate degree from Brown University in 1971, her M.A. from the University of Connecticut in 1973, and her Ph.D. in Psychology from the University of Connecticut in 1977.
The pattern playback is an early talking device that was built by Dr. Franklin S. Cooper and his colleagues, including John M. Borst and Caryl Haskins, at Haskins Laboratories in the late 1940s and completed in 1950. There were several different versions of this hardware device. Only one currently survives. The machine converts pictures of the acoustic patterns of speech in the form of a spectrogram back into sound. Using this device, Alvin Liberman, Frank Cooper, and Pierre Delattre were able to discover acoustic cues for the perception of phonetic segments. This research was fundamental to the development of modern techniques of speech synthesis, reading machines for the blind, the study of speech perception and speech recognition, and the development of the motor theory of speech perception.
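A modern software analogue of this spectrogram-to-sound conversion can be sketched with librosa's Griffin-Lim phase reconstruction (assuming librosa, NumPy, and soundfile are installed, and a hypothetical input file speech.wav). The pattern playback itself was an optical-mechanical device; this only illustrates the same underlying idea of resynthesizing audio from a magnitude spectrogram.

```python
import numpy as np
import librosa
import soundfile as sf

y, sr = librosa.load('speech.wav', sr=None)   # hypothetical input recording
S = np.abs(librosa.stft(y, n_fft=1024))       # magnitude spectrogram ("the picture")
y_hat = librosa.griffinlim(S, n_iter=32)      # estimate phase, invert to a waveform
sf.write('resynthesized.wav', y_hat, sr)      # audible reconstruction
```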
Robert Remez, an American experimental psychologist and cognitive scientist, is Professor of Psychology at Barnard College, Columbia University, and Chair of the Columbia University Seminar on Language & Cognition. His teaching focuses on the relationships between cognition, perception, and language. He is best known for his theoretical and experimental work on perceptual organization and speech perception.
Articulatory synthesis refers to computational techniques for synthesizing speech based on models of the human vocal tract and the articulation processes occurring there. The shape of the vocal tract can be controlled in a number of ways, usually by modifying the position of the speech articulators, such as the tongue, jaw, and lips. Speech is created by digitally simulating the flow of air through the representation of the vocal tract.
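As a greatly simplified illustration of vocal tract modeling (a toy calculation, not any particular synthesizer), the resonances of a uniform lossless tube closed at one end (the glottis) and open at the other (the lips) fall at odd quarter-wavelength frequencies, F_n = (2n - 1)c / 4L; with a tract length of about 17.5 cm these come out near the formants of a neutral vowel.

```python
def tube_formants(length_cm=17.5, c_cm_per_s=35000.0, n_formants=3):
    """Resonances of a uniform tube closed at the glottis, open at the lips."""
    return [(2 * n - 1) * c_cm_per_s / (4 * length_cm)
            for n in range(1, n_formants + 1)]

print(tube_formants())  # [500.0, 1500.0, 2500.0] Hz, roughly a neutral vowel
```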
Articulatory phonology is a linguistic theory originally proposed in 1986 by Catherine Browman of Haskins Laboratories and Louis M. Goldstein of Yale University and Haskins. The theory identifies theoretical discrepancies between phonetics and phonology and aims to unify the two by treating them as low- and high-dimensional descriptions of a single system.
Franklin Seaney Cooper was an American physicist and inventor who was a pioneer in speech research.
Katherine Safford Harris is a noted psychologist and speech scientist. She is Distinguished Professor Emerita in Speech and Hearing at the CUNY Graduate Center and a member of the Board of Directors of Haskins Laboratories. She is also the former President of the Acoustical Society of America and Vice President of Haskins Laboratories.
Catherine P. Browman (1945–2008) was an American linguist and speech scientist. She was a research scientist at Bell Laboratories in New Jersey and Haskins Laboratories in New Haven, Connecticut, from which she retired due to illness. While at Bell Laboratories, she was known for her work on speech synthesis using demisyllables. She was best known for development, with Louis Goldstein, of the theory of articulatory phonology, a gesture-based approach to phonological and phonetic structure. The theoretical approach is incorporated in a computational model that generates speech from a gesturally specified lexicon. She received her Ph.D. in linguistics from UCLA in 1978 and was a founding member of the Association for Laboratory Phonology.
Elliot Saltzman is an American psychologist and speech scientist. He is a professor in the Department of Physical Therapy at Boston University and a Senior Scientist at Haskins Laboratories in New Haven, Connecticut. He is best known for his development, with J. A. Scott Kelso, of "task dynamics." He is also known for his contributions to the development of a gestural-computational model at Haskins Laboratories that combines task dynamics with articulatory phonology and articulatory synthesis. His research interests include the application of theories and methods of nonlinear dynamics and complexity theory to understanding the dynamical and biological bases of sensorimotor coordination and control. He is the co-founder, with Philip Rubin, of the IS group.
Michael T. Turvey is the Board of Trustees' Distinguished Professor of Experimental Psychology at the University of Connecticut and a Senior Scientist at Haskins Laboratories in New Haven, Connecticut. He is best known for his pioneering work in ecological psychology and in applying the dynamical systems approach to the study of motor behavior. He is the founder of the Center for the Ecological Study of Perception and Action. His research spans a number of areas, including dynamic touch and haptics, interlimb coordination, visual perception and optic flow, postural stability, visual word recognition, and speech perception. Along with William Mace and Robert Shaw, he has been one of the leading explicators of the ecological psychology of J. J. Gibson. His pioneering work with J. A. Scott Kelso and Peter N. Kugler introduced the physical language of complex systems to the understanding of perception and action. He also helped to introduce the ideas of the Russian motor control theorist Nikolai Bernstein and his colleagues to a larger audience. Working with Georgije Lukatela and other colleagues at Haskins Laboratories, he has exploited the dual nature of the Serbo-Croatian orthography to help understand word recognition.
Michael Studdert-Kennedy (1927–2017) was an American psychologist and speech scientist (https://haskinslabs.org/news/michael-studdert-kennedy). He was well known for his contributions to studies of speech perception, the motor theory of speech perception, and the evolution of language, among other areas. He was Professor Emeritus of Psychology at the University of Connecticut and Professor Emeritus of Linguistics at Yale University, and served as President of Haskins Laboratories in New Haven, Connecticut, from 1986 to 1992. He was also a member of the Haskins Laboratories Board of Directors and was Chairman of the Board from 1988 until 2001.
Dichotic listening is a psychological test commonly used to investigate selective attention and the lateralization of brain function within the auditory system. It is used within the fields of cognitive psychology and neuroscience.
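A minimal sketch of constructing a dichotic stimulus, i.e., a different signal presented to each ear of a stereo file; here two pure tones stand in for the competing speech items a real test would use, and all values are illustrative (requires NumPy and soundfile).

```python
import numpy as np
import soundfile as sf

sr = 44100
t = np.arange(sr) / sr                      # one second of sample times
left = 0.5 * np.sin(2 * np.pi * 440 * t)    # 440 Hz tone to the left ear
right = 0.5 * np.sin(2 * np.pi * 550 * t)   # 550 Hz tone to the right ear
# Stack into an (n_samples, 2) array: column 0 = left, column 1 = right.
sf.write('dichotic.wav', np.column_stack([left, right]), sr)
```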