Sensory, Inc.

Industry
Founded 1994
Headquarters Santa Clara, California, U.S.
Website https://www.sensory.com/

Sensory, Inc. is an American company that develops AI software technologies for speech, sound, and vision. [1] [2] It is based in Santa Clara, California.

Sensory’s technologies have shipped in over three billion products from hundreds of leading consumer electronics manufacturers, including AT&T, Hasbro, Huawei, Google, Amazon, Samsung, LG, Mattel, Motorola, Plantronics, GoPro, Sony, Tencent, Garmin, Microsoft, and Lenovo. Sensory holds over 60 issued patents covering speech recognition in consumer electronics, biometric authentication, sensor/speech combinations, and wake word technology, among other areas.

History

Sensory, Inc. was founded in 1994, originally as Sensory Circuits, by Forrest Mozer, Mike Mozer, and Todd Mozer. The three had also co-founded ESS Technology years earlier. In 1999 Sensory acquired Fluent Speech Technologies, which had been started by a group of professors from the Oregon Graduate Institute (OGI, now OHSU). Fluent Speech Technologies developed high-performance embedded speech engines, and the technology from this acquisition is now the core technology used throughout Sensory's chip and software line. [3]

Company timeline

Technology and products

Sensory originally developed both hardware (integrated circuits, or "chips") and software platforms, but migrated to software only around 2005 and added cloud and hybrid computing capabilities in 2021. Sensory's RSC-164 chip was used in the Mars Microphone on NASA's Mars Polar Lander. [6] For its SC-6x speech synthesis chips, Sensory acquired some speech synthesis technology from Texas Instruments. [7]

Sensory’s embedded AI solutions include the following:

The cloud initiative, SensoryCloud.ai, targets speech-to-text (STT), text-to-speech (TTS), wake word verification, face and voice recognition, and sound identification.

References

  1. "Sensory, Inc". Bloomberg Businessweek. Archived from the original on 3 December 2012.
  2. TCZ Webmaster (21 August 2006). "Sensory Inc". The Commodore Zone. Archived from the original on 14 February 2012.
  3. Rae-Dupree, Janet (2 January 2005). "Smaller, cheaper voice chip speaks loudly for future uses". San Jose Business Journal. Archived from the original on 16 May 2008.
  4. "Best Mobile Technology Breakthrough".
  5. Sensory, Inc. (12 January 2024). "Sensory Leads Automotive Technology Innovation With Groundbreaking AI Platform Unveiling At CES 2024". PRWeb. Retrieved 2024-04-04.
  6. "Mars Microphone". The Planetary Society. Archived from the original on 27 January 2012.
  7. Quan, Margaret (14 June 2001). "TI will exit dedicated speech-synthesis chips, transfer products to Sensory". EE Times. Archived from the original on 28 May 2012.
  8. "Automotive AI". Sensory. Retrieved 2024-04-04.