PlainTalk

PlainTalk is the collective name for several speech synthesis (MacinTalk) and speech recognition technologies developed by Apple Inc. In 1990, Apple invested heavily in speech recognition technology, hiring many researchers in the field. The result was "PlainTalk", released with the AV models in the Macintosh Quadra series in 1993. It became a standard system component in System 7.1.2 and has since shipped on all PowerPC and some 68k Macintoshes.

Software

Speech synthesis

Technology

Apple's text-to-speech uses diphones. Compared to other methods of synthesizing speech, this approach is not very resource-intensive, but it limits how natural the synthesized speech can sound. American English and Spanish versions have been available, but since the advent of Mac OS X, Apple has shipped only American English voices, relying on third-party suppliers such as Acapela Group to supply voices for other languages (in OS X 10.7, Apple licensed many third-party voices and made them available for download within the Speech control panel).

An application programming interface known as the Speech Manager enables third-party developers to use speech synthesis in their applications. There are various control sequences that can be used to fine-tune the intonation and rhythm. The volume, pitch and rate of the speech can be configured as well, allowing for singing.
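The control sequences mentioned above take the form of embedded speech commands placed inline in the text to be spoken. The following is a minimal sketch, assuming the commonly documented `[[rate]]`, `[[pbas]]`, `[[volm]]` and `[[slnc]]` commands; the helper names `with_speech_settings` and `pause` are hypothetical and are not part of the Speech Manager. It only builds the annotated string and does not call any Apple API.

```python
def with_speech_settings(text, rate=None, pitch=None, volume=None):
    """Prefix text with embedded speech commands for the given settings.

    Assumed command meanings: rate is the speaking rate in words per
    minute, pitch is the baseline pitch value, volume runs 0.0 to 1.0.
    """
    commands = []
    if rate is not None:
        commands.append(f"[[rate {rate}]]")
    if pitch is not None:
        commands.append(f"[[pbas {pitch}]]")
    if volume is not None:
        if not 0.0 <= volume <= 1.0:
            raise ValueError("volume must be between 0.0 and 1.0")
        commands.append(f"[[volm {volume}]]")
    # Commands come first, then the text they apply to.
    return " ".join(commands + [text])


def pause(milliseconds):
    """Return an embedded command that inserts a stretch of silence."""
    return f"[[slnc {milliseconds}]]"


phrase = " ".join([
    with_speech_settings("Hello.", rate=180, pitch=50),
    pause(400),
    with_speech_settings("Goodbye.", rate=120, volume=0.5),
])
print(phrase)
```

On a Mac, a string built this way could be handed to a synthesizer, for example through the `say` command; whether a particular voice honors every command is not guaranteed and should be checked on the target system.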

Input to the synthesizer can be controlled explicitly using a special phoneme alphabet.
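As a rough illustration of explicit phoneme input, the sketch below wraps a phoneme string in the embedded commands `[[inpt PHON]]` and `[[inpt TEXT]]`, which switch the synthesizer's input mode to phonemes and back to ordinary text. The helper names are hypothetical, and the phoneme string `"hEHlOW"` is an illustrative placeholder that has not been checked against Apple's phoneme table.

```python
PHONEME_MODE = "[[inpt PHON]]"  # switch input interpretation to phonemes
TEXT_MODE = "[[inpt TEXT]]"     # switch back to ordinary text


def as_phonemes(phonemes):
    """Wrap a phoneme string so that only it is read in phoneme mode."""
    return f"{PHONEME_MODE}{phonemes}{TEXT_MODE}"


def mixed_utterance(parts):
    """Join (kind, content) pairs, where kind is 'text' or 'phon'."""
    rendered = []
    for kind, content in parts:
        if kind == "phon":
            rendered.append(as_phonemes(content))
        elif kind == "text":
            rendered.append(content)
        else:
            raise ValueError(f"unknown part kind: {kind!r}")
    return " ".join(rendered)


# "hEHlOW" is a placeholder, not a verified transcription.
utterance = mixed_utterance([("phon", "hEHlOW"), ("text", "and welcome.")])
print(utterance)
```

Switching back to text mode after each phoneme run keeps the rest of the utterance from being misread as phonemes.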

Original MacinTalk

MacinTalk 1 demo

The initial Macintosh text-to-speech engine, MacinTalk (named by Denise Chandler), was used by Apple in the 1984 introduction of the Macintosh, in which the computer announced itself to the world (and poked fun at the weight of an IBM computer). While it was incorporated into the Macintosh's operating system, it was not officially supported by Apple (though programming information was made available through an Apple Technical Note [1] [2]). MacinTalk was developed by Joseph Katz and Mark Barton, who later founded SoftVoice, Inc., which currently markets TTS engines for Windows, Linux and embedded platforms. MacinTalk accessed the original Macintosh sound hardware directly, and all attempts by Apple to license the source code in order to update it for newer Macs failed. [3] [4]

MacinTalk 2

MacinTalk 2 demo featuring the Mr. Hughes and Marvin voices

Eventually, Apple released a supported speech synthesis system, called MacinTalk 2. It runs on any Macintosh with System Software 6.0.7 or later. It remained the recommended version for slower machines even after the release of MacinTalk 3 and Pro.

MacinTalk 3, Pro

MacinTalk 3 introduced a great variety of voices. Apart from the standard adult voices "Ralph", "Fred" and "Kathy", and children's voices like "Princess" (renamed "Superstar" in macOS Ventura) and "Junior", various novelty voices were included, like "Whisper", "Zarvox" (a robotic voice with melodic background sounds, with a similar voice called "Trinoids" also included), "Cellos" (a voice that sang its text to the tune of Edvard Grieg's "In the Hall of the Mountain King"; other singing voices included "Good News", "Bad News" and "Pipe Organ"), "Albert" (a hoarse-sounding voice), "Bells", "Boing", "Bubbles", and others.

Each of these voices came with its own example text that would be spoken when one hit the "Test" button in the Speech control panel. Some would just say their name, language and the version of MacinTalk they were introduced with. Others would say funny things, like "I sure like being inside this fancy computer", "I have a frog in my throat... No, I mean a real frog!", "We must rejoice in this morbid voice" (a parody of Western church hymnody with organ music), or "The light you see at the end of the tunnel is the headlamp of a fast approaching train". These voices are still in macOS today. (A few of the voice names and their test texts were changed with macOS Ventura, and then all their test texts were changed in macOS Sonoma to "Hello, my name is [voice name].")

With the increase in computing power that the AV Macs and PowerPC-based Macintoshes provided, Apple could afford to increase the quality of the synthesis. MacinTalk 3 required a 33 MHz 68030 processor, and MacinTalk Pro required a 68040 or better and at least 1 MB of RAM. Each synthesizer supported a different set of voices.

Text-to-speech in Mac OS X

Text-to-speech has been a part of every Mac OS X (later macOS) version. In Mac OS X v10.3, a significantly enhanced version of the Victoria voice was added as Vicki (Victoria was not removed). Vicki was almost 20 times larger than Victoria because of the higher-quality diphone samples used.

A new, much more natural-sounding voice called "Alex" was added to the Mac text-to-speech roster with the release of Mac OS X 10.5 Leopard. [5]

With Mac OS X 10.7 Lion, voices became available in additional U.S. English and other English accents, as well as 21 other languages. [6]

The "Speak selected text when the key is pressed" feature allows selected text from any application to be read aloud via a key combination. From Mac OS X 10.1 to Mac OS X 10.6, the feature copied the selected text to the clipboard and read it from there. From Mac OS X 10.7 to Mac OS X 10.10, a new implementation of the feature required software developers to adopt a speech synthesis API in their applications. [7] [8] This prevented the clipboard from being overwritten, but it also meant that, in applications that did not use the API, the feature did not work as expected and read the title bar rather than the selected text. [9] [10]

Siri was introduced for the Mac in macOS Sierra 10.12. However, its voice was not available as a System Voice, which meant that the Siri voices could be used only within Siri. Siri was made available as a System Voice in macOS Catalina 10.15, so that it could be used for any text. The Siri voices work in a completely different way, and the say command remains unable to use them.

In the macOS Big Sur 11.3 update, gender references to all voices were removed, coinciding with the change in Siri voices on iOS 14.5 and macOS 11.3 and later, as part of Apple's efforts to promote gender inclusivity.

Speech recognition

Apple hired many speech recognition researchers in 1990. After about a year, they demonstrated a technology codenamed Casper. It was released as part of the PlainTalk package in 1993. Although available for all PowerPC Macintoshes and AV 68k machines (it was one of the few applications that made use of the DSP in the Centris 660AV and Quadra 840AV), it was not part of the default system install prior to Mac OS X, requiring the user to perform a custom OS installation to get speech recognition capabilities.

In Mac OS X 10.7 Lion and earlier, Apple's speech recognition was voice-command oriented only, i.e. not intended for dictation. It can be configured to listen for commands while a hot key is pressed, after being addressed with an activation phrase such as "Computer" or "Macintosh", or without any prompt. A graphical status monitor, often in the form of an animated character, provides visual and textual feedback about listening status, available commands and actions. It can also respond to the user using speech synthesis.
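The three listening configurations described above can be pictured as a small gating step that decides whether a recognized utterance counts as a command. This is a conceptual sketch only; `ListeningMode` and `extract_command` are hypothetical names and do not reflect Apple's implementation or API.

```python
from enum import Enum


class ListeningMode(Enum):
    HOT_KEY = "hot key"                      # listen only while a key is held
    ACTIVATION_PHRASE = "activation phrase"  # command must follow a spoken name
    NO_PROMPT = "no prompt"                  # every utterance is a candidate


def extract_command(utterance, mode, key_down=False, activation_phrase="Computer"):
    """Return the command portion of an utterance, or None if it is ignored."""
    words = utterance.strip()
    if not words:
        return None
    if mode is ListeningMode.HOT_KEY:
        return words if key_down else None
    if mode is ListeningMode.ACTIVATION_PHRASE:
        prefix = activation_phrase.lower()
        lowered = words.lower()
        if lowered == prefix:
            return None  # the phrase alone carries no command
        # Require a word boundary so "Computers ..." does not match "Computer".
        if lowered.startswith(prefix) and not lowered[len(prefix)].isalnum():
            return words[len(prefix):].lstrip(" ,") or None
        return None
    return words


print(extract_command("Computer, open Safari", ListeningMode.ACTIVATION_PHRASE))
```

In a real recognizer the gating would operate on recognition results rather than plain strings, but the decision logic has the same shape.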

Early versions of the speech recognition provided full access to the menus. This support was later removed, since it required too many resources and made recognition less reliable, only to be re-added in Mac OS X 10.3 as a "universal access technology" called spoken user interface.

The user can launch items located in a special folder, called "Speakable Items", simply by speaking their name (while the system is in listening mode). Apple shipped a number of AppleScripts in this folder, but aliases, documents and folders can be opened in the same way.
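The folder-based lookup can be sketched as follows: each item's file name (without its extension) acts as the spoken command, and a recognized phrase is matched against those names. This is an assumption-laden illustration rather than Apple's actual matching logic; `normalize`, `index_speakable_items` and `find_item` are hypothetical helpers, and the demo uses a temporary folder with made-up file names.

```python
import tempfile
from pathlib import Path


def normalize(name):
    """Lower-case a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def index_speakable_items(folder):
    """Map each visible item's normalized name (minus extension) to its path."""
    return {
        normalize(path.stem): path
        for path in sorted(Path(folder).iterdir())
        if not path.name.startswith(".")  # skip hidden files
    }


def find_item(spoken, index):
    """Return the item whose name matches the spoken phrase, if any."""
    return index.get(normalize(spoken))


# Demo with a throwaway folder standing in for "Speakable Items".
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Open Mail.scpt", "Close This Window.scpt", "Notes.txt"):
        (Path(tmp) / name).touch()
    items = index_speakable_items(tmp)
    match = find_item("open   MAIL", items)
    print(match.name if match else "no match")
```

An exact-name lookup keeps the example simple; a real recognizer would score acoustic matches against the vocabulary instead of comparing strings.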

Additional functionality is provided by individual applications. An application programming interface lets programs define and modify an available vocabulary. For example, the Finder provides a vocabulary for manipulating files and windows.

In OS X 10.8 Mountain Lion, Apple introduced "Dictation", [11] intended for general text. Originally, it required sending audio data to Apple servers for processing. In OS X 10.9 Mavericks, Apple added the option to download support for dictation without an Internet connection. As of OS X 10.9.3, eight languages (19 dialects) are supported.

Hardware

Apple produced two microphones under the product name "Apple PlainTalk Microphone". [citation needed] The first was included with Macintosh LC and early Performa models, and was circular in appearance. It was designed to sit in a holder attached to the side of a CRT display, and to be lifted out and held near the mouth when talking. [citation needed] The second model was introduced alongside the AV models in the Macintosh Quadra series in 1993 but was also sold separately. It was designed to be positioned on top of the screen and to be sensitive to sound from the front. Both models had a connector longer than a standard 3.5 mm audio plug, the tip of which was used to provide the microphone with bias voltage.

References

  1. Ginger Jernigan; Jim Reekes (June 1989) [April 1985]. "Technical Note #019: How To Produce Continuous Sound Without Clicking". Apple Computer Inc. Retrieved 18 September 2019.
  2. Jim Reekes (February 1, 1990). "Technical Note PT22, a.k.a. #268: MacinTalk—The Final Chapter". Apple Computer Inc. Retrieved 18 September 2019. The outcome of this work was MacinTalk. MacinTalk is a file that can be placed into the System Folder of an ordinary Macintosh computer and allow text to be transformed into speech for the introduction in 1984. It was felt to be an interesting piece of software, so Apple made it available to developers. Interfaces to MacinTalk were published and Apple Software Licensing allowed it to be included with developers' products. The original project was to get a speech driver for the Macintosh, but it did not include obtaining the source code to this driver. Apple only has exactly what it gives to developers: a file to be copied into the System Folder, and this file cannot be changed since Apple does not have the source code. [The original] MacinTalk works by using a VBL task to write data directly to the sound hardware of the Macintosh Plus and SE logic boards—a method which Apple does not support. It has only been through the efforts of the Sound Manager that software that writes directly to this sound hardware continues to work. MacinTalk continues to write to the hardware addresses of the Macintosh 128K logic board, but the Sound Manager and the Apple Sound Chip work together to allow programs like MacinTalk to continue working on newer machines. The Sound Manager and the Apple Sound Chip [ASC] were introduced with the Macintosh II. The Sound Manager watches the hardware addresses that used to be present on the Macintosh. When the Sound Manager detects activity at one of these addresses, it goes into a "compatibility" mode. In this mode, it routes the data to the real sound hardware, but while this is happening, proper Sound Manager code cannot run—even the Sound Manager's _SysBeep does not work when MacinTalk is in use. Furthermore, the compatibility mode cannot be turned off until the application requiring it calls _ExitToShell. Even an application that uses sound properly, with correct code, does not work if another application opens the MacinTalk driver. There are no solutions to this incompatibility.... In other words, if you find MacinTalk interesting and entertaining—go ahead and purchase it. Write some code and enjoy. However, be warned that MacinTalk should not be included as part of any commercial product. Apple Computer, Inc. provides no support for MacinTalk other than what is purchased with the package itself, and there will be no support in the future. Apple is committed to providing the developer community with an array of speech technologies integrated with the Sound Manager... Nothing more will be done [with the original MacinTalk]. It is a compatibility risk... causes the Sound Manager to fail... will not work with the new Sound Manager planned for System 7.0... may not work at all with future versions of the Macintosh hardware. ....#000: About Macintosh Technical Notes.... We place no restrictions on copying Technical Notes, with the exception that you cannot resell them, so read, enjoy, and share. We hope Macintosh Technical Notes will provide you with lots of valuable information while you are developing Macintosh hardware and software.
  3. "Macintalk".
  4. "MacinTalk".
  5. "Accessibility - OS X". Apple. Retrieved 2016-04-27.
  6. "Apple - OS X Lion - Universal Access". Archived from the original on September 24, 2011. Retrieved July 23, 2011.
  7. "Introduction to Speech Synthesis Programming Guide". Developer.apple.com. 2006-09-05. Retrieved 2016-04-27.
  8. "Speech Synthesis in OS X". Developer.apple.com. 2006-09-05. Retrieved 2016-04-27.
  9. "[Solved] Text to speech only reads the document title (View topic) • Apache OpenOffice Community Forum". Forum.openoffice.org. Retrieved 2016-04-27.
  10. "scottmartin/speak-selected-text-sublime: A plugin to use the Mac's text to speech from Sublime Text 2". GitHub.com. Retrieved 2016-04-27.
  11. "Use your voice to enter text on your Mac - Apple Support". Support.apple.com. 2016-04-05. Retrieved 2016-04-27.