Festival Speech Synthesis System

Festival
Developer(s): Centre for Speech Technology Research (CSTR) of the University of Edinburgh
Stable release: 2.5 / December 2017
Written in: C++
Operating system: Cross-platform
Type: Speech synthesizer
License: Similar to MIT License (free software)
Website: www.cstr.ed.ac.uk/projects/festival/

The Festival Speech Synthesis System is a general multi-lingual speech synthesis system originally developed by Alan W. Black, Paul Taylor and Richard Caley [1] at the Centre for Speech Technology Research (CSTR) at the University of Edinburgh. Substantial contributions have also been provided by Carnegie Mellon University and other sites. It is distributed under a free software license similar to the BSD License.

It offers a full text-to-speech system with various APIs, as well as an environment for the development and research of speech synthesis techniques. It is written in C++ with a Scheme-like command interpreter for general customization and extension. [2]
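As a rough illustration of the Scheme-based command interpreter, the following Python sketch builds a `(SayText "...")` expression and pipes it to the `festival` program. It assumes that a `festival` executable is on the PATH, that it accepts commands on standard input in `--pipe` mode, and that a default voice is installed. The helper names (`scheme_string`, `say_text_command`, `speak`) are hypothetical and not part of Festival, and the string escaping shown is an assumption about what the interpreter accepts.

```python
import shutil
import subprocess


def scheme_string(text: str) -> str:
    """Quote text as a Scheme string literal (assumed escaping: backslash and double quote)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def say_text_command(text: str) -> str:
    """Build a Festival Scheme expression that asks the synthesizer to speak the text."""
    return f"(SayText {scheme_string(text)})"


def speak(text: str, festival_bin: str = "festival") -> bool:
    """Send a SayText command to Festival on stdin.

    Returns False without doing anything if the executable is not found.
    """
    if shutil.which(festival_bin) is None:
        return False
    subprocess.run(
        [festival_bin, "--pipe"],      # assumed: pipe mode reads Scheme commands from stdin
        input=say_text_command(text),
        text=True,
        check=True,
    )
    return True


if __name__ == "__main__":
    print(say_text_command("Hello from Festival"))
    speak("Hello from Festival")
```

Keeping the command construction separate from the process call makes it possible to inspect or log the Scheme expression on machines where Festival is not installed.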

Festival is designed to support multiple languages, and comes with support for English (British and American pronunciation), Welsh, and Spanish. Voice packages exist for several other languages, such as Castilian Spanish, Czech, Finnish, Hindi, Italian, Marathi, Polish, Russian and Telugu.

Festvox

The Festvox project aims to make the building of new synthetic voices more systematic and better documented, [3] making it possible for anyone to build a new voice. It is distributed under a free software license similar to the MIT License.

Festvox is a suite of tools by Alan W. Black and Kevin Lenzo for building synthetic voices for Festival. It includes a step-by-step tutorial with examples in a document called "Building Synthetic Voices". [4]

Flite

Flite is a small run-time speech synthesis engine developed at Carnegie Mellon University, derived from both Festival and Festvox. [5]

Linux compatibility

There is a Festival plug-in for GStreamer. Festival is pre-packaged for several Linux distributions.

References

  1. Taylor, Paul. "The Architecture of the Festival Speech Synthesis System" (PDF). Carnegie Mellon University.
  2. "Festival". www.cstr.ed.ac.uk. Retrieved 2019-03-25.
  3. "Festvox: Home". www.festvox.org. Retrieved 2019-03-25.
  4. "Building Synthetic Voices". www.festvox.org. Retrieved 2019-03-25.
  5. "CMU Flite: Speech Synthesizer". www.festvox.org. Retrieved 2019-03-25.