CSLU Toolkit

Last updated March 26, 2024

The CSLU Toolkit is a software library comprising a comprehensive suite of tools that enable exploration, learning, and research into speech and human-computer interaction. It is developed by the Center for Spoken Language Understanding at the OGI School of Science and Engineering, a school of the Oregon Health & Science University.

The tools include:

Audio
Display
Speech recognition
Speech generation
Animated faces

External links

toolkit : about, CSLU Toolkit

This computer-library-related article is a stub. You can help Wikipedia by expanding it.

Related Research Articles

wxWidgets is a widget toolkit and tools library for creating graphical user interfaces (GUIs) for cross-platform applications. wxWidgets enables a program's GUI code to compile and run on several computer platforms with minimal or no code changes. A wide choice of compilers and other tools to use with wxWidgets facilitates development of sophisticated applications. wxWidgets supports a comprehensive range of popular operating systems and graphical libraries, both proprietary and free, and is widely deployed in prominent organizations.

The following outline is provided as an overview of and topical guide to software engineering:

The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English written in the Python programming language. It supports classification, tokenization, stemming, tagging, parsing, and semantic reasoning functionalities. It was developed by Steven Bird and Edward Loper in the Department of Computer and Information Science at the University of Pennsylvania. NLTK includes graphical demonstrations and sample data. It is accompanied by a book that explains the underlying concepts behind the language processing tasks supported by the toolkit, plus a cookbook.

<span class="mw-page-title-main">Centre for Development of Advanced Computing</span> Autonomous scientific society

The Centre for Development of Advanced Computing (C-DAC) is an Indian autonomous scientific society, operating under the Ministry of Electronics and Information Technology.

Julius is a speech recognition engine, specifically a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder software for speech-related researchers and developers. It can perform almost real-time computing (RTC) decoding on most current personal computers (PCs) in 60k word dictation task using word trigram (3-gram) and context-dependent Hidden Markov model (HMM). Major search methods are fully incorporated.

As of the early 2000s, several speech recognition (SR) software packages exist for Linux. Some of them are free and open-source software and others are proprietary software. Speech recognition usually refers to software that attempts to distinguish thousands of words in a human language. Voice control may refer to software used for communicating operational commands to a computer.

The CMU Pronouncing Dictionary is an open-source pronouncing dictionary originally created by the Speech Group at Carnegie Mellon University (CMU) for use in speech recognition research.

HTK is a proprietary software toolkit for handling HMMs. It is mainly intended for speech recognition, but has been used in many other pattern recognition applications that employ HMMs, including speech synthesis, character recognition and DNA sequencing.

Speaker diarisation is the process of partitioning an audio stream containing human speech into homogeneous segments according to the identity of each speaker. It can enhance the readability of an automatic speech transcription by structuring the audio stream into speaker turns and, when used together with speaker recognition systems, by providing the speaker’s true identity. It is used to answer the question "who spoke when?" Speaker diarisation is a combination of speaker segmentation and speaker clustering. The first aims at finding speaker change points in an audio stream. The second aims at grouping together speech segments on the basis of speaker characteristics.

Database preservation usually involves converting the information stored in a database to a form likely to be accessible in the long term as technology changes, without losing the initial characteristics of the data.

RWTH ASR is a proprietary speech recognition toolkit.

Voice Elements is a Microsoft Cloud Service and Calling Plan for Microsoft Teams. Voice Elements were released by Inventive Labs Corporation in 2008, based on their original CTI32 toolkit. Software developers who used C#, VB.NET or Delphi use Voice Elements to write telephony-based applications, such as Interactive Voice Response systems, Voice dialers, Auto Attendants, Call centers and more.

Apache cTAKES: clinical Text Analysis and Knowledge Extraction System is an open-source Natural Language Processing (NLP) system that extracts clinical information from electronic health record unstructured text. It processes clinical notes, identifying types of clinical named entities — drugs, diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named entity has attributes for the text span, the ontology mapping code, context, and negated/not negated.

Kaldi is an open-source speech recognition toolkit written in C++ for speech recognition and signal processing, freely available under the Apache License v2.0.

C3D Toolkit is a proprietary cross-platform geometric modeling kit software developed by Russian by C3D Labs. It's written in C++. It can be licensed by other companies for use in their 3D computer graphics software products. The most widely known software in which C3D Toolkit is typically used are computer aided design (CAD), computer-aided manufacturing (CAM), and computer-aided engineering (CAE) systems.

<span class="mw-page-title-main">Steve Young (software engineer)</span> British researcher (born 1951)

Stephen John Young is a British researcher, Professor of Information Engineering at the University of Cambridge and an entrepreneur. He is one of the pioneers of automated speech recognition and statistical spoken dialogue systems. He served as the Senior Pro-Vice-Chancellor of the University of Cambridge from 2009 to 2015, responsible for planning and resources. From 2015 to 2019, he held a joint appointment between his professorship at Cambridge and Apple, where he was a senior member of the Siri development team.

openSMILE is source-available software for automatic extraction of features from audio signals and for classification of speech and music signals. "SMILE" stands for "Speech & Music Interpretation by Large-space Extraction". The software is mainly applied in the area of automatic emotion recognition and is widely used in the affective computing research community. The openSMILE project exists since 2008 and is maintained by the German company audEERING GmbH since 2013. openSMILE is provided free of charge for research purposes and personal use under a source-available license. For commercial use of the tool, the company audEERING offers custom license options.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

CSLU Toolkit

See also

External links

Related Research Articles