The Human Speechome Project ("speechome" as an approximate rhyme for "genome") is an effort to closely observe and model the language acquisition of a child over the first three years of life.
The project was conducted at the Massachusetts Institute of Technology's Media Laboratory by Associate Professor Deb Roy [1], using an array of technology to comprehensively but unobtrusively observe a single child (Roy's own son [2]). The resulting data was used to create computational models that yield further insight into language acquisition. [3]
Most studies of human speech acquisition in children have been done in laboratory settings and with sampling rates of only a couple of hours per week. The need for studies in the more natural setting of the child's home, and at a much higher sampling rate approaching the child's total experience, led to the development of this project concept. [3]
"Just as the Human Genome Project illuminates the innate genetic code that shapes us, the Speechome project is an important first step toward creating a map of how the environment shapes human development and learning."
Frank Moss, director of the Media Lab [4]
A digital network consisting of eleven video cameras, fourteen microphones, and an array of data capture hardware was installed in the home of the subject. [4] A cluster of ten computers and audio samplers in the basement of the house captured the data. Data from the cluster was moved manually to the MIT campus as necessary for storage in a one-million-gigabyte (one-petabyte) storage facility. [3]
To give the occupants of the house control over the observation system, eight touch-activated displays were wall-mounted throughout the house. These allowed the occupants to stop and start video and/or audio recording, and to permanently erase any number of minutes from the system. Audio recording was turned off throughout the house at night after the child was asleep. [5]
Data was gathered at an average rate of 200 gigabytes per day, necessitating the development of sophisticated data-mining tools to reduce analysis efforts to a manageable level. Transcribing significant speech added a labor-intensive dimension. [3]
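To put the quoted figures in proportion, the following is a rough back-of-the-envelope sketch, not a calculation from the project itself. It uses only the 200 gigabytes per day average, the three-year observation period, and the one-million-gigabyte facility mentioned above. The 365-day year, the assumption that capture ran every day, and the function names are illustrative assumptions.

```python
# Back-of-the-envelope storage estimate using the figures quoted in the text.
GB_PER_DAY = 200          # average capture rate, from the text
CAPACITY_GB = 1_000_000   # one petabyte in decimal gigabytes, from the text
YEARS = 3                 # observation period, from the text
DAYS_PER_YEAR = 365       # assumption: capture every day, leap days ignored


def total_volume_gb(gb_per_day: float, years: float,
                    days_per_year: int = DAYS_PER_YEAR) -> float:
    """Total data volume in gigabytes at a constant average daily rate."""
    return gb_per_day * years * days_per_year


def days_until_full(capacity_gb: float, gb_per_day: float) -> float:
    """Number of days of capture that fit into the given capacity."""
    return capacity_gb / gb_per_day


if __name__ == "__main__":
    total = total_volume_gb(GB_PER_DAY, YEARS)
    print(f"Estimated volume over {YEARS} years: {total:,.0f} GB")
    print(f"Share of the one-petabyte facility: {total / CAPACITY_GB:.1%}")
    print(f"Days of capture the facility could hold: "
          f"{days_until_full(CAPACITY_GB, GB_PER_DAY):,.0f}")
```

Under these assumptions the three-year total comes to roughly 219,000 gigabytes, a little over a fifth of the facility's stated capacity.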
Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, chemistry, physics, computer science, computer programming, information engineering, mathematics and statistics to analyze and interpret biological data. The subsequent process of analyzing and interpreting data is referred to as computational biology.
In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.
MP3 is a coding format for digital audio developed largely by the Fraunhofer Society in Germany under the lead of Karlheinz Brandenburg, with support from other digital scientists in other countries. Originally defined as the third audio format of the MPEG-1 standard, it was retained and further extended—defining additional bit rates and support for more audio channels—as the third audio format of the subsequent MPEG-2 standard. A third version, known as MPEG-2.5—extended to better support lower bit rates—is commonly implemented but is not a recognized standard.
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. The reverse process is speech recognition.
Digital audio is a representation of sound recorded in, or converted into, digital form. In digital audio, the sound wave of the audio signal is typically encoded as numerical samples in a continuous sequence. For example, in CD audio, samples are taken 44,100 times per second, each with 16-bit sample depth. Digital audio is also the name for the entire technology of sound recording and reproduction using audio signals that have been encoded in digital form. Following significant advances in digital audio technology during the 1970s and 1980s, it gradually replaced analog audio technology in many areas of audio engineering, record production and telecommunications in the 1990s and 2000s.
Lawrence Berkeley National Laboratory (LBNL) is a federally funded research and development center in the hills of Berkeley, California, United States. Established in 1931 by the University of California (UC), the laboratory is sponsored by the United States Department of Energy and administered by the UC system. Ernest Lawrence, who won the Nobel prize for inventing the cyclotron, founded the Lab and served as its Director until his death in 1958. Located in the Berkeley Hills, the lab overlooks the campus of the University of California, Berkeley.
A DNA microarray is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome. Each DNA spot contains picomoles of a specific DNA sequence, known as a probe. A probe can be a short section of a gene or other DNA element that is used to hybridize a cDNA or cRNA sample under high-stringency conditions. Probe-target hybridization is usually detected and quantified by detection of fluorophore-, silver-, or chemiluminescence-labeled targets to determine relative abundance of nucleic acid sequences in the target. The original nucleic acid arrays were macro arrays approximately 9 cm × 12 cm and the first computerized image based analysis was published in 1981. The DNA microarray was invented by Patrick O. Brown. An example of its application is in SNP arrays for polymorphisms in cardiovascular diseases, cancer, pathogens and GWAS analysis. It is also used for the identification of structural variations and the measurement of gene expression.
William Daniel Hillis is an American inventor, entrepreneur, and computer scientist, who pioneered parallel computers and their use in artificial intelligence. He founded Thinking Machines Corporation, a parallel supercomputer manufacturer, and subsequently was Vice President of Research and Disney Fellow at Walt Disney Imagineering.
Computer Science and Artificial Intelligence Laboratory (CSAIL) is a research institute at the Massachusetts Institute of Technology (MIT) formed by the 2003 merger of the Laboratory for Computer Science (LCS) and the Artificial Intelligence Laboratory. Housed within the Ray and Maria Stata Center, CSAIL is the largest on-campus laboratory as measured by research scope and membership. It is part of the Schwarzman College of Computing but is also overseen by the MIT Vice President of Research.
MUSIC-N refers to a family of computer music programs and programming languages descended from or influenced by MUSIC, a program written by Max Mathews in 1957 at Bell Labs. MUSIC was the first computer program for generating digital audio waveforms through direct synthesis. It was one of the first programs for making music on a digital computer, and was certainly the first program to gain wide acceptance in the music research community as viable for that task. The world's first computer-controlled music was generated in Australia by programmer Geoff Hill on the CSIRAC computer, which was designed and built by Trevor Pearcey and Maston Beard. However, CSIRAC produced sound by sending raw pulses to the speaker; unlike the MUSIC series of programs, it did not produce standard digital audio with PCM samples.
IBM Research is the research and development division for IBM, an American multinational information technology company headquartered in Armonk, New York, with operations in over 170 countries. IBM Research is the largest industrial research organization in the world and has twelve labs on six continents.
George McDonald Church is an American geneticist, molecular engineer, chemist, serial entrepreneur, and pioneer in personal genomics and synthetic biology. He is the Robert Winthrop Professor of Genetics at Harvard Medical School, Professor of Health Sciences and Technology at Harvard University and Massachusetts Institute of Technology, and a founding member of the Wyss Institute for Biologically Inspired Engineering at Harvard University.
R/V Roger Revelle is a Thomas G. Thompson-class oceanographic research ship operated by Scripps Institution of Oceanography under charter agreement with Office of Naval Research as part of the University-National Oceanographic Laboratory System (UNOLS) fleet. The ship is named after Roger Randall Dougan Revelle, who was essential to the incorporation of Scripps into the University of California San Diego.
The Texas Advanced Computing Center (TACC) at the University of Texas at Austin, United States, is an advanced computing research center that provides comprehensive advanced computing resources and support services to researchers in Texas and across the U.S. The mission of TACC is to enable discoveries that advance science and society through the application of advanced computing technologies. Specializing in high performance computing, scientific visualization, data analysis & storage systems, software, research & development and portal interfaces, TACC deploys and operates advanced computational infrastructure to enable the research activities of faculty, staff, and students of UT Austin. TACC also provides consulting, technical documentation, and training to support researchers who use these resources. TACC staff members conduct research and development in applications and algorithms, computing systems design/architecture, and programming tools and environments.
Illumina, Inc. is an American biotechnology company, headquartered in San Diego, California, and it serves more than 155 countries. Incorporated on April 1, 1998, Illumina develops, manufactures, and markets integrated systems for the analysis of genetic variation and biological function. The company provides a line of products and services that serves the sequencing, genotyping and gene expression, and proteomics markets.
The Max Planck Institute for Psycholinguistics is a research institute situated on the campus of Radboud University Nijmegen located in Nijmegen, Gelderland, the Netherlands. The institute was founded in 1980 by Pim Levelt, and is unique for being entirely dedicated to psycholinguistics, and is also one of the few institutes of the Max Planck Society to be located outside Germany. The Nijmegen-based institute currently occupies 2nd position in the Ranking Web of World Research Centers among all Max Planck institutes. It currently employs about 235 people.
CTD stands for conductivity, temperature, and depth. A CTD instrument is an oceanography sonde used to measure the electrical conductivity, temperature, and pressure of seawater. The pressure is closely related to depth. Conductivity is used to determine salinity.
Bluefin Labs was a Cambridge, MA-based social TV analytics company that used publicly available social media commentary from Twitter, Facebook and blogs to measure viewer engagement with television shows and ads at scale – historically a costly and complex problem for TV and marketing industries to solve.
The Earth Microbiome Project (EMP) is an initiative founded by Janet Jansson, Jack Gilbert and Rob Knight in 2010 to collect natural samples and to analyze the microbial community around the globe.
Deb Roy is a Canadian scientist, tenured professor at Massachusetts Institute of Technology (MIT), and the director of the MIT Center for Constructive Communication.