Musipedia

Type of site: Web search engine
URL: musipedia.org
Commercial: No

Musipedia is a search engine for identifying pieces of music. This can be done by whistling a theme, playing it on a virtual piano keyboard, [1] tapping the rhythm on the computer keyboard, or entering the Parsons code. Anyone can modify the collection of melodies and enter MIDI files, bitmaps with sheet music (possibly generated by the Musipedia server after entering LilyPond or abc source code), lyrics or other text about the piece, or melodic contours as Parsons code. [2] Certain features on the site may no longer work because they rely on Adobe Flash, which was discontinued at the end of 2020.

Search principles

Musipedia offers three ways of searching: based on the melodic contour, based on pitches and onset times, or based on the rhythm alone. For the first two, users can draw notes, play them on a keyboard, or type an ASCII version of a melody.

Melody

The melodic contour search uses an edit distance. Because of this, the search engine finds not only entries whose contour exactly matches the query, but also the most similar ones among the contours that are not identical. Similarity is measured by counting the editing steps (inserting, deleting, or replacing a character) needed to convert the query contour into that of the search result. Since only the melodic contour is relevant, one can find melodies even if the key, rhythm, or exact intervals are unknown.
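
A minimal sketch of this idea in Python (the function and the example contours are illustrative, not Musipedia's actual code):

    def edit_distance(a: str, b: str) -> int:
        """Levenshtein distance: the minimum number of insertions,
        deletions, and substitutions needed to turn a into b."""
        prev = list(range(len(b) + 1))  # distances for the empty prefix of a
        for i, ca in enumerate(a, start=1):
            curr = [i]
            for j, cb in enumerate(b, start=1):
                cost = 0 if ca == cb else 1
                curr.append(min(prev[j] + 1,          # delete ca
                                curr[j - 1] + 1,      # insert cb
                                prev[j - 1] + cost))  # replace ca with cb
            prev = curr
        return prev[-1]

    # Rank stored contours by similarity to the query contour.
    query = "*UDUDRUD"  # illustrative Parsons code query
    stored = ["*UDUDRUD", "*UDUDRUU", "*DDUURRD"]
    ranked = sorted(stored, key=lambda c: edit_distance(query, c))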

Pitch and rhythm

The pitch and onset time-based search takes more properties of the melody into account. This search method, which is used by default, is still transposition-invariant and tempo-invariant, but it considers both rhythm and intervals. The melody can be entered in various ways, for example by clicking on a virtual keyboard on the screen. The search engine then segments the query, converts each segment into a set of points in the two-dimensional space of onset time and pitch, and, using the Earth Mover's Distance, compares each point set to pre-computed point sets representing segments of melodies from the database. As with the contour search, small alterations of the query lead to correspondingly small changes in the results, which makes the search method somewhat error-tolerant.
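
The full Earth Mover's Distance requires solving a transportation problem. The following rough sketch of the comparison step simplifies to equal-size, equal-weight point sets, for which the EMD reduces to a minimum-cost matching that SciPy's assignment solver handles exactly; the normalization choices are illustrative, not Musipedia's actual ones:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def match_cost(query_pts, segment_pts):
        """Compare two melody segments given as (onset_time, pitch) points."""
        q = np.array(query_pts, dtype=float)
        s = np.array(segment_pts, dtype=float)
        # Transposition invariance: shift each set to mean pitch zero.
        q[:, 1] -= q[:, 1].mean()
        s[:, 1] -= s[:, 1].mean()
        # Tempo invariance (crudely): rescale onset times to span [0, 1].
        for pts in (q, s):
            pts[:, 0] -= pts[:, 0].min()
            span = pts[:, 0].max()
            if span > 0:
                pts[:, 0] /= span
        # Pairwise Euclidean distances between query and segment points.
        cost = np.linalg.norm(q[:, None, :] - s[None, :, :], axis=2)
        rows, cols = linear_sum_assignment(cost)  # optimal point matching
        return cost[rows, cols].sum()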

Tapping a rhythm

The "query by tapping" method that only takes the rhythm into account uses the same algorithm as the pitch and onset time method, but assumes all pitches to be the same. As a result, the algorithm can be used for tapped queries that only contain onset times.

This search method is also well suited to users with limited accessibility options or with very small input devices.

Indexing

Both search algorithms are made faster with indices based on vantage objects. Instead of calculating the distance between the query and every single database entry, Musipedia first calculates only the distance between the query and each of a few vantage objects. For every vantage object, the distance to each database entry is precomputed. Since the triangle inequality holds both for the edit distance on contours and for the variant of the Earth Mover's Distance used by Musipedia, the search algorithm needs, in a second step, to take a closer look at only those database entries whose distances to the vantage objects are similar to the distances between the query and the vantage objects.
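
A minimal sketch of the filtering step, assuming a metric dist and a fixed search radius (all names are illustrative):

    def filter_candidates(query, vantage_objs, entries, entry_dists, dist, radius):
        """Keep only entries that could lie within `radius` of the query.

        entry_dists[i][k] holds dist(entries[i], vantage_objs[k]) and is
        precomputed when the index is built.  By the triangle inequality,
        dist(query, x) >= |dist(query, v) - dist(x, v)| for every vantage
        object v, so an entry violating the bound for any v can be skipped
        without computing dist(query, x) at all.
        """
        q_dists = [dist(query, v) for v in vantage_objs]
        return [x for x, x_dists in zip(entries, entry_dists)
                if all(abs(qd - xd) <= radius
                       for qd, xd in zip(q_dists, x_dists))]

The surviving candidates are then compared to the query with the actual distance function in the second step.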

Comparisons with other audio search tools

Musipedia's search engine works differently from that of search engines such as Shazam. The latter can identify short snippets of audio (a few seconds taken from a recording), even when transmitted over a phone connection. Shazam uses audio fingerprinting for this, a technique that makes it possible to identify specific recordings. Musipedia, on the other hand, can identify pieces of music that contain a given melody; Shazam finds exactly the recording that contains a given snippet, but no other recordings of the same piece.

Musipedia is included in some library research guides on finding music, alongside papers and other online resources. [3]

History

Musipedia was started by Rainer Typke in 1997 and has been developed by him since then. Before the Wikipedia-like collaboration features were added (editing and deleting existing entries have been possible only since 2004), he called the music search engine "Melodyhound". [4]

Since 2006, the Musipedia search engine can also be used for searching the World Wide Web for MIDI files. Musipedia locates the MIDI files that go into its search index with the Alexa Web Search service, which Alexa has offered since December 2005. [5] [6] [7]

Related Research Articles

Percussion instrument – Type of musical instrument that produces a sound by being hit

A percussion instrument is a musical instrument that is sounded by being struck or scraped by a beater (including attached or enclosed beaters or rattles), struck, scraped or rubbed by hand, or struck against another similar instrument. Excluding zoomusicological instruments and the human voice, the percussion family is believed to include the oldest musical instruments. Although "percussion" is a very common term for these instruments and their players, the percussionists, it is not a systematic classificatory category in the scientific field of organology: percussion instruments may belong to the organological classes of idiophone, membranophone, aerophone and chordophone.

Music information retrieval (MIR) is the interdisciplinary science of retrieving information from music. Those involved in MIR may have a background in academic musicology, psychoacoustics, psychology, signal processing, informatics, machine learning, optical music recognition, computational intelligence or some combination of these.

Ear training or aural skills is a music theory study in which musicians learn to identify pitches, intervals, melody, chords, rhythms, solfeges, and other basic elements of music, solely by hearing. The application of this skill is analogous to taking dictation in written/spoken language. As a process, ear training is in essence the inverse of sight-reading, the latter being analogous to reading a written text aloud without prior opportunity to review the material. Ear training is typically a component of formal musical training and is a fundamental, essential skill required in music schools.

Content-based image retrieval – Method of image retrieval

Content-based image retrieval, also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. Content-based image retrieval is opposed to traditional concept-based approaches.

R-tree – Data structure used in spatial indexing

R-trees are tree data structures used for spatial access methods, i.e., for indexing multi-dimensional information such as geographical coordinates, rectangles or polygons. The R-tree was proposed by Antonin Guttman in 1984 and has found significant use in both theoretical and applied contexts. A common real-world usage for an R-tree might be to store spatial objects such as restaurant locations or the polygons that typical maps are made of: streets, buildings, outlines of lakes, coastlines, etc. and then find answers quickly to queries such as "Find all museums within 2 km of my current location", "retrieve all road segments within 2 km of my location" or "find the nearest gas station". The R-tree can also accelerate nearest neighbor search for various distance metrics, including great-circle distance.

In text retrieval, full-text search refers to techniques for searching a single computer-stored document or a collection in a full-text database. Full-text search is distinguished from searches based on metadata or on parts of the original texts represented in databases.

The Parsons code, formally named the Parsons code for melodic contours, is a simple notation used to identify a piece of music through melodic motion – movements of the pitch up and down. Denys Parsons developed this system for his 1975 book The Directory of Tunes and Musical Themes. Representing a melody in this manner makes it easier to index or search for pieces, particularly when the exact notes are unknown. Parsons covered around 15,000 classical, popular and folk pieces in his dictionary. In the process he found that *UU is the most common opening contour, used in 23% of the themes across all genres.
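
Deriving the code from a note sequence is mechanical; a small illustrative encoder in Python (pitches given as MIDI note numbers):

    def parsons_code(pitches):
        """'*' for the first note, then 'U' (up), 'D' (down),
        or 'R' (repeat) for each following note."""
        code = ["*"]
        for prev, curr in zip(pitches, pitches[1:]):
            code.append("U" if curr > prev else "D" if curr < prev else "R")
        return "".join(code)

    # C C G G A A G: the opening of "Twinkle, Twinkle, Little Star".
    print(parsons_code([60, 60, 67, 67, 69, 69, 67]))  # *RURURD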

Transcription (music)

In music, transcription is the practice of notating a piece or a sound that was previously unnotated, for example a jazz improvisation or a video game soundtrack. When a musician is tasked with creating sheet music from a recording and writes down the notes that make up the piece in music notation, it is said that they created a musical transcription of that recording. Transcription may also mean rewriting a piece of music, either solo or ensemble, for an instrument or instruments other than those for which it was originally intended. The Beethoven symphonies transcribed for solo piano by Franz Liszt are an example. Transcription in this sense is sometimes called arrangement, although strictly speaking transcriptions are faithful adaptations, whereas arrangements change significant aspects of the original piece.

Imitation (music)

In music, imitation is the repetition of a melody in a polyphonic texture shortly after its first appearance in a different voice. The melody may vary through transposition, inversion, or otherwise, but retain its original character. The intervals and rhythms of an imitation may be exact or modified; imitation occurs at varying distances relative to the first occurrence, and phrases may begin with voices in imitation before they freely go their own ways.

A vantage-point tree is a metric tree that segregates data in a metric space by choosing a position in the space and partitioning the data points into two parts: those points that are nearer to the vantage point than a threshold, and those points that are not. By recursively applying this procedure to partition the data into smaller and smaller sets, a tree data structure is created where neighbors in the tree are likely to be neighbors in the space.
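
A compact sketch of the construction in Python, assuming a metric dist (illustrative, not tuned for real use):

    def build_vp_tree(points, dist):
        """Recursively partition points around vantage points."""
        if not points:
            return None
        vp, rest = points[0], points[1:]
        if not rest:
            return {"vp": vp, "mu": 0.0, "near": None, "far": None}
        dists = sorted(dist(vp, p) for p in rest)
        mu = dists[len(dists) // 2]  # median distance = split threshold
        return {"vp": vp, "mu": mu,
                "near": build_vp_tree([p for p in rest if dist(vp, p) < mu], dist),
                "far":  build_vp_tree([p for p in rest if dist(vp, p) >= mu], dist)}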

A metric tree is any tree data structure specialized to index data in metric spaces. Metric trees exploit properties of metric spaces such as the triangle inequality to make accesses to the data more efficient. Examples include the M-tree, vp-trees, cover trees, MVP trees, and BK-trees.

Multimedia search enables information search using queries in multiple data types, including text and other multimedia formats. It can be implemented through multimodal search interfaces, i.e., interfaces that allow users to submit search queries not only as textual requests but also through other media. Two methodologies can be distinguished in multimedia search: search based on metadata and search based on the actual media content.

Computer audition (CA) or machine listening is the general field of study of algorithms and systems for audio interpretation by machines. Since the notion of what it means for a machine to "hear" is very broad and somewhat vague, computer audition attempts to bring together several disciplines that originally dealt with specific problems or had a concrete application in mind. The engineer Paris Smaragdis, interviewed in Technology Review, talks about these systems — "software that uses sound to locate people moving through rooms, monitor machinery for impending breakdowns, or activate traffic cameras to record accidents."

In linguistics, speech synthesis, and music, the pitch contour of a sound is a function or curve that tracks the perceived pitch of the sound over time. Pitch contour may include multiple sounds utilizing many pitches, and can relate the frequency function at one point in time to the frequency function at a later point.

An audio search engine is a web-based search engine that crawls the web for audio content. The results can consist of web pages, images, audio files, or other types of documents. Various techniques exist for searching this content.

Pop music automation is a field of study among musicians and computer scientists with a goal of producing successful pop music algorithmically. It is often based on the premise that pop music is especially formulaic, unchanging, and easy to compose. The idea of automating pop music composition is related to many ideas in algorithmic music, Artificial Intelligence (AI) and computational creativity.

Tunebot is a music search engine developed by the Interactive Audio Lab at Northwestern University. Users can search the database by humming or singing a melody into a microphone, playing the melody on a virtual keyboard, or typing some of the lyrics, allowing them to identify a song they cannot name.

In computer science, an M-tree is a tree data structure similar to R-trees and B-trees. It is constructed using a metric and relies on the triangle inequality for efficient range and k-nearest-neighbor (k-NN) queries. While M-trees can perform well in many conditions, the tree can also have large overlap, and there is no clear strategy for how best to avoid it. In addition, they can only be used with distance functions that satisfy the triangle inequality, which many advanced dissimilarity functions used in information retrieval do not.

Search by sound is the retrieval of information based on audio input. A handful of applications, particularly for mobile devices, use search by sound. Shazam, SoundHound, Axwave, ACRCloud and others have seen considerable success by matching an acoustic fingerprint to a song in a library. These applications take a sample clip of a song, or a user-generated melody, and check a music library or database to see where the clip matches a song. From there, song information is queried and displayed to the user.

A Dictionary of Musical Themes is a music reference book by Sam Morgenstern and Harold Barlow.

References

  1. "Musipedia: Find Song Without Knowing Its Name Or Words". MUO. 2009-10-01. Retrieved 2021-07-04.
  2. "Site Reviews: Musipedia | Education World". www.educationworld.com. 2009. Retrieved 2021-07-04.
  3. Hughes, Kathleen. "Research Guides: Music Theory & Composition: Music Websites". montclair.libguides.com. Retrieved 2021-07-04.
  4. Logan, Robert K. (2010). Understanding New Media: Extending Marshall McLuhan. Peter Lang. ISBN 978-1-4331-1126-6.
  5. "Amazon puts the web up for rent". 2005-12-15. Retrieved 2021-07-04.
  6. "Planet - Kroonjuwelen Amazon voor iedereen te huur". 2007-03-11. Archived from the original on 2007-03-11. Retrieved 2021-07-04.
  7. Mangalindan, Mylene (2005-12-13). "Amazon Revs Its Search Engine". Wall Street Journal. ISSN 0099-9660. Retrieved 2021-07-04.