MusicBrainz

Type of site: Online music encyclopedia [1]
Available in: English
Owner: MetaBrainz Foundation
Created by: Robert Kaye
URL: musicbrainz.org
Commercial: No
Registration: Optional (required for editing data)
Users: Over 2 million registered accounts
Launched: July 17, 2000 [2]
Current status: Online
Content license: Part Creative Commons Zero (open data) and part CC BY-NC-SA (not open); commercial licensing available
Written in: Perl with PostgreSQL database

MusicBrainz is a MetaBrainz project that aims to create a collaborative music database similar to the freedb project. It was founded in response to the restrictions placed on the Compact Disc Database (CDDB), a database that software applications use to look up audio CD information on the Internet. MusicBrainz has since expanded its goals beyond a storehouse of CD metadata (information about the performers, artists, songwriters, and so on) to become a structured online database for music. [3] [4]


MusicBrainz captures information about artists, their recorded works, and the relationships between them. Entries for recorded works capture, at a minimum, the album title, the track titles, and the length of each track. These entries are maintained by volunteer editors who follow community-written style guidelines. Recorded works can also store the release date and country, the CD ID, cover art, an acoustic fingerprint, free-form annotation text, and other metadata. As of October 2023, MusicBrainz contains information on roughly 2.2 million artists, 3.9 million releases, and 30.4 million recordings. [5] End users can use software that communicates with MusicBrainz to add metadata tags to their digital media files in formats such as ALAC, FLAC, MP3, Ogg Vorbis, or AAC.
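As a rough illustration of the minimum entry described above (album title, track titles, and track lengths, plus a few optional fields), the following Python sketch models a release as plain data classes. The class names, field names, and the choice of milliseconds as the length unit are assumptions made for this example; they are not the actual MusicBrainz schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Track:
    title: str
    length_ms: int  # track length; milliseconds is an assumed unit


@dataclass
class Release:
    # Minimum information named in the text: album title and tracks.
    title: str
    tracks: List[Track] = field(default_factory=list)
    # Optional extras mentioned in the text (hypothetical field names).
    release_date: Optional[str] = None
    country: Optional[str] = None
    annotation: Optional[str] = None

    def total_length_ms(self) -> int:
        """Sum of all track lengths in this release."""
        return sum(track.length_ms for track in self.tracks)


example = Release(
    title="Example Album",
    tracks=[Track("First Song", 180_000), Track("Second Song", 240_500)],
    country="GB",
)
print(example.total_length_ms())
```

A real tagger would fill such a structure from the database and then write the values into the file's tags; that step is omitted here.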

Cover Art Archive


MusicBrainz allows contributors to upload cover art images of releases to the database; these images are hosted by Cover Art Archive (CAA), a joint project between Internet Archive and MusicBrainz started in 2012. Internet Archive provides the bandwidth, storage and legal protection for hosting the images, while MusicBrainz stores metadata and provides public access through the Web and via an API for third parties to use. As with other contributions, the MusicBrainz community is in charge of maintaining and reviewing the data. [6] Until May 16, 2022, [7] cover art was also provided for items on sale at Amazon.com and some other online resources, but CAA is now preferred, because it gives the community more control and flexibility for managing the images. As of October 2023, over 4.6 million images exist in the archive. [8]
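To give a sense of how a third party might use the API mentioned above, here is a minimal Python sketch that builds the address of a release's front cover image. It assumes the publicly documented URL layout `https://coverartarchive.org/release/<MBID>/front` with optional thumbnail suffixes of 250, 500, and 1200 pixels; check the current Cover Art Archive documentation before relying on it. The helper name `front_cover_url` and the identifier used below are placeholders for illustration only.

```python
import uuid
from typing import Optional

# Assumed base address of the Cover Art Archive web service.
CAA_BASE = "https://coverartarchive.org"
# Thumbnail sizes assumed to be offered by the service.
THUMBNAIL_SIZES = (250, 500, 1200)


def front_cover_url(release_mbid: str, size: Optional[int] = None) -> str:
    """Build the URL of a release's front cover (hypothetical helper).

    Raises ValueError if the identifier is not a well-formed UUID or
    if the requested thumbnail size is not one of the assumed sizes.
    """
    mbid = str(uuid.UUID(release_mbid))  # validates and normalises the ID
    if size is None:
        return f"{CAA_BASE}/release/{mbid}/front"
    if size not in THUMBNAIL_SIZES:
        raise ValueError(f"unsupported thumbnail size: {size}")
    return f"{CAA_BASE}/release/{mbid}/front-{size}"


# Placeholder identifier, not a real release.
placeholder_mbid = "12345678-1234-5678-1234-567812345678"
print(front_cover_url(placeholder_mbid))
print(front_cover_url(placeholder_mbid, size=500))
```

The sketch only constructs URLs; fetching the image, following redirects, and handling releases without cover art are left to the caller.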

Fingerprinting


Besides collecting metadata about music, MusicBrainz also allows looking up recordings by their acoustic fingerprint. A separate application, such as MusicBrainz Picard, is used to do this.

Proprietary services

In 2000, MusicBrainz started using Relatable's patented TRM (a recursive acronym for TRM Recognizes Music) for acoustic fingerprint matching. This feature attracted many users and allowed the database to grow quickly. By 2005, however, TRM was showing scalability issues, as the number of tracks in the database had reached the millions. To address this, MusicBrainz partnered with MusicIP (now AmpliFIND) in May 2006 and began replacing TRM with MusicDNS. [9] TRM was fully phased out in favor of MusicDNS in November 2008.

In October 2009, MusicIP was acquired by AmpliFIND. [10] Sometime after the acquisition, the MusicDNS service began having intermittent problems. [citation needed]

AcoustID and Chromaprint

Since the future of the free identification service was uncertain, a replacement was sought. The Chromaprint acoustic fingerprinting algorithm, the basis for the AcoustID identification service, was started in February 2010 by long-time MusicBrainz contributor Lukáš Lalinský. [11] While AcoustID and Chromaprint are not officially MusicBrainz projects, they are closely tied to each other and both are open source. Chromaprint works by analyzing the first two minutes of a track, measuring the strength of each of the 12 pitch classes, and storing these values eight times per second. Additional post-processing is then applied to compress this fingerprint while retaining patterns. [12] The AcoustID search server then searches the fingerprint database by similarity and returns the AcoustID identifier along with MusicBrainz recording identifiers, if known.
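The pitch-class step described above can be illustrated with a small Python sketch. It is a simplified illustration, not the Chromaprint implementation: it assumes the audio has already been turned into (frequency, magnitude) pairs for one analysis frame, maps each frequency to one of 12 pitch classes using the standard equal-temperament relation to A4 = 440 Hz, and sums the magnitudes per class. The function names are hypothetical, and the compression step that follows in the real algorithm is not reproduced.

```python
import math

PITCH_CLASSES = 12       # C, C#, D, ... B
FRAMES_PER_SECOND = 8    # values stored eight times per second (from the text)
ANALYZED_SECONDS = 120   # first two minutes of the track (from the text)


def pitch_class(freq_hz: float) -> int:
    """Map a frequency to a pitch class, with 0 = C and 9 = A."""
    midi_note = 69 + 12 * math.log2(freq_hz / 440.0)
    return round(midi_note) % PITCH_CLASSES


def chroma_frame(spectrum):
    """Sum magnitudes per pitch class for one frame and normalise.

    `spectrum` is an iterable of (frequency_hz, magnitude) pairs; how it
    is obtained (for example with an FFT) is outside this sketch.
    """
    chroma = [0.0] * PITCH_CLASSES
    for freq_hz, magnitude in spectrum:
        if freq_hz > 0:
            chroma[pitch_class(freq_hz)] += magnitude
    total = sum(chroma)
    return [value / total for value in chroma] if total else chroma


# Two A notes (440 Hz and 880 Hz) and one C (about 261.63 Hz).
frame = chroma_frame([(440.0, 1.0), (880.0, 0.5), (261.63, 0.5)])
frames_per_track = ANALYZED_SECONDS * FRAMES_PER_SECOND
print(frame[9], frame[0], frames_per_track)
```

Under these assumptions, a full track yields 960 frames of 12 values each before any compression is applied.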

Licensing

Since 2003, [13] MusicBrainz's core data (artists, recordings, releases, and so on) have been in the public domain, while additional content, including moderation data (essentially all original content contributed by users and anything elaborated from it), is placed under the Creative Commons CC BY-NC-SA-2.0 license. [14] The relational database management system is PostgreSQL. The server software is covered by the GNU General Public License. The MusicBrainz client software library, libmusicbrainz, is licensed under the GNU Lesser General Public License, which allows proprietary software products to use the code.

In December 2004, the MusicBrainz project was turned over to the MetaBrainz Foundation, a non-profit group, by its creator Robert Kaye. [15] On 20 January 2006, Linkara, based in Barcelona, Spain, became the first commercial venture to use MusicBrainz data, in its "Linkara Música" service. [16]

On 28 June 2007, the BBC announced that it had licensed MusicBrainz's live data feed to augment its music web pages. The BBC's online music editors would also join the MusicBrainz community to contribute their knowledge to the database. [17]

On 28 July 2008, the beta of the new BBC Music site was launched, which publishes a page for each MusicBrainz artist. [18] [19]

ListenBrainz


ListenBrainz is a free and open source project that aims to crowdsource listening data from digital music and release it under an open license. [20] It is a MetaBrainz Foundation project tied to MusicBrainz. It aims to re-implement Last.fm features that were lost following that platform's acquisition by CBS. [21] [22]

ListenBrainz takes submissions, in the form of listens, from media players and services such as Music Player Daemon, Spotify, and Rhythmbox. It can also import Last.fm and Libre.fm scrobbles to build a listening history. Because listens are released under an open license, ListenBrainz is useful for music research, both in industry and for development purposes. [23] [24] [25] [26] [27]
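For readers curious what a "listen" looks like in practice, the Python sketch below assembles a single-listen submission body. The endpoint address and JSON field names (`listen_type`, `payload`, `listened_at`, `track_metadata`, `artist_name`, `track_name`, `release_name`) follow the ListenBrainz API documentation as best understood here and should be verified against the current version. The helper `build_listen` is hypothetical, and the sketch deliberately sends nothing over the network.

```python
import json
import time
from typing import Optional

# Assumed submission endpoint; a user token would also be needed to post.
SUBMIT_URL = "https://api.listenbrainz.org/1/submit-listens"


def build_listen(artist: str, track: str,
                 release: Optional[str] = None,
                 listened_at: Optional[int] = None) -> dict:
    """Build the body for one listen (hypothetical helper)."""
    metadata = {"artist_name": artist, "track_name": track}
    if release:
        metadata["release_name"] = release
    # Default to the current Unix time when no timestamp is supplied.
    timestamp = int(time.time()) if listened_at is None else int(listened_at)
    return {
        "listen_type": "single",
        "payload": [{"listened_at": timestamp, "track_metadata": metadata}],
    }


# Placeholder values, not real listening data.
body = json.dumps(
    build_listen("Example Artist", "Example Track", listened_at=1700000000)
)
print(body)
```

A media player plugin would send such a body to the endpoint with the user's token in an authorization header; error handling and batching are left out of this sketch.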

ListenBrainz can also generate recommendations and playlists based on individual listening. [28]

In December 2021, the Year in Music reports feature was added, allowing users to find out and share their top tracks, albums, and artists for the year. [29]


References

  1. "About". MusicBrainz. MetaBrainz. Archived from the original on 2015-05-08. Retrieved 4 May 2015.
  2. "WHOIS Lookup". ICANN. Archived from the original on 2015-04-02. Retrieved 23 March 2015.
  3. Highfield, Ashley. "Keynote speech given at IEA Future Of Broadcasting Conference". BBC Press Office, 2007-06-27. Archived 2008-04-22 at the Wayback Machine. Retrieved on 2008-02-11.
  4. Swartz, A. (2002). "MusicBrainz: A semantic Web service" (PDF). IEEE Intelligent Systems. 17: 76–77. CiteSeerX 10.1.1.380.9338. doi:10.1109/5254.988466. Archived (PDF) from the original on 2015-04-03. Retrieved 2015-08-28.
  5. "Database Statistics". MusicBrainz. Retrieved 2023-10-10.
  6. Fabian Scherschel (10 October 2012). "MusicBrainz and Internet Archive create cover art database". The H. Archived from the original on 7 December 2013.
  7. "MetaBrainz Blog". MetaBrainz Blog. Retrieved 2022-08-04.
  8. "Database Statistics – Cover Art". MusicBrainz. Retrieved 2023-10-10.
  9. "New fingerprinting technology available now!" (Press release). MusicBrainz community blog. 2006-03-12. Archived from the original on 2008-08-07. Retrieved 2006-08-03.
  10. AmpliFIND Music Services: News Archived 2013-09-21 at the Wayback Machine
  11. "Introducing Chromaprint – Lukáš Lalinský". Oxygene.sk. 2010-07-24. Archived from the original on 2018-10-10. Retrieved 2018-04-10.
  12. Jang, Dalwon; Yoo, Chang D; Lee, Sunil; Kim, Sungwoong; Kalker, Ton (2011-01-18). "How does Chromaprint work? – Lukáš Lalinský". IEEE Transactions on Information Forensics and Security. 4 (4): 995–1004. doi:10.1109/TIFS.2009.2034452. S2CID 1502596. Retrieved 2018-04-10.
  13. "MusicBrainz Licenses". Archived from the original on April 13, 2003. Retrieved 2015-10-23.
  14. MusicBrainz License as of 13-11-2010.
  15. Kaye, Robert (2006-03-12). "The MetaBrainz Foundation launches!" (Press release). MusicBrainz community blog. Archived from the original on 2011-05-19. Retrieved 2006-08-03.
  16. Kaye, Robert (2006-01-20). "Introducing: Linkara Musica". MusicBrainz. Archived from the original on 2008-09-07. Retrieved 2006-08-12.
  17. Kaye, Robert (2007-06-28). "The BBC partners with MusicBrainz for Music Metadata". MusicBrainz. Archived from the original on 2007-06-30. Retrieved 2007-07-10.
  18. Shorter, Matthew (2008-07-28). "BBC Music Artist Pages Beta". BBC. Archived from the original on 2009-01-24. Retrieved 2009-02-12.
  19. MusicBrainz and the BBC Archived 2018-02-20 at the Wayback Machine as of 2013-03-16
  20. "ListenBrainz Goals". ListenBrainz. Retrieved 13 February 2021.
  21. O'Brien, Danny (3 June 2021). "Organizing in the Public Interest: MusicBrainz". Electronic Frontier Foundation. Retrieved 9 December 2023.
  22. Vigliensoni, Gabriel; Fujinaga, Ichiro (23 October 2017). "The Music Listening Histories Dataset". Proceedings of the 18th International Society for Music Information Retrieval Conference. Suzhou, China: ISMIR: 96–102. doi:10.5281/zenodo.1417499. Retrieved 17 February 2024.
  23. Singh, Param; Kamlesh, Dutta; Kaye, Robert; Garg, Suyash (2020). "Music Listening History Dataset Curation and Distributed Music Recommendation Engines Using Collaborative Filtering". Proceedings of ICETIT 2019. Lecture Notes in Electrical Engineering. Vol. 605. pp. 623–632. doi:10.1007/978-3-030-30577-2_55. ISBN 978-3-030-30576-5. S2CID 204103568. Retrieved 13 February 2021.
  24. Yadav, Naina; Singh, Anil (December 2020). "Bi-directional Encoder Representation of Transformer model for Sequential Music Recommender System". Forum for Information Retrieval Evaluation. pp. 49–53. doi:10.1145/3441501.3441503. ISBN 9781450389785. S2CID 231628582. Retrieved 13 February 2021.
  25. Schedl, Markus; Knees, Peter; McFee, Brian; Bogdanov, Dmitry (22 November 2021). "Music Recommendation Systems: Techniques, Use Cases, and Challenges". Recommender Systems Handbook. pp. 927–971. doi:10.1007/978-1-0716-2197-4_24. ISBN 978-1-0716-2196-7. Retrieved 9 December 2023.
  26. Pocaro, Lorenzo; Gómez, Emilia; Castillo, Carlos (12 July 2023). "Assessing the Impact of Music Recommendation Diversity on Listeners: A Longitudinal Study". ACM Transactions on Recommender Systems. arXiv:2212.00592. doi:10.1145/3608487. S2CID 254125611. Retrieved 17 February 2024.
  27. Ray, Brian (6 December 2019). "Build a useful ML Model in hours on GCP to Predict The Beatles' listeners". Towards Data Science. Towards Data Science Inc. Retrieved 17 February 2024.
  28. Porter, Alastair (24 December 2020). "Playlists and personalised recommendations in ListenBrainz". MetaBrainz Blog. MetaBrainz Foundation. Retrieved 13 February 2021.
  29. "ListenBrainz presents: Your Year in Music". MetaBrainz Blog. 2021-12-16. Retrieved 2023-12-08.
