Announcing the AcousticBrainz project

MetaBrainz and the Music Technology Group at Universitat Pompeu Fabra are pleased to announce the first public release of the AcousticBrainz project.

http://acousticbrainz.org/

What is AcousticBrainz?
The AcousticBrainz project aims to crowdsource acoustic information for all of the music in the world and make it available to the public. Its goal is to provide music technology researchers and open source hackers with a massive database of information about music.

AcousticBrainz uses a state-of-the-art research project called Essentia (http://essentia.upf.edu/), developed over the last 10 years at the Music Technology Group.

Data generated from processing audio files with Essentia is collected by the AcousticBrainz project and made available to the public under the CC0 license (public domain). In the 6 weeks since its inception, AcousticBrainz contributors have already submitted data for 650,000 audio tracks using pre-release software.

Today we are releasing client programs to submit data to the AcousticBrainz server and our first public release containing audio features for over 650,000 audio files.

What data does it have?
AcousticBrainz contains information called audio features. These features describe the acoustic characteristics of music and include low-level descriptors such as spectral information and tempo, as well as high-level descriptors for genres, moods, keys, scales and much more. These features are explained in more detail at http://acousticbrainz.org/sample-data

How can I get it?
You can access AcousticBrainz data via our API. See details at http://acousticbrainz.org/api
We also provide downloadable dumps of the whole dataset. You can download them (all 13 gigabytes!) at http://acousticbrainz.org/download
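
As a rough idea of what working with the API could look like, here is a minimal Python sketch. The URL layout (`/<mbid>/low-level`) and the field names (`rhythm.bpm`, `tonal.key_key`, `tonal.key_scale`) are assumptions made for illustration and are not taken from this announcement; please check http://acousticbrainz.org/api for the real details before relying on them.

```python
import json
from urllib.request import urlopen

# Assumption: base URL and path layout are illustrative only;
# see http://acousticbrainz.org/api for the actual endpoints.
API_ROOT = "http://acousticbrainz.org"


def low_level_url(mbid, root=API_ROOT):
    """Build the (assumed) URL of the low-level document for a recording MBID."""
    return f"{root}/{mbid}/low-level"


def fetch_low_level(mbid, root=API_ROOT, timeout=10):
    """Download and decode the JSON document for one recording."""
    with urlopen(low_level_url(mbid, root), timeout=timeout) as response:
        return json.load(response)


def summarize(doc):
    """Pick a few descriptors out of a document.

    The field names are assumptions; missing fields come back as None.
    """
    rhythm = doc.get("rhythm", {})
    tonal = doc.get("tonal", {})
    return {
        "bpm": rhythm.get("bpm"),
        "key": tonal.get("key_key"),
        "scale": tonal.get("key_scale"),
    }
```

With a recording MBID in hand, `summarize(fetch_low_level(mbid))` would return a small dictionary of tempo, key and scale, provided the assumptions above hold.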

What can I do with it?
We hope that this database will spur the development of new music technology research and allow music hackers to create new and interesting recommendation and music discovery engines. Here are some ideas of things we would like to see:

  • Music discovery
  • Playlist generation
  • Improving the state of the art in genre recognition
  • Analytics on the musical structure of popular music
  • and more!

This is one of the largest datasets of this kind available for research, and the only one of this size that we know of which contains both freely available data and the reference source code used to compute the data.

How can I contribute?
If you are a music researcher, you can help us by contributing to the Essentia project. Go to the Essentia homepage to see how you can do this. If you do something cool with the data, let us know. We’d like to start a “made with AcousticBrainz” page where we will showcase interesting projects.

If you have any audio files, we would love for you to contribute audio features to our project. You can do this by downloading our submission clients from http://acousticbrainz.org/download. We provide clients for Windows, Mac, and Linux.

If you find any bugs or errors in the AcousticBrainz stack please let us know! Report issues to http://tickets.musicbrainz.org/browse/AB.

We can’t wait to see what kind of things you will make with our data.

The AcousticBrainz team.

Schema change update, 2014-11-17

We’re back with the schema change release, as promised! We only have a small collection of tickets, but several big things:

  • Pre-gap tracks and data tracks for CDs (where neither contributes to the discid, and pre-gap tracks have a position of 0)
  • Collections can now be marked with types such as “owned” and “wishlist”, plus some special new types mentioned below.
  • CDStub data is now replicated.
  • All entities (except URLs) should now support tagging, as areas, instruments, and series were made taggable.
  • Events! And, additionally, event collections. All (non-deleted) users should have had an “Attending” and a “Maybe Attending” collection created, with the corresponding collection types.

Upgrade instructions will come in another blog post, though they should be substantially unchanged from past releases. Specifically, we’d like to confirm everything’s working correctly with a specific git commit, and make a new tag, before we post a recommendation, since there have already been some problems discovered. Some slacker must not have tested this carefully enough (author whistles in an innocent-sounding fashion).

The git tag for this release (sans small fixes that have happened since release earlier today and any others that may need fixing) is v-2014-11-17-schema-change.

Bug

  • [MBS-7638] – CreateIndexes for instruments wrongly looks at label tables

Improvement

  • [MBS-967] – Support for hidden pre-gap tracks
  • [MBS-1059] – Types of list/collection
  • [MBS-7551] – Add folksonomy tag support to areas, instruments, and series
  • [MBS-7784] – Support for data tracks in tracklists

New Feature

Task

  • [MBS-7883] – Make sure delete_unused_url doesn’t remove URLs used in edits

Announcing libmusicbrainz release 5.1.0

I have released a new version of libmusicbrainz. The main change in this release is the removal of the ‘non-free’ XML parsing code, which has been replaced with libxml2.

N.B. Due to the ABI change, the soname of this library has been bumped. Existing applications will need to be recompiled against the new version.

The following are the main changes in this release:

  • Fix LMB-33 – Handle ‘ended’ element in ‘relation’
  • Fix LMB-34 – Remove non-free XML parser and replace with libxml2
  • Add support for cross-compilation and building out of tree

The release is available here:

libmusicbrainz-5.1.0.tar.gz (MD5 checksum: 4cc5556aa40ff7ab8f8cb83965535bc3)
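
If you would like to check your download against the checksum above, something along these lines should do. This is only a sketch in Python using the standard `hashlib` module; the function names are made up for the example, and it assumes the tarball sits in the current directory under the name given above.

```python
import hashlib

# Checksum as published in this announcement.
EXPECTED_MD5 = "4cc5556aa40ff7ab8f8cb83965535bc3"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download("libmusicbrainz-5.1.0.tar.gz")` returns `True` when the file matches the published checksum. Keep in mind that MD5 only guards against a corrupted download, not against deliberate tampering.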

Documentation for the new version is available under

http://metabrainz.github.com/libmusicbrainz/

Downtime for fall schema change

Our next schema change version will be released on Monday, 17 November, 2014 around Noon PST/3pm EST/20:00 GMT/21:00 CET. We expect that MusicBrainz will be unavailable for 30 – 60 minutes during this time. We will put up the downtime notification on the site and tweet from @musicbrainz right before the release.

Sadly, our backup database server suffered a hardware failure and we ran out of time to get a replicated database set up after the hardware was fixed. This means that we won’t be able to put the site into read-only mode and will have to take a full downtime.

It sucks and we’re not happy about it either, but there is only so much we can accomplish with our limited resources. :(

Sorry for any troubles this may cause you.

Style update, 2014-11-03

As mentioned when the new style process was announced, at a similar time to every server release post we’ll be publishing a list of what’s changed in style during that period.

The first period of two weeks (or three, in this case) under the process has passed, and these are all the style-related issues that have been accepted and implemented during it. Most of them are very small (chiefly adding sites to the whitelist for the Other Databases relationship), although a couple are new relationships or relationships being extended to more entities.

No changes to the guidelines themselves have happened during this period.

  • [STYLE-211] – Allow new allmusic.com release links
  • [STYLE-250] – Add finnmusic.net to the other databases whitelist
  • [STYLE-251] – Add pomus.net to whitelist
  • [STYLE-256] – Add fono.fi to the other databases whitelist
  • [STYLE-269] – Add mixing and mastering to area relationships
  • [STYLE-307] – グラスレ(grass thread/yunisan) as Artist and Label “Other DB” relationship
  • [STYLE-308] – ジャパメタ(japameta) as Artist “Other DB” relationship
  • [STYLE-312] – Add “Deutsche Nationalbibliothek” to whitelist for other databases
  • [STYLE-328] – Add leader/concertmaster artist-release/recording relationships
  • [STYLE-337] – Add IMSLP relationship to artists
  • [STYLE-338] – Add Stage48 Wiki to the other databases whitelist
  • [STYLE-339] – Add CiNii to the other databases whitelist
  • [STYLE-340] – Add NDL to the other databases whitelist

Server update, 2014-11-03

This release was pushed back a week due to scheduling around the GSoC summit and upcoming schema change release, but here it finally is. Editors can take note that more edit types are now auto-edits: adding recording-work relationships, adding/editing aliases, setting track durations, and editing cover artwork. We’ve also added a “Make all edits votable” checkbox to allow edits that are normally always applied automatically (like capitalization changes) to be left open for voting if there’s any dispute or uncertainty. Auto-editors should be aware that this checkbox replaces the one that previously toggled their auto-editor privileges (so it should be left unchecked wherever it was previously left checked).

As part of this release, we’ve deployed some changes that should hopefully prevent slow /ws/2 searches from tying up too many Perl processes on our frontends. This may reduce the number of 502s we’ve been seeing lately, but it’s a bit early to say for sure. Thanks go to kepstin for suggesting the nginx trickery used here.
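
For the curious, the general idea behind this kind of hand-off can be sketched as follows. This is not our actual code (the server is written in Perl); it is a small Python WSGI illustration of the X-Accel-Redirect technique mentioned in MBS-7902. The `/internal/search` location and the rule "a /ws/2 request with a `query` parameter is a search" are assumptions made up for the example.

```python
from urllib.parse import parse_qs, quote

# Hypothetical nginx location, marked `internal`, that would proxy
# requests on to the search server.
INTERNAL_SEARCH_PREFIX = "/internal/search"


def app(environ, start_response):
    """Hand search requests back to nginx instead of waiting on them here."""
    path = environ.get("PATH_INFO", "")
    query = environ.get("QUERY_STRING", "")

    # Assumption: a /ws/2 request carrying a `query` parameter is a search.
    if path.startswith("/ws/2/") and "query" in parse_qs(query):
        target = INTERNAL_SEARCH_PREFIX + quote(path) + "?" + query
        # nginx sees this header and serves `target` itself, so this
        # worker is free again right away, however slow the search is.
        start_response("200 OK", [
            ("X-Accel-Redirect", target),
            ("Content-Length", "0"),
        ])
        return [b""]

    # Everything else is answered by the application as usual.
    body = b"handled by the application\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The point of the design is that the application process only decides where the request should go; the slow part of waiting for the search backend happens inside nginx, which copes with many waiting connections far more cheaply than an application worker does.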

More thanks go to chirlu, nikki, and ianmcorvidae for their hard work on today’s release.

The git tag is v-2014-11-03 and the full changelog is below.

Bug

  • [MBS-1444] – Make ‘i’ <-> ‘ı’ an auto-edit.
  • [MBS-5961] – BC dates in search results are parsed incorrectly
  • [MBS-6545] – Medium added without tracknumbers magically gains them
  • [MBS-7399] – Can’t link to TheSession.org’s artist pages
  • [MBS-7927] – Text strings on fingerprints tab are not correctly escaped
  • [MBS-7935] – Can’t add URL examples to relationship documentation
  • [MBS-7938] – Track parser resets track artists to the release artist when “lines contain track artists” is unchecked
  • [MBS-7960] – The link to the mailing lists doc page in the menu doesn’t work
  • [MBS-7961] – Wikidocs redirects are broken
  • [MBS-7962] – Links on /doc/ pages are no longer converted

Improvement

  • [MBS-1479] – Make it possible to leave a capitalization-diacritics edit open
  • [MBS-6011] – Strip out LRM/RLM characters in text with no RTL characters
  • [MBS-7880] – Release editor should have options to copy only titles or only artists to recordings, instead of always both
  • [MBS-7902] – Use nginx X-Accel-Redirect to handle Webservice search requests rather than doing it in perl
  • [MBS-7934] – Add IMSLP autoselect for artists
  • [MBS-7944] – Make recording-work add relationship edits auto-edits
  • [MBS-7945] – Make add/edit alias edits auto-edits
  • [MBS-7946] – Make set track duration edits auto-edits
  • [MBS-7947] – Make edit cover art edits auto-edits
  • [MBS-7953] – Extend normalise_strings to cover more punctuation

Task

  • [MBS-7939] – Add autoselect and validation for release-level Allmusic
  • [MBS-7942] – Add a bunch of sites to the whitelist for dbs