Tag Archives: AcousticBrainz

Another AcousticBrainz update and a survey

Last year we started working on features to improve data produced from information about recordings that you submit to AcousticBrainz. First part of it was a way to create datasets that are used to train high-level models. The next were dataset creation challenges.

We already have a significant number of datasets created by AcousticBrainz community. The list of public datasets is available at https://beta.acousticbrainz.org/datasets/list. A couple of days ago our experimental challenge has concluded. It was related to classifying music with and without vocals. You can see final results at https://beta.acousticbrainz.org/challenges/14095b3b-4469-4e4d-984e-ef5f1a55962c.

Your feedback on high-level data

The latest addition to AcousticBrainz is a way to provide feedback about high-level output that you can see on summary pages for recordings. After a model is applied to all of the AcousticBrainz data we can understand how well it performs on a larger scale. This should help us make further improvements to models and underlying datasets. Keep in mind that you need to be logged in with your MusicBrainz account to see this.

Survey about new features

To help us understand how well new features work for you, we created a survey for you to participate in. If you have used AcousticBrainz, please fill out the survey here: https://goo.gl/forms/Oh3a9INBCCsW2I1i1. It shouldn’t take more than 5 minutes. We’ll keep it open for about a week.

Your feedback is very much appreciated. Especially considering that we don’t have a lot of ways to collect it from people. Some come to IRC and tell us about issues they are having, some comment on blog posts or create tickets in JIRA. But at this point we need a better overview of the current state of the project.

Thank you! 🎶

We’re actually really going to take the HTTPS plunge!

Closing in on three years after stating that “We’re going to take the HTTPS plunge!”, we’re actually really going to do it now. 🙂

Most of our sites have forced HTTPS for some time (metabrainz.org, critiquebrainz.org, bookbrainz.org, listenbrainz.org), but there are still a couple of stragglers, notably musicbrainz.org and acousticbrainz.org.

For MusicBrainz, our beta site is now all HTTPS, web service and all. The main, non-beta musicbrainz.org will be going HTTPS-only except for what’s under /ws/ (ie., the web service) to allow taggers and other programs not currently using HTTPS some transition time. We do not currently have an ETA for when we will make the final jump to HTTPS-only on the MusicBrainz web service, as that partly depends on feedback from our web service users, which leads me to:

If you’re currently using the MusicBrainz web service, please try and switch your program to using beta.musicbrainz.org and see whether your program breaks or not and let us know the status of it. We are aware that some Python versions and MusicBrainz libraries do not support our setup, so while your program may fail now, it might simply be because of dependencies of your program not being updated yet and you might not need to do anything specifically on your end – however, some programs/libraries might need some updates, so the more people test and report back, the better we’ll be able to judge when we can go all-HTTPS-only on musicbrainz.org.

For AcousticBrainz, we now have a shiny new Let’s Encrypt certificate on https://acousticbrainz.org thanks to our systems administrator Zas! As a result, we are going to start redirecting all HTTP traffic to HTTPS on the AcousticBrainz website, including API queries.

In order to give everyone time to verify that their scripts correctly recognise and validate our Let’s Encrypt certificate, we are going to delay the redirect until July 1, 2016. On this date, any HTTP query will automatically be redirected to HTTPS. We will also enable HSTS, so that compliant browsers will redirect to HTTPS on the client-side.

If you have any questions about either the MusicBrainz or the AcousticBrainz transition, please ask.

AcousticBrainz Update

It’s been over a year since we last posted about AcousticBrainz, but a lot of work has been going on in the background. This post will give an overview about some of the things that we’ve achieved in the last year.

Data contributions

Our last blog post was neatly titled “What do 650,000 audio files look like, anyway?” Back then, we thought that this was a lot of submissions. Little did we know… I’m glad to report that we now have over 3.5 million submissions, of which almost 2 million are for unique MBIDs. This is a great contribution and we’d like to thank everyone who submitted data to us.

Dataset and model building

MusicBrainz coder Gentlecat returned to participate in Google Summer of Code last year and developed a new tool to let us create datasets and create new computational models. We’re really excited about how this can allow community members to help us increase the quality of the semantic information we provide in AcousticBrainz. We will make another blog post soon explaining how it works.

We presented an academic overview of AcousticBrainz (PDF) at the 16th International Society for Music Information Retrieval (ISMIR) conference in Malaga, Spain. The feedback from the academic community was very encouraging. Many people were interested in the data and wanted to know what they could do with it. We hope that there will be some new projects announced using the data at this year’s conference.

Integration with other data sources

MusicBrainz and AcousticBrainz don’t exist in a vacuum. One important thing that we need to make sure we do is interact with other researchers and products in the same field. To that end, we started AcousticBrainz Labs, a showcase of some of the experiments that we’re working on in AcousticBrainz. The first thing we have published is a mapping between AcousticBrainz and the Million Song Dataset, that we hope people will use to compare these two datasets.

Database upgrades and Data format changes

We’ve just upgraded to PostgreSQL 9.5 (from 9.3), which allows us to use the new jsonb datatype introduced in PostgreSQL 9.4. This change lets us store feature data more efficiently. We also made some changes to the database schema to let us start creating new data from datasets and computation models.

One result of this is that we are creating a new complete data dump, and stopping the old incremental dumps. We are also taking the opportunity to automate this incremental dump process, which is something that a number of people have asked for.

Another change is that the format of the high level JSON data is changing. This is to better reflect some of the complexities that exist in hosting such a large and varied dataset.

Contribute to AcousticBrainz development

We’re always interested in help from other people to contribute data, code, and ideas to AcousticBrainz. Once again, MetaBrainz is participating in Google’s Summer of Code, and AcousticBrainz is a possible project to work on. If you’re not a student you’re still welcome to work with us.

Write to us in a comment, in IRC, or in our new Discourse category and say hi.

Wrapping up Google Code-in 2015

The Google Code-in Google Code‐in is pretty much over for this time, and we’ve had a blast in our first year with the competition in MetaBrainz with a total of 116 students completing tasks. In the end we had to pick five finalists from these, and two of these as our grand prize winners getting a trip to the Googleplex in June. It was a really, really tough decision, as we have had an amazing roster of students for our first year. In the end we picked Ohm Patel (US) and Caroline Gschwend (US) as our grand prize winners, closely followed by Stanisław Szcześniak (Poland), Divya Prakash Mittal (India), and Nurul Ariessa Norramli (Malaysia). Congratulations and thank you to all of you, as well as all our other students! We’ve been very excited to work with you and look forwards to seeing you again before, during, and after coming Google Code-ins as well! 🙂

Rayna Kanapuram MusicBrainz presentation

Indian student Rayne presenting MusicBrainz to her classmates.

In all we had 275 tasks completed during the Google Code-in. These tasks were divided among the various MetaBrainz projects as well as a few for beets. We ended up having 29 tasks done for BookBrainz, 124(!) tasks for CritiqueBrainz, 95 tasks for MusicBrainz, 1 task for Cover Art Archive, 6 tasks for MusicBrainz Picard, 3 tasks for beets, and 17 generic or MetaBrainz related tasks.

Some examples of the tasks that were done include:

Ariessa MetaBrainz infographic

Finalist Nurul Ariessa Norramli’s MetaBrainz infographic.

In all, I’m really darn happy with the outcome of this Google Code-in and how some of our finalists continue to be active on IRC and help out. Stanisław is continuing work on BookBrainz, including having started writing a Python library for BB’s API/web service, and Caroline is currently working on a new icon set for the MusicBrainz.org redesign that can currently be seen at beta.MusicBrainz.org.

Again, congratulations to our winners and finalists, and THANK YOU! to all of the students having worked on tasks for MetaBrainz. It’s really been an amazing ride and we’re definitely looking forward to our next foray into Google Code-in!

Announcing the AcousticBrainz project

MetaBrainz and the Music Technology Group at Universitat Pompeu Fabra are pleased to announce the first public release of the AcousticBrainz project.

http://acousticbrainz.org/

What is AcousticBrainz?
The AcousticBrainz project aims to crowd source acoustic information for all of the music in the world and make it available to the public. The goal of AcousticBrainz is to provide music technology researchers and open source hackers with a massive database of information about music.

AcousticBrainz uses a state of the art research project called Essentia (http://essentia.upf.edu/), developed over the last 10 years at the Music Technology Group.

Data generated from processing audio files with Essentia is collected by the AcousticBrainz project and made available to the public under the CC0 license (public domain). In 6 weeks since its inception, AcousticBrainz contributors have already submitted data for 650,000 audio tracks using pre-release software.

Today we are releasing client programs to submit data to the AcousticBrainz server and our first public release containing audio features for over 650,000 audio files.

What data does it have?
AcousticBrainz contains information called audio features. This acoustic information describes the acoustic characteristics of music and includes low-level spectral information such as tempo, and additional high level descriptors for genres, moods, keys, scales and much more. These features are explained in more detail at http://acousticbrainz.org/sample-data

How can I get it?
You can access AcousticBrainz data via our API. See details at http://acousticbrainz.org/api
We also provide downloadable dumps of the whole dataset. You can download it (all 13 gigabytes!) at http://acousticbrainz.org/download

What can I do with it?
We hope that this database will spur the development of new music technology research and allow music hackers to create new and interesting recommendation and music discovery engines. Here are some ideas of things we would like to see:

  • Music discovery
  • Playlist generation
  • Improving the state of the art in genre recognition
  • Analytics on the musical structure of popular music
  • and more!

This is one of the largest datasets of this kind available for research, and the only one of this size that we know of which contains both freely available data as well as the reference source code used to compute the data.

How can I contribute?
If you are a music researcher, you can help us by contributing to the essentia project. Go to the essentia homepage to see how you can do this. If you do something cool with the data let us know. We’d like to start a “made with AcousticBrainz” page where we will showcase interesting projects.

If you have any audio files, we would love for you to contribute audio features to our project. You can do this by downloading our submission clients from http://acousticbrainz.org/download. We provide clients for Windows, Mac, and Linux.

If you find any bugs or errors in the AcousticBrainz stack please let us know! Report issues to http://tickets.musicbrainz.org/browse/AB.

We can’t wait to see what kind of things you will make with our data.

The AcousticBrainz team.