Author Archive

A Request for Feedback on the Upcoming “Changed MBID” Service

Tuesday, February 26th, 2013

A common problem for users of MusicBrainz is that of synchronizing a local collection against the main MusicBrainz servers. Our current rate limit stipulates that you make at most 1 request per second, which we understand is extremely limiting – especially if you’re trying to fetch thousands of releases! During our first hack weekend, we created the beginnings of a service to allow you to get a list of MBIDs that have been updated. We have finished the preliminaries of this service, and now we need to hear from you how you’d want to utilize this.

Change Logs

The most basic data we currently gather is a JSON document containing a list of MBIDs that have changed per hour. For each of our data replication packets, we generate a JSON packet that summarizes all of the MBIDs that have changed, either directly or indirectly (such as the addition of more relationships).
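To make the packet idea concrete, here is a minimal sketch of how a client might combine several hourly packets and compare them against a local collection. The packet layout (a JSON object with a `changed` list of MBID strings), the function names, and the sample MBIDs are all assumptions made for illustration; the actual packet format is not described in this post.

```python
import json

# Hypothetical hourly change packets. The "changed" field name and the overall
# layout are assumptions; the real packet format is not specified here.
packet_hour_1 = json.dumps({
    "changed": [
        "11111111-1111-4111-8111-111111111111",
        "22222222-2222-4222-8222-222222222222",
    ]
})
packet_hour_2 = json.dumps({
    "changed": [
        "22222222-2222-4222-8222-222222222222",
        "33333333-3333-4333-8333-333333333333",
    ]
})


def changed_mbids(packets):
    """Union the MBIDs listed in a sequence of JSON packets."""
    changed = set()
    for raw in packets:
        changed.update(json.loads(raw)["changed"])
    return changed


def stale_local_mbids(local_mbids, packets):
    """Return the local MBIDs that appear in any of the packets."""
    return set(local_mbids) & changed_mbids(packets)


local_collection = {
    "22222222-2222-4222-8222-222222222222",
    "44444444-4444-4444-8444-444444444444",
}
stale = stale_local_mbids(local_collection, [packet_hour_1, packet_hour_2])
print(sorted(stale))
```

Only the MBIDs in `stale` would need to be refetched, so the 1 request per second limit applies to what actually changed rather than to the whole collection.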

A “What’s Changed?” Service

The first piece of feedback we received was that people were not really interested in consuming this data stream, but would rather have a service that allows them to query what data has changed in a given window of time. Having to manually fetch packets and perform set intersections is not particularly difficult, but the more hoops people have to jump through, the less likely they are to even use the service. We’ve been pondering how best to implement this service, and we would like feedback on the following options:

  1. Filter a list of MBIDs

    The service would allow you to POST a set of MBIDs, and would in turn return the subset of these MBIDs that have been changed. You would be able to specify any date and get all changes since that date. For example, you could find all changes to all releases in your library since you last checked 2 weeks ago.

    Because every MBID would take 36 bytes to submit, there will be a limit on the number of MBIDs that can be submitted in order to preserve bandwidth.

  2. Provide client libraries

    Rather than having people craft their own web service requests, MusicBrainz should provide a library to do this. This will allow us to use more advanced techniques (for example, Bloom filters) to both conserve bandwidth, and allow for larger queries. In this scheme the web service will be documented, but users are not expected to consume it directly.

  3. Support Both!

    MusicBrainz could offer a simplified API, which is based on option 1, while also supporting larger queries through option 2. For example, we might limit option 1 to have a maximum of 4000 MBIDs per request/response, while the service that depends on our client libraries could handle many more.

  4. Allow filtering based on collections

    MusicBrainz already has the concept of collections, which have an associated unique identifier, so these will be used to filter the list of changes. This limits the service to only deal with releases, and will require people to set up collections before they can do queries. Again, due to the possibility of large collections, there will likely be pagination on responses – though the per-page limit will probably be fairly high.
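For options 1 and 3, a client with a large library would have to split its MBIDs into requests that respect the per-request cap. The sketch below uses the 4000-MBID figure floated in option 3 and the 36 bytes per MBID mentioned in option 1. The cap is only an example value, and the helper names are hypothetical rather than part of any planned API.

```python
from itertools import islice

MBID_LENGTH_BYTES = 36        # length of an MBID in its canonical text form
MAX_MBIDS_PER_REQUEST = 4000  # example cap from option 3; not a final value


def batches(mbids, size=MAX_MBIDS_PER_REQUEST):
    """Yield successive lists of at most `size` MBIDs."""
    iterator = iter(mbids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def payload_bytes(batch):
    """Lower bound on the request body size, ignoring separators and headers."""
    return len(batch) * MBID_LENGTH_BYTES


# Stand-in identifiers; a real client would use the MBIDs in its library.
library = [f"{i:08d}-0000-4000-8000-000000000000" for i in range(10_000)]
sizes = [len(batch) for batch in batches(library)]
print(sizes)
print(payload_bytes(library[:MAX_MBIDS_PER_REQUEST]))
```

With these numbers a 10,000-release library needs three requests, and a full batch of 4000 MBIDs is at least 144,000 bytes before any encoding overhead, which is the bandwidth concern behind the limit.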

These are the ideas that we’ve been debating, and we’d love to know which of these would work for you. If you have other ideas, we’re also very interested in hearing what those are!
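Since option 2 mentions Bloom filters, here is a minimal sketch of the idea for readers who have not met them. A client summarizes its library as a fixed-size bit array, and the other side tests each changed MBID against that array. A Bloom filter can report false positives but never false negatives, so no real change is missed. The filter size, the number of hash functions, the SHA-256 double-hashing scheme, and the class name are all assumptions chosen for illustration; this is not a description of how MusicBrainz would implement it.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter over strings (illustrative parameters)."""

    def __init__(self, num_bits=8192, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive the bit positions from two 64-bit halves of one SHA-256
        # digest (the common "double hashing" construction).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Client side: summarize the local library.
library = [f"{i:08d}-0000-4000-8000-000000000000" for i in range(500)]
summary = BloomFilter()
for mbid in library:
    summary.add(mbid)

# Server side (hypothetical): keep only changed MBIDs the filter may contain.
changed = [library[3], library[42], "ffffffff-0000-4000-8000-000000000000"]
candidates = [mbid for mbid in changed if mbid in summary]

print(len(summary.bits), "bytes sent instead of", len(library) * 36)
print(library[3] in candidates, library[42] in candidates)
```

In this example the filter is 1024 bytes, compared with 18,000 bytes for the 500 raw MBIDs. The trade-off is an occasional false positive, which only costs the client one unnecessary refetch.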

Housecleaning part 2: Moving our mailing lists

Monday, February 11th, 2013

Part 2 in our housecleaning series concerns our mailing lists. Hosting mailing lists is quite a pain and we’d rather leave this pain to people who specialize in mailing lists. So, we are proposing to do the following things:

  1. Remove the under-utilized list musicbrainz-italian.
  2. Remove the musicbrainz-commits mailing list. GitHub (and similar sites) have better notification systems, so we don’t really need this list anymore.
  3. Ask the Xiph Foundation to find a new home for the XSPF Playlist mailing list.
  4. Remove the under-utilized musicbrainz-users list since the forums are predominantly used for end-user discussion. We’ll point people to the forums for those.

Finally, we would like to get some suggestions and feedback on where we should host our mailing lists. We’re considering:

  • Nabble: This has gotten mixed reviews from various users.
  • Librelist: This site is quite new and UI reservations have been noted about it.
  • Savannah: This site has many more features than just mailing lists. We’re not certain if we can move only our mailing lists here.
  • Google Groups: We’ve heard complaints about spam and spam fighting tools. Has this improved recently?

If you have any comments on any of these solutions or proposed list consolidation ideas, please let us know. Also, if you know of a cheap/free/good list provider that we didn’t list, please let us know!

Housecleaning part 1: Please help us create a new theme for our blog

Monday, February 11th, 2013

We have one aging machine (scooby) that has been in continuous service since 2006. Back then we didn’t have as many options for hosting source code, mailing lists and blogs. Today, we have a lot more choice and we’re opting to host fewer things so that we can focus our energy on hosting MusicBrainz and not a bunch of ancillary stuff. Our goal is to retire scooby soon and move the services that run on that server elsewhere.

Our blog is the first thing to move: We’re moving it to wordpress.com and we’re nearly done with the move. But, we don’t have a decent WordPress MusicBrainz theme for our blog. If anyone is interested in taking an existing WordPress theme and making it a custom MusicBrainz theme, we would love your help!

If you’re interested, please leave a comment and we’ll get in touch with you to coordinate this process.

Thanks!

Please welcome AOL Music into the MetaBrainz ecosystem!

Thursday, February 7th, 2013

The continued economic turmoil persisted in 2012 and thus it was a slow year for adding new customers for MetaBrainz. However, we did add one high profile customer in 2012: AOL Music.

For a number of reasons we felt that it was prudent to get MusicBrainz integrated into AOL before making public news about it. Now the time is finally right to talk about our relationship with AOL and Winamp. I had been talking to Geno Yoham (GM of Winamp) and Lisa Namerow (GM of AOL Music) about MusicBrainz at various conferences for several years. Forging relationships with large companies takes quite a long time and the formation of our relationship was really no different. At the end of 2011 Geno, Lisa and team were ready to take action and surprised me by pledging a sizeable donation to the MetaBrainz Foundation. This donation was received early in 2012 about the same time that we signed the data license contract. And just last week we received another donation for 2012!! Thanks AOL and Winamp!

Early in 2012 AOL launched updated services underpinned by MusicBrainz data:

  • The Now Playing feature in Winamp allows a user to find out more about the artist that is currently playing in Winamp.
  • The AOL Music Artist pages also use MusicBrainz data to display discography information and to provide some of the links for the other content shown on those pages.

Our relationship with AOL follows a similar pattern to our relationship with the BBC. The BBC has done wonders for highlighting and lending credibility to MusicBrainz and I expect that our relationship with AOL will bring about similar benefits for MusicBrainz.

Thank you team AOL and especially to Geno Yoham and Lisa Namerow for believing in us!

We have a new community calendar

Friday, February 1st, 2013

We’ve been scheduling more meetings for discussing various complex topics, but communication about those dates has not been clear. In order to fix this, we’ve created a community curated calendar:

http://calendar.musicbrainz.org

reosarevok, nikki, ian, ollie, warp and I can put things onto the calendar. If you have something you’d like to have added to the calendar, please ask one of these folks.

Preparing for the May 15th schema change release

Friday, February 1st, 2013

It is time for us to start the process towards the next schema change release. Starting today and for the next two weeks, we’re going to seek people to be the champion (sponsor) of a ticket. If you feel strongly about a schema change ticket getting taken care of, you should consider championing this ticket. Once you’ve decided to adopt a ticket, you should assign the ticket to yourself.

Then, over the next two weeks it will be up to you to do the following:

  1. Drive consensus around the core concept of the ticket. If you go through the process of working up a ticket, but no one agrees with what you’re proposing, you’ve wasted your time. Make sure that you get buy-in from others in the community. For instance, if Nikki doesn’t like it, chances are it’s not going to fly. :-)
  2. Each schema change feature requires two tickets: 1) An SQL ticket that implements the actual changes to the database and defines the queries used to fetch the data. 2) A UI change ticket that implements the UI portions of the schema change ticket.
  3. Ensure that the ticket clearly states what needs to be done to implement the ticket. The ticket should essentially become or link to a requirements document. This requirements document should explain what the new feature should do. It should not explain how it should be done — we should leave the how to our developers who are going to implement the feature.
  4. Provide as much supporting documentation as you can. Mock-ups for UIs are deeply appreciated (even if they delve into the how realm of things) and very useful for meaningfully discussing these tickets.
  5. Have the ticket reviewed by a developer for clarity and completeness, then address any issues said developer may raise.

On 15 February, we’re going to look at the list of tickets that people have taken on and choose the ones that are clear enough to move forward. If you’ve done all the work outlined above, the chances are good that your ticket will be chosen to move forward. If your ticket is chosen to move forward, there will be more questions that the developers will raise; hopefully those can be tackled in the space of a week. After that we will take all of the well-defined tickets and schedule them for implementation. All the other tickets that are not clear enough to implement will be rejected and will have to make another pass through this process in the autumn.

If you’re still interested, here is the list of schema change tickets that should be considered for this.

We’re going to follow this schedule:

  • 1 Feb: Schema change ticket selection starts
  • 15 Feb: Select schema change tickets for implementation, start making tickets fully actionable
  • 1 March: Tickets must be fully actionable. Tickets that are not actionable will be dropped from the 15 May release.
  • 15 March: SQL tickets must be fully implemented.
  • 1 May: UI tickets must be fully implemented, start final ticket testing phase
  • 15 May: Release day

All of these dates have been added to our new community calendar.

IMPORTANT: Proposed changes to the data returned by our web service

Thursday, January 31st, 2013

Our current web service at the /ws/2 endpoint returns too much data in a lot of cases and in many cases we suspect that the programs making the calls to the service don’t actually consume all of that data. We’d like to reduce the amount of unused data our web service returns, in order to reduce our bandwidth costs. We propose that:

  • The web service will no longer include aliases and tags in relation elements. Regardless of what entity you may request, if the results of your request include a relation element, any alias or tag elements that are currently returned inside it will no longer be returned.
  • The web service will no longer include aliases and tags for the Various Artists artist anywhere, unless you specifically request the Various Artists artist from the /ws/2/artist endpoint.

We’ve mocked up these changes in the following XML files:

We think that this will have a minimal impact on our web service users. If you use our web service, please tell us what you think about this. If you know someone who is using our web service, but may not read this blog, please forward a link to this post to them.
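If you want to check whether your client would be affected, one approach is to scan the responses you already receive for alias or tag data nested inside relation elements. The sketch below does this with Python’s standard library. The sample document is a hand-written stand-in rather than real service output, and the element names (`relation`, `alias-list`, `tag-list`) are assumed to match what your responses contain; adjust them if yours differ.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a web service response; not real output.
SAMPLE = """\
<metadata>
  <artist id="example-artist">
    <name>Example Artist</name>
    <alias-list><alias>Top-level alias</alias></alias-list>
    <relation-list>
      <relation type="example">
        <artist id="related-artist">
          <name>Related Artist</name>
          <alias-list><alias>Nested alias</alias></alias-list>
          <tag-list><tag><name>nested tag</name></tag></tag-list>
        </artist>
      </relation>
    </relation-list>
  </artist>
</metadata>
"""

AFFECTED = {"alias-list", "tag-list"}


def local_name(tag):
    """Strip an XML namespace, if any: '{ns}name' -> 'name'."""
    return tag.rsplit("}", 1)[-1]


def affected_elements(root):
    """Return alias/tag list elements that sit inside a relation element."""
    found = []
    for element in root.iter():
        if local_name(element.tag) != "relation":
            continue
        for descendant in element.iter():
            if local_name(descendant.tag) in AFFECTED:
                found.append(descendant)
    return found


root = ET.fromstring(SAMPLE)
hits = affected_elements(root)
print([local_name(hit.tag) for hit in hits])
```

The top-level `alias-list` is not reported because it sits outside any relation element. If your code reads any of the reported elements, the proposed change would affect you.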

For more background on our research into this topic, please take a look at this document.

Privacy policy inconsistencies

Tuesday, January 22nd, 2013

Recently we’ve received two bug reports that point out two inconsistencies in our privacy policy:

  • MBS-5708: It’s not possible to disable the display of cover art but the privacy policy claims it is. There are two possible options for fixing this; fix the privacy policy or make a new preference. Which would you prefer?
  • MBS-5709: Inclusion of Google Analytics is in violation of the privacy policy. This one is more tricky, since we link to other third parties (archive.org, gravatar, captcha) that are also not mentioned in the policy. And changing the policy each time we add a new third party becomes cumbersome. No clear solutions have formed around this issue, so we would like your feedback on this.

If you care about our privacy policy, please take a moment to read these bugs and comment on them. Thanks!

7digital & The Echo Nest have become MusicBrainz customers

Thursday, January 17th, 2013

I’m pleased to announce that 7digital and The Echo Nest have become our latest customers!

7digital enables a lot of digital music stores and provides a lot of services for mobile operators. 7digital has relationships with many labels and thus faces complex metadata issues. I’m quite pleased that 7digital has chosen to partner with MusicBrainz to fix these metadata issues.

The Echo Nest provides tons of digital music services and is a driving force behind Music Hack Days here in the States. The Echo Nest also created project Rosetta Stone, a service that translates to/from MusicBrainz IDs from/to other ID spaces like the Echo Nest IDs or Rdio IDs.

Welcome to the MusicBrainz ecosystem!

A sad day for the Internet: RIP Aaron Swartz

Saturday, January 12th, 2013

As you’ve probably seen around the net today, Aaron Swartz, Internet Hero, has committed suicide.

Aaron Swartz spent most of his life working to improve the Internet and to preserve freedom on the net. Many people are speaking to his awesome accomplishments in the last 10 years of his life, but I’d like to take a minute and reflect on his earlier years.

I was one of the fortunate people who met Aaron when he was still 15 — we first met up in Washington DC for O’Reilly’s P2P conference. Since he was a minor, his mom was accompanying him. Never mind that she had a broken leg at the time — she was so dedicated to her son that she traveled with him to allow him to participate in things that most minors couldn’t even imagine.

Before I met him in person, Aaron was an active contributor to MusicBrainz. When I started my first mis-guided attempts to create an RDF based web-service, he worked with me to improve the schema. He helped me understand RDF (damn that RDF spec!) and helped me fix the schema until it actually worked properly. Aaron was always looking for new and interesting things to do, so once his mission with MusicBrainz was done, he moved on to bigger and better things. And the things he did — simply amazing that one person can accomplish so much in so little time.

Aaron and I shared one passion — making data open and accessible. His means were always more aggressive than mine; he often chose the faster, more risky approach. I usually favor the slow-and-steady-will-win approach. Regardless, the events that led up to his suicide leave me deeply unsettled about the current state of affairs.

Aaron, thank you for being the instigator, shit-stirrer, advocate and dissident you were. I appreciate everything you’ve done during your short stay here on this troubled planet. May your next journey be more peaceful!

Thank you to Cory, Larry and Brewster for your kind words.

UPDATE: Here is a link to the paper Aaron wrote about MusicBrainz.