Matthew Exon asked a few great questions about the MetaBrainz Foundation launch, and I wanted to share my answers with everyone here on this blog:
On Apr 19, 2005, at 3:58 AM, Matthew Exon wrote:
> Congratulations! This is certainly a big step for MusicBrainz, and I’m
> sure it gives all of us a greater feeling of confidence and
> responsibility about the whole project.
Thanks — I’m glad to hear that. I’m trying to project that on a much larger scale. To the point where corporations get that sense too.
> The teasers on the test MetaBrainz page over the last couple of weeks
> have raised some questions for me, so this is my chance to ask if you
> could clarify some of them for me. I don’t expect full dissertations
> to turn up on the MetaBrainz site overnight, and I guess it would be
> prudent for you not to respond to some things here, but you might be
> interested to know what questions occur to this punter in the street.
Do you mind if I post this response to the MB blog? Your questions are excellent and until I can have a fully articulated position on the licenses, I’d like to have something to refer to.
> First of all, there’s the big “licence” word in there, and I’d like to
> see a bit more detail about precisely what kind of agreements you have
> in mind. For example, will licensees be prohibited (or encouraged!) to
> further distribute the data, either on a commercial or non-commercial
> basis? Are you selling data, or are you in effect selling bandwidth to
> the MusicBrainz servers? How do you intend to balance bandwidth
> licensees and the general public? These questions are far too complex
> to be answered straight away, especially when you don’t actually *have*
> any licensees yet, but it’d be interesting to see a guess.
I think I can answer all of your questions — they are great questions!
First off, license may be a big and scary word, but it’s been a part of
MusicBrainz from early on. For more information on this topic and how
we arrived at our current license scheme, please check out:
Currently MusicBrainz divides the dataset into two chunks: the core
data is in the Public Domain (read: no license, do with it as you
please) and the ancillary data (search index, moderation, etc) is
released under a Creative Commons license for non-commercial use. For
more details, see: http://musicbrainz.org/about/licenses.html
The live data feed doesn’t change any of this. The live data feed is a
stream of hourly chunks of data being generated by the project. This
data feed contains both PD and CC data, and thus the whole is licensed
under the most restrictive license, the CC share alike, non-commercial
license. This means that anyone who wants to run an MB server in a
non-commercial setting can have a server that is no more than 70
minutes out of date with the main server.
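The "most restrictive license wins" rule can be sketched in a few lines of Python. This is only an illustration: the `License` enum, its ordering, and the `effective_license` helper are hypothetical names made up for this example, not part of any MusicBrainz code.

```python
from enum import IntEnum


class License(IntEnum):
    # Hypothetical ordering for illustration: a higher value means more restrictive.
    PUBLIC_DOMAIN = 0
    CC_BY_NC_SA = 1


def effective_license(parts):
    """Return the most restrictive license among the parts of a bundle."""
    parts = list(parts)
    if not parts:
        raise ValueError("a bundle needs at least one licensed part")
    return max(parts)


# The live data feed mixes PD core data with CC ancillary data.
live_feed = [License.PUBLIC_DOMAIN, License.CC_BY_NC_SA]
# A bundle holding only core data stays in the public domain.
core_only = [License.PUBLIC_DOMAIN]

print(effective_license(live_feed).name)
print(effective_license(core_only).name)
```

Under these assumptions the mixed feed comes out as the CC license, while a bundle holding only core data stays in the public domain.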
So, any commercial entity that wants to use MB data can now:
1. Download the twice weekly snapshot and use it without paying
2. Arrange to license the live data feed and their data stays up to date
So, no one is going to pay for the data itself. Commercial customers
are going to pay for the privilege of having bite-size chunks of data
applied to their own server. If a customer is running a 24/7 service,
option #1 can be a real pain — taking down your servers to import
fresh data sucks.
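To make the "bite-size chunks" idea concrete, here is a rough Python sketch of a server catching up on hourly chunks without going offline. Everything in it is an assumption for illustration: the `Chunk` and `Replica` classes, the sequence numbers, and the dictionary payload are invented here and do not describe the actual feed format.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    sequence: int  # hypothetical hourly sequence number
    changes: dict  # hypothetical payload: row key -> new value


@dataclass
class Replica:
    last_applied: int = 0
    rows: dict = field(default_factory=dict)

    def apply(self, chunk):
        # Refuse to skip a chunk, since later chunks may depend on earlier ones.
        if chunk.sequence != self.last_applied + 1:
            raise ValueError(
                f"expected chunk {self.last_applied + 1}, got {chunk.sequence}"
            )
        self.rows.update(chunk.changes)
        self.last_applied = chunk.sequence


def catch_up(replica, available):
    """Apply every pending chunk in sequence order; return how many were applied."""
    applied = 0
    for chunk in sorted(available, key=lambda c: c.sequence):
        if chunk.sequence <= replica.last_applied:
            continue  # already applied on an earlier run
        replica.apply(chunk)
        applied += 1
    return applied


replica = Replica()
pending = [
    Chunk(2, {"album:1": "corrected title"}),
    Chunk(1, {"album:1": "original title", "artist:7": "some artist"}),
]
print(catch_up(replica, pending), replica.last_applied)
```

The point of the sketch is that each run only touches the rows named in the new chunks, so the server keeps serving requests while it catches up, unlike a full snapshot import.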
Commercial licensees will be able to use the live data stream in any
way they want. Once the data resides on their own server, they
can use it as many times in their organization as they want, with as
many copies as they want, and they will not be required to share their
data with anyone else. However, they will not be allowed to offer our
live data feed to others: we don’t want our commercial customers
competing with us. That makes no sense.
Balancing bandwidth between users and commercial customers may not
really be much of an issue. Downloading a live data feed amounts to the
same bandwidth in a day as one or two people using the tagger to tag
their music collections. Not a big deal.
Should the day come when it does become a big deal, I will set up
another server, paid for by the data licenses, whose only
purpose will be to farm out the data to commercial customers, leaving
the rest of the bandwidth to our contributors.
> BTW, licencing the data is certainly the way to go, and I don’t have
> reason to worry about my right to access it: but this is the kind of
> thing that makes contributors nervous, so it’s worth thinking about
> from the tinfoil hat point of view.
Understood. I’ve been listening to the community for over 5 years and
I’m very aware of people being very sensitive about me ‘pulling a
gracenote’. I think my solution is a pretty good one — everyone has
access to the data, yet we can license the data for commercial use and
not sell out.
Another point that should put people at ease is that with the
non-profit in place, the data is officially owned (as much as you can
own PD data) by the MetaBrainz Foundation. Property of a non-profit
cannot be sold to a for profit entity — it must be destroyed or
donated to another non-profit.
I *can’t* pull a GraceNote.
> Second, I’d like to be clear on what the relationship with Amazon is.
> Somewhere I got the impression that you were waiting until this launch
> to become an Amazon associate.
We had to wait until we were a recognized non-profit. Late last year we
turned on the Amazon IDs and started collecting associate fees from
them. Over the last 6 or so months, we’ve taken in around $50. Not much
to write home about, but having the cover art is great — and having
some beers with friends on Amazon’s tab seems like a nice cherry on
top. 🙂 I’ll probably spend the referral fees on beer in London.
> This seems like the simplest and
> quickest way to start raking in cash. Can we expect an announcement
> about this too?
> Are you going to develop a full-on web-services based
We have a full on web service — the tagging applications use it. We
have had a web service since before the term was coined. 🙂
> You could potentially manipulate users’ shopping baskets and
> stuff on the MusicBrainz server side, and turn MusicBrainz into a
> shopping site as well as a metadata site. In fact, MusicBrainz seems
> to me to have the potential to be the most important Amazon web services
> partner they’ve ever had. I mean, MB is kinda the music equivalent of
> IMDB, and Amazon *bought* IMDB…
I agree. However the fundamental reality is that you tend to make very
little from associate fees. There are a number of companies that
tried to build business models on this and failed in spectacular
ways. I believe that one customer paying one month of full license
fees will bring in more income than associate fees for the entire year.
Over time that is likely to change, but if our data licenses grow at
the same rate as our overall usage, then the associate fees will
never catch up to the data license fees.
> So OK, here’s my $64,000 question (possibly literally): are you hoping
> to licence the data to Amazon for use in their online catalog? I can’t
> imagine any potential licensee bigger than Amazon, but maybe I’m
I’d love that and I’ve asked Jeff Bezos that myself. I did that two
years ago, when our data was less mature, and the answer was no, not
surprisingly. Even today the answer would likely be no different, since we
do not have rights to cover art. The cover art is a big deal for
Amazon, and we simply cannot offer that.
> Lastly, is MusicBrainz now effectively a commercial competitor to
In a sense yes — allmusic supplies the data for Amazon, I believe.
While allmusic still has more data than we do, I think we’re going to
catch up pretty fast. If the answer is not yes today, in 6-12 months it will be.
> If so, do you expect this to be a rather rough ride? I’m
> thinking in terms of FUD, patent war, allegations of copyright
> infringement, and so on. Again, maybe paranoid, but we’ve all seen
> nastier behaviour before…
Not paranoid at all — someone recently called MB’s competitors ‘enemy
combatants’. So there is some truth to that fear. FUD I think we can
deal with, much in the same way that Linux deals with M$’s FUD.
Patent wars are going to be an issue; for now we’re staying clear of
others’ patents. That just makes sense for us. Copyright violations? As
long as we’re vigilant and make sure that no one starts importing data
from another source, we should be fine. It never has been a problem for
us and now that our database is more mature, there will be even less of
a chance of this being a problem for us.
Now, my take on copyright violations is a conservative one — MB should
never accept data from any other commercial sources — only from
people’s brainz. However, the actual legal position is much more in our
favor. In the data-license white paper I talk about the Feist v. Rural
Telephone Service Supreme Court case. This case found that facts (like
the title of a CD or the name of an artist) are not copyrightable.
So, in theory, someone could copy data from allmusic directly into
MusicBrainz and have it be legal (as long as everything happened in the
US). Allmusic’s lawyers may think differently about this, and that is
why MB cannot accept data from any other database.
> Anyway, again congratulations, and try not to work yourself too hard!
> I suspect you’re going to be spending too much time in the next few
> talking about MetaBrainz and associated stuff to even think about
> 😦 It’s important work though!
Au contraire — I hope that all the hard non-profit work is behind me.
I certainly have some work to do on licensing our data, but Picard is
quickly moving to the top of my todo list. I hope to spend serious time
on Picard in May.
Let me know if I can post this on the blog!