
Schema change update

We’ve finally completed the schema update and things are returning to normal. We need to get a new data dump out and then we will provide upgrade instructions tomorrow. As you might be able to guess, if you have a replicated slave and are not already on Postgres 9.5, we are going to recommend a clean data import rather than a migration.

And if anyone even dares to ask (within the next week) when an updated VM will be released, you owe each member of the development team 2 bars of high-quality chocolate.

Feeling lucky, punk?

P.S. Can you tell we’ve been up too long? 🙂

Server capacity update

Zas and I have been working hard to improve the capacity and stability of the site. In the last week, we’ve identified and fixed at least 3 problems with the search servers and we’ve added a timeout function that times out queries that take longer than 3 seconds. We think that the main cause of trouble was that queries were piling up after a slow query ran too long and that the servers never recovered from that and consequently crashed.
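For the curious, here is a minimal sketch of the general idea of cutting off a slow query so that it cannot hold up everything queued behind it. It is not the code running on our search servers: the class name QueryTimeout, the helper runWithTimeout and the "timed out" fallback value are all made up for illustration, and it uses only the standard Java concurrency classes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class QueryTimeout {
    /**
     * Hypothetical helper: runs the query on the pool and returns the
     * fallback value if no result arrives within timeoutMillis.
     */
    public static <T> T runWithTimeout(ExecutorService pool, Callable<T> query,
                                       long timeoutMillis, T fallback)
            throws InterruptedException, ExecutionException {
        Future<T> future = pool.submit(query);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt the worker so it stops occupying a thread. This only
            // helps if the query code actually responds to interruption.
            future.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // A quick query finishes well inside the 3 second budget.
            String fast = runWithTimeout(pool, () -> "3 hits", 3000, "timed out");

            // A runaway query (simulated with sleep) is abandoned after 200 ms.
            String slow = runWithTimeout(pool, () -> {
                Thread.sleep(10_000);
                return "never returned";
            }, 200, "timed out");

            System.out.println(fast); // prints "3 hits"
            System.out.println(slow); // prints "timed out"
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The caller gets an answer (or a fallback) within a bounded time either way, which is what keeps requests from piling up behind one slow query. Whether the abandoned work really stops depends on the query code checking for interruption.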

We won’t go as far as saying that the search servers are fixed — every time we have a smidgen of hope that things are improving, they crash again. Seemingly out of spite! So, the search servers are better.😉

Zas has also made a number of changes to the gateways and how we rate limit our incoming traffic. The rate limiting is now being done in a smarter way that reduces the overall traffic on our web servers. Well done!

We’ve also increased our bandwidth budget by 4 Mbit per second, which makes the site feel considerably more responsive.

Let me put these improvements into numbers: About a week ago we were struggling to keep up with 250 requests per second and the site felt very sluggish. Now we can handle 500 requests a second and the site feels considerably faster. For large chunks of the day we are managing to handle all the traffic we should handle. And, the search servers haven’t crashed in 4 days!

We hope that this will give us a solid base from which to release the schema upgrade tomorrow. Then, once that is complete, we will start work on moving to the new hosting company.

Thanks for being patient with us!

Sophie Goossens joins the MetaBrainz board of directors (and more!)

I’m pleased to announce that Sophie Goossens, an attorney in London, has joined the board of directors of the MetaBrainz Foundation. Sophie specializes in intellectual property law and has ties to the European Commission, which makes her a great addition to our board of directors.

Welcome to our board of directors, Sophie!

Sophie replaces Carol Smith, who decided to move on from the board after leaving her position as the head of Google’s Summer of Code program. Carol joined us in late 2009 and has held the position of treasurer and secretary since then. Two years after joining us, she became a full director in early 2011.

Thank you for everything you’ve done for MetaBrainz in the past 6+ years, Carol!

Last, but not least, we needed to fill the Secretary/Treasurer slots that were vacated by Carol. Luckily for us, our business development manager Christina Smith stepped up to those duties and was voted onto the board back in February. (Now that all of these changes are complete, we can publicly speak about them.)

Thank you for taking on these two positions, Christina. I’m also quite happy that we’ve preserved the balance of people with the last name Smith on our board. 🙂

Thanks Sophie, Carol and Christina!

Help! Is there a Lucene doctor in the house?

UPDATE: Thanks to user selckin in the #lucene IRC channel for quickly solving this for us! Hopefully we can put this fix into production later today!

As our regular readers may know, we’ve been having lots of trouble with our Lucene-based search servers. Over the past few days we’ve spent a fair amount of time tuning, debugging and otherwise trying to troubleshoot our setup. We’ve identified and fixed a number of problems, but most importantly we feel that we’ve identified the core issue: Our servers are simply overloaded.

Under normal conditions we find our servers loaded to about 25% – 35% CPU — things look good and we don’t think we have a capacity problem with our servers. Then a slow query comes in that starts to slow things down. Much like a traffic jam that evolves out of thin air, one slow query can make a giant mess for everyone.

We’ve started timing our queries and most of the time, they can be measured in milliseconds. However, when things get bad, they may take up to 7-8 seconds. Our upstream web servers time out on the search request after about 5 seconds in order to prevent traffic from getting backed up. What we need to do next is to limit the duration that a Lucene query can run and terminate it after the timeout.

I’ve started looking at this and quickly realized that this is much more of a job than adding a simple timeout parameter to the search call. We’re currently using this search function from IndexSearcher:

  public TopDocs search(Query query, int n);

Ideally I would like to add a way to time out queries after 3 seconds. So far, I’ve discovered that we could use

  public void search(Query query, Collector results);

with a TimeLimitingCollector. The old call returns TopDocs and our code assumes that we have a TopDocs object from which to cull our search results. Having stared at the docs for Lucene for a while, I haven’t found a way to convert the data in the TimeLimitingCollector to TopDocs. It doesn’t make sense to me. 😦

How does one do this? Sadly, we have no Java programmers on our team, so we’re quite a bit out of our league here. Is there an easier way to do this? Would someone be willing to write this code for us and submit a PR? We’d find some really good chocolate and send it to you if you do!
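For anyone following along, here is the shape of the pattern as we understand it, sketched with plain Java only so that it runs without Lucene on the classpath. The idea is that the time-limiting collector does not hold the results itself. It wraps a second collector that gathers the hits, and once the search finishes or is cut off you read the results from that inner collector. As far as we can tell, in the Lucene 4.x API the corresponding pieces are TopScoreDocCollector (whose topDocs() method returns the TopDocs) wrapped in a TimeLimitingCollector, which throws a TimeExceededException when the time is up. Every class and method in the sketch below is a made-up stand-in, not a Lucene call, and it is not necessarily the fix we were given.

```java
import java.util.ArrayList;
import java.util.List;

public class TimeLimitedSearchSketch {
    /** Stand-in for a search collector: receives one matching doc id at a time. */
    interface Collector {
        void collect(int docId);
    }

    /** Gathers up to `limit` hits; plays the role of the collector that yields results. */
    static class TopHitsCollector implements Collector {
        private final int limit;
        private final List<Integer> hits = new ArrayList<>();

        TopHitsCollector(int limit) { this.limit = limit; }

        @Override public void collect(int docId) {
            if (hits.size() < limit) hits.add(docId);
        }

        List<Integer> topHits() { return new ArrayList<>(hits); }
    }

    /** Thrown by the wrapper to abort a search that has run too long. */
    static class TimeExceededException extends RuntimeException { }

    /** Wraps another collector and aborts once the deadline has passed. */
    static class TimeLimitingWrapper implements Collector {
        private final Collector inner;
        private final long deadlineNanos;

        TimeLimitingWrapper(Collector inner, long timeoutMillis) {
            this.inner = inner;
            this.deadlineNanos = System.nanoTime() + timeoutMillis * 1_000_000L;
        }

        @Override public void collect(int docId) {
            if (System.nanoTime() - deadlineNanos > 0) throw new TimeExceededException();
            inner.collect(docId);
        }
    }

    /** Fake search: feeds doc ids to the collector, sleeping to simulate slow work. */
    static void search(int docCount, long millisPerDoc, Collector collector)
            throws InterruptedException {
        for (int doc = 0; doc < docCount; doc++) {
            Thread.sleep(millisPerDoc);
            collector.collect(doc);
        }
    }

    /** Runs the fake search with a time limit and returns whatever was collected. */
    public static List<Integer> searchWithTimeout(int docCount, long millisPerDoc,
                                                  int n, long timeoutMillis)
            throws InterruptedException {
        TopHitsCollector top = new TopHitsCollector(n);
        try {
            search(docCount, millisPerDoc, new TimeLimitingWrapper(top, timeoutMillis));
        } catch (TimeExceededException e) {
            // Timed out: fall through and return the partial results.
        }
        return top.topHits(); // results come from the inner collector
    }

    public static void main(String[] args) throws InterruptedException {
        // Fast search: finishes inside the limit and returns the first 3 hits.
        System.out.println(searchWithTimeout(5, 0, 3, 3000)); // prints [0, 1, 2]

        // Slow search: cut off after about 120 ms, so only a few hits survive.
        System.out.println(searchWithTimeout(100, 50, 100, 120).size());
    }
}
```

The existing code that expects a results object would then keep working, since it still receives one from the inner collector. The only new decision is what to do with partial results after a timeout: return them, or treat the query as failed.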

More info on our project:

We are using Lucene 4.10.4 on a custom codebase that pre-dates SOLR — we have a new SOLR project to replace this one, but it isn’t quite done yet. (Again, not having Java programmers is a bit of a problem for us).

Any tips, explanations or pull requests would be deeply appreciated! Chocolate reward offered!

Thank you!

Important: Schema change delayed to May 23

Because of our ongoing hosting issues, caused by massive traffic increases and failing hardware, we’ve been too busy managing those problems to finish all of the testing for the schema change release that was scheduled for today.

We deeply regret having to do this, but we’re going to delay the schema change release by a week. It is now scheduled for May 23, 2016. This week-long delay will give us a chance to further tweak our server configuration (more on this in the next blog post) and to test the schema change release in much more detail.

We are, however, going to upgrade our database server to Postgres 9.5 either later today or tomorrow. During this upgrade we are going to employ a back-up database server and keep MusicBrainz running in read-only mode with a slightly reduced overall capacity (I’m sure everyone knows what that means by now). This upgrade should have no other effects on our downstream data users.

We will give people plenty of notice before we start the Postgres upgrade via our site banner and via our Twitter account (@musicbrainz).

Sorry for the continued drama affecting our services — we’re working hard to keep things together!

Important information about the May 16 schema change release

In the past few weeks we’ve been hit with massive increases in traffic and a couple of hardware failures. Trying to maintain a decent service quality in light of both of these events has taken a lot of our team’s time and we don’t feel 100% confident about the schema change release tomorrow.

Fortunately, the entire team will be together in one place tomorrow. The first thing we’re going to do is review the current state of affairs and decide how to tackle the Postgres upgrade and the release. As soon as we have our plan put together, we will post an updated blog entry with all of the needed details. But, we may very well delay the release by 24 hours.

However, we found that we ran out of time on one feature: MBS-6024: Support more than one barcode on same release. This one ticket will not be included in the upcoming release. We’re really sorry for letting that one issue slip — sorry for any inconvenience this may cause you.

6 degrees of Vince Gill

I’m not sure that we’ve talked about this cool project yet, so I’ll catch up on that now. The new site Six Degrees of Vince Gill allows you to enter an artist name and see how many degrees of separation there are between your artist and Vince Gill. This project comes from Universal Music’s Nashville group. I’m happy to see our data get used in interesting ways like this!

Now, if you want to see someone relate to Vince Gill in seven degrees, have a look at how I relate to him.🙂
