Upgrading Postgres for MusicBrainz Live Data Feed users

We’re slowly approaching that time of year: schema change release time. After skipping our fall update to focus on some internal tasks, we’re ready to have another schema change release in the spring: May 16, 2016.

We have started the process to collect features we wish to release for this schema change release and we’ll be publishing that list in the coming weeks. However, we’re contemplating the impact of one more change we’d like to make: Upgrading to a more recent version of Postgres.

Internally we are going to upgrade to Postgres 9.5, which was recently released, so we expect that the Postgres team will have worked out the most significant kinks before we’re ready to move to it. However, even though we are moving to 9.5, we are considering the impact on our downstream users/customers who need to make the same or a similar change.

While we are moving to version 9.5 of Postgres, we have the option of only adopting features from Postgres 9.4, which means that our downstream users may continue to use Postgres 9.4. However, Postgres 9.5 has some nice features we’d like to use (e.g. UPSERT), so we’re pondering whether it is possible for us to require Postgres 9.5 from all of our Live Data Feed users starting on May 16, 2016.

We have already informally queried a few of our users and so far it seems that requiring Postgres 9.5 is feasible. If you are a Live Data Feed user and feel that this requirement of Postgres 9.5 is too much for you and your organization by May 16, 2016, please leave a comment on this blog post!

BookBrainz February 2016 Release

Welcome, readers, to the first blog post from the BookBrainz team! I’m Ben (AKA LordSputnik), one of the two guys leading the BookBrainz project to create the most complete and thorough database of literature in the world. Or, in other words, doing for books what MusicBrainz does for music. In this post, I’m going to talk to you about the February 2016 release of BookBrainz, what we’ve been working on, and the current direction of the project.

Unit Testing

One of the biggest areas of work in this update is the new unit testing for the web service. Unit testing allows us to check that our functions work as we expect them to, and helps to find and prevent bugs in the code. This is something we’ve been pushing back for months now, and it really needed doing. Luckily, the Google Code-in (GCI) happened, and one of our students, Stanisław Szcześniak, stepped up to the challenge of writing our test suite.

4000 lines of code and several test classes later, our test coverage (the proportion of web service code checked by the tests) has increased from 40% to 70%, and we’ve found about 10 bugs which we’ll be fixing for the next release. Stanisław is still helping out, now focusing his efforts on a BookBrainz plugin for Calibre (like Picard, but for eBooks) and the new BookBrainz client library.

Python Client Library

We’re planning the Python client library to be a key component of a couple of applications we’ll be writing in the future to increase the amount of data in BookBrainz. It’ll also allow outside developers to programmatically access and modify the information in BookBrainz through our web service. At the moment, it’s still in the early stages of development, with Stanisław playing about to find a clean and elegant architecture.

Reactification!

Another area we’re continuing to look into is changing our existing web page templates (written in Jade) to use the React JavaScript library. This helps us by allowing the same code to be used for templating in the browser and the server, and also allows us to use third-party libraries to simplify our user interface code (for example, React-Bootstrap and React-fontawesome).

So far, we’ve converted 9 pages, including login, registration, search and revision display, with a little help from our GCI students. An added benefit of this is that we’ve been able to apply the idea of progressive enhancement to allow JS-enabled browsers to refresh search results in real-time, while keeping the previous functionality for older or limited browsers.

Browser Compatibility

Since the last release, we’ve established a list of supported browsers, and signed up to a really useful automated test site called BrowserStack. This allows us to get screenshots of key pages of the site in many different browsers, and see where things are breaking. Although we’ve mainly been working on back-end code for the last couple of months, there are a few issues that we’ve found out about in older browsers which will hopefully be fixed soon™. If there are any issues that you’ve spotted when using the site, be sure to let us know in the BookBrainz JIRA.

Improving Error Messages

Working on how we display errors to users is another front-end issue that we’ve sadly been neglecting for about half a year now, in favor of improving the back-end. A good example of the problem is logging in – right now, the site sometimes displays a vague error message about not being able to log the user in, and sometimes it spits out the really unhelpful message “Internal Server Error”.

ISE

So informative, BookBrainz…

Part of the work we’ll be doing on error messages in the next few months will involve creating custom error pages and trying to eliminate unfriendly technical error messages. Giving the user a better idea of what to do when something does go wrong is also important, and we’ll be trying to achieve this along with making error display more consistent across the site.

Direct Database Access

The largest change we’ve been working on over the last couple of months is having the site access the database directly, rather than obtaining data through the web service, as we’ve been doing up until now. Originally, we decided to put the web service in between the site and database to ensure that the web service had good data editing functionality and that data representations were the same as those in the site.

However, this led to us having to effectively define our schema three times – once in the database, once for structuring web service data, and once to define the data models used in the site code. We found this to be a bad situation, because it’s easy to forget to keep the three schemas consistent. Last autumn, I tried to improve the web service to automatically generate web service data structures from the site data models, but eventually decided that this would be overly complex and time-consuming.

Instead, we made the decision to migrate all of our code to Node.js, currently used by the site, and then use a shared data model package for both the site code and web service code. With this change, we would only have to define the schema twice – once in the database, and once in JavaScript – better, but not ideal. Thankfully, the Node.js library bookshelf provides database reflection, which means that we can automatically generate the Node.js data models from the database schema, finally removing the need to define the schema in multiple places.
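
To give a rough idea of what reflection means here, the sketch below reads table and column names back out of a database instead of declaring them a second time in code. It is not bookshelf’s API: it uses Python with an in-memory SQLite database purely as a stand-in, and the `publication` and `edition` tables and the `reflect_columns` helper are hypothetical names invented for the example.

```python
import sqlite3
from typing import Dict, List


def reflect_columns(conn: sqlite3.Connection) -> Dict[str, List[str]]:
    """Map each table name to its column names, as read from the database itself."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema: Dict[str, List[str]] = {}
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [row[1] for row in rows]
    return schema


# Hypothetical schema, defined once in the database and nowhere else.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publication (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE edition (id INTEGER PRIMARY KEY, publication_id INTEGER, language TEXT)"
)

# The "models" are derived from the live schema rather than written by hand.
models = reflect_columns(conn)
```

If a column is later added to `edition` in the database, `reflect_columns` picks it up on the next run with no matching change needed in the code, which is the property we are after.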

Now, we’re about two-thirds of the way through updating the site to use the new data models – you can see our progress in the GitHub repository. Due to a new emphasis on code quality and implementing tests as we go along, progress has been slow but steady; this should hopefully result in a more stable site when we complete this upgrade within the next couple of months.

New Web Service

Following the direct database site update, our schema will have changed in subtle ways which will make it impossible to keep using the existing web service code. Ideally, we would have kept the schema unchanged while we moved to Node.js, but partly due to ORM limitations and partly due to a desire to make things better, we’ve been tinkering with it as we’ve gone along.

This means that the web service will be unavailable for a few months while we rewrite it in Node.js. However, the new web service should be a big improvement on the old one, with a more carefully planned design learning from the mistakes of the current iteration. If you have any suggestions for what you think would be a good feature for the new web service, please let us know in the comments!

That’s it from me for now. I hope you’ve found it interesting to get an insight into the things we’ve been working on! For a more specific list of changes in the February 2016 release, please see our change log. If you have any suggestions for our future blog posts, I’d love to hear your feedback.

Server update, 2016-02-08

This update hopefully fixes some issues with “Edit Medium” edits that, in rare cases, resulted in an incorrect track listing. Sometimes tracks were being inexplicably deleted. The git tag for today’s release is v-2016-02-08.

Bug

  • [MBS-8752] – Database inconsistencies when updating medium
  • [MBS-8765] – instrument_annotation should not be backed up in mbdump.tar.bz2
  • [MBS-8770] – Banner not displayed

Task

  • [MBS-7475] – Get rid of Algorithm::Merge

Server update, 2016-01-25

Backwards-Incompatible JSON Web Service Changes

In an effort to get our JSON Web Service out of “beta” status, we’ve made some backwards-incompatible changes to it in this release:

  • The video flag on recordings is now outputted as true or false instead of 1 or 0.
  • Empty relations arrays are not outputted for linked entities anymore, since linked entities never include relationships.
  • The iso_3166_1_codes, iso_3166_2_codes, and iso_3166_3_codes properties have been renamed to iso-3166-1-codes, iso-3166-2-codes, and iso-3166-3-codes, respectively. This only applies to lookup and browse requests; search requests already outputted these with hyphens.
  • The iso-3166- properties mentioned in the previous point are not outputted if they’re empty.
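
For client authors who need to handle responses saved before this release alongside new ones, the changes listed above could be absorbed with a small normalization step. The following is only a sketch in Python: `normalize_entity` and `RENAMED_KEYS` are hypothetical names, and the sample dictionaries are made-up fragments rather than real web service output.

```python
from typing import Any, Dict

# Old underscore-style keys and their hyphenated replacements, per the list above.
RENAMED_KEYS = {
    "iso_3166_1_codes": "iso-3166-1-codes",
    "iso_3166_2_codes": "iso-3166-2-codes",
    "iso_3166_3_codes": "iso-3166-3-codes",
}


def normalize_entity(entity: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a decoded JSON entity in the post-release shape."""
    result = dict(entity)
    # The video flag used to be 1 or 0; it is now true or false.
    if "video" in result:
        result["video"] = bool(result["video"])
    for old_key, new_key in RENAMED_KEYS.items():
        if old_key in result:
            result[new_key] = result.pop(old_key)
        # Empty code lists are no longer output, so drop them if present.
        if not result.get(new_key):
            result.pop(new_key, None)
    return result


# Made-up fragments in the old style.
old_recording = {"title": "Example", "video": 1}
old_area = {"name": "Example", "iso_3166_1_codes": ["GB"], "iso_3166_2_codes": []}

new_recording = normalize_entity(old_recording)
new_area = normalize_entity(old_area)
```

Code that reads the result should use `entity.get("iso-3166-1-codes", [])` rather than indexing directly, since an absent key now means "no codes".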

Some other changes to the web service have been made, but are considered additions (not changes to existing output), so hopefully shouldn’t cause any problems. You can review them in the changelog below.

Other Changes

An issue where entities deleted from the database (but still present in the cache) remained visible has hopefully been fixed. There are several other miscellaneous bug fixes linked below. Thanks again to Ulrich Klauer for his contributions. The git tag is v-2016-01-25.

Bug

  • [MBS-5676] – JSON relationships output doesn’t include target-type
  • [MBS-6166] – Deleted accounts can still have details edited
  • [MBS-7241] – Non-transactional cache means the cache can sometimes fail to delete entities that are gone at the database level
  • [MBS-7735] – ws/2: recording’s “video” flag inconsistent between xml and json
  • [MBS-7921] – Internal server error when requesting /ws/2/isrc as JSON
  • [MBS-8367] – ws2 JSON incorrectly returns non-included field as null value
  • [MBS-8396] – JSON output has no ordering key attribute for release group series
  • [MBS-8563] – Release & Release Group browse requests without type/status filters return results which contradicts the documentation
  • [MBS-8688] – Random tagged entity type display inconsistency in personal tag page
  • [MBS-8722] – Edit stuck trying to change the gender of a group
  • [MBS-8726] – Replicated updates don’t invalidate cache entries on slave servers
  • [MBS-8730] – Reordering of sub work parts causes unwanted reordering of main work parts
  • [MBS-8746] – JSON web service doesn’t distinguish between relationships not existing vs. not being loaded

Server update, 2016-01-11

Our first release of 2016 consists mainly of data-display fixes by Ulrich Klauer and a couple of small improvements by Google Code-in students Caroline Gschwend and Ohm Patel. Notably, internationalized domain names are now displayed in decoded form: https://musicbrainz.org/url/2de1616a-7ca0-4688-92cc-0a8373190ede
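
For readers unfamiliar with what "decoded form" means for an internationalized domain name, here is a small Python illustration. It is not the server’s implementation; `decode_hostname` is a hypothetical helper and the example domain is arbitrary. It relies on Python’s built-in `idna` codec, which implements the older IDNA 2003 rules.

```python
from urllib.parse import urlsplit


def decode_hostname(url: str) -> str:
    """Return the URL's hostname with any punycode ("xn--") labels decoded."""
    host = urlsplit(url).hostname or ""
    try:
        # The "idna" codec converts each ASCII "xn--" label back to Unicode.
        return host.encode("ascii").decode("idna")
    except UnicodeError:
        # Already non-ASCII or not valid punycode: show the hostname unchanged.
        return host


encoded = decode_hostname("http://xn--mnchen-3ya.de/")
plain = decode_hostname("https://musicbrainz.org/url/")
```

Here `encoded` comes out as `münchen.de`, while `plain` stays `musicbrainz.org` because it has no encoded labels.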

Thanks once again to the above contributors. :) The git tag for today’s release is v-2016-01-11 and the changelog is below.

Bug

  • [MBS-4575] – Old add release label edit does not display
  • [MBS-5205] – Text diff incorrectly highlights first word that didn’t change
  • [MBS-7844] – Name variation marker not used for artists in tracklists in “edit medium” edits
  • [MBS-8012] – Release dates/countries are displayed strangely in edit release label edits
  • [MBS-8161] – Medium titles have no diff highlighting when displaying edits
  • [MBS-8210] – Multiple “Remove ISRC/ISWC” edits on one page interfere
  • [MBS-8330] – Another name variation check after HTML entity conversion
  • [MBS-8413] – Removed URLs in edits are badly encoded
  • [MBS-8692] – Expired Catalyst sessions remain (partially) in Redis
  • [MBS-8698] – Content negotiation for JSON-LD representation does not work with multiple MIME types in Accept header

Improvement

  • [MBS-6407] – Add username to our verification mails
  • [MBS-8683] – Display internationalized domain names in decoded form
  • [MBS-8709] – Mark up removed entities as usual in add medium/edit medium edits
  • [MBS-8713] – Block SoundCloud search and tags URLs

One month of Google Code-in

Today marks one month since the Google Code-in competition started, with 18 days to go until it ends. I wanted to take this opportunity to talk a bit about some of the things that have happened so far and where we’re at.

Since December 7th, when Google Code-in started, we have been in touch with 107 students on the Google Code-in site, of which 70 have completed at least one task and thus earned a digital certificate from Google. 11 students have so far earned themselves a t-shirt from Google by completing 3 or more tasks. The student with the highest number of completed tasks right now sits at 17 tasks, followed by one at 16 and another at 15 completed tasks. The student with the 10th most tasks completed has 3 tasks to their name.

Stanisław Szcześniak, GCI student from Poland, presenting about MusicBrainz.

We have had 7 students give presentations on MusicBrainz in at least India, Romania, England, and Poland; about 50 reviews written for CritiqueBrainz, with a few more in progress; a couple of MusicBrainz how-tos written for the wiki; one video tutorial made (which hasn’t been uploaded yet); a bunch of tests written for BookBrainz; a bunch of icons/logos updated or newly made in various places; and a bunch of code patches and tests written for almost all our projects, as well as for beets (a third-party music file tagger and organiser that makes heavy use of MB data).

We have also had to report 3 students for plagiarising, leading to their disqualification. :( However, compared to the amount of work and number of students, I think it’s a decently small number.

Overall, I am (still!) really excited about MetaBrainz finally being a part of Google Code-in, and I definitely think the lack of sleep the first week and newbie questions on IRC and on the GCI tasks are worth it. We’re getting some great stuff done, that we may not have gotten around to in any reasonable time ourselves, and we get to help all these students learn about programming, open source, open data, licenses, and a bunch of other things. I’m happy and I’m not looking forward to picking only 5 finalists and only 2 winners. There are definitely more than that I would personally like to see in both categories. :)

Have you had any experiences with or thoughts on our Google Code-in participation so far? Please do share them with us in the comments!

Server update, 2015-12-14

  • The edit listings for your subscribed entities (and subscribed editors) now show all edits entered within the past 7 days by default (so auto-edits are now visible, plus edits that may have passed by vote before you saw them). There’s a new toggle-able option on the page to only display open edits, to restore the previous behavior while voting. We’ll hopefully introduce some way to mark closed edits as reviewed/dismissed soon, to make the listings easier to digest.
  • Edit histories for large collections should hopefully load a bit faster now. Please comment on MBS-8368 if you have a collection that still consistently times out when viewing its edit history.

We’ve also made a slight change to our release process. Previously, our master branch was our “stable” branch which pointed to the most recent release (i.e., the code we run on musicbrainz.org itself). If this is what you expect, you should now be using the production branch instead. The master branch has become our main development branch, which means it may be slightly less stable from now on (of course, we’ll do our best to avoid that).

Thanks to Ulrich Klauer (chirlu) for his contributions to today’s release.

The git tag is v-2015-12-14 and the complete changelog is below.

Bug

  • [MBS-7012] – Join phrase cleanup not running
  • [MBS-8145] – Internal server error loading collection edits
  • [MBS-8368] – Edit queries for large collections time out
  • [MBS-8647] – “Edits for subscribed entites” doesn’t show auto-edits
  • [MBS-8660] – Editors can’t delete their location
  • [MBS-8661] – Adding and editing non-ended areas is broken

New Feature

  • [MBS-8664] – Show “last login/active date” somewhere on /user/ for account administrators