Monthly Archives: March 2016

Server update, 2016-03-21

The most noticeable change in this release is that we’re using Wikidata links to fetch images for any entity that has them. Thanks to Roman Tsukanov for working on this cool new feature! We’ve also made the header menus require clicks to open and close, and fixed several bugs listed below. The git tag is v-2016-03-21.

Bug

  • [MBS-7914] – alias list not included in track level artist credits when fetching release information
  • [MBS-8837] – Event edits fail to apply with ERROR: invalid input syntax for type time: “1970-01-01T19:00:00”
  • [MBS-8848] – Own private collections ignored on entity pages
  • [MBS-8858] – Edit medium edit stuck trying to use a deleted recording
  • [MBS-8862] – Search edits: release group condition is broken

Improvement

  • [MBS-6381] – Use Wikidata URLs to fetch images for entities
  • [MBS-8843] – Require clicking on the header menus to open them

New Feature

  • [MBS-6152] – Provide a better way to list private collections in /ws/2/release output

Task

  • [MBS-8805] – Drop the completely unused pre-NGS tables on totoro

6 degrees of Vince Gill

I’m not sure that we’ve talked about this cool project yet, so I’ll catch up on that now. The new site Six Degrees of Vince Gill lets you enter an artist name and see how many degrees of separation lie between that artist and Vince Gill. This project comes from Universal Music’s Nashville group. I’m happy to see our data get used in interesting ways like this!

Now, if you want to see someone relate to Vince Gill in seven degrees, have a look at how I relate to him. 🙂
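For the curious, “degrees of separation” is usually computed as the shortest path between two nodes in a relationship graph. The sketch below is only an illustration of that general idea using a breadth-first search; it is not how the Six Degrees of Vince Gill site is actually implemented, and the graph and its placeholder names are made up for the example.

```python
from collections import deque


def degrees_of_separation(graph, start, target):
    """Return the number of links on the shortest path from start to target.

    Returns None when no path exists. `graph` maps each node to the nodes
    it is directly related to.
    """
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return distance + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


# Hypothetical toy graph with placeholder names; not real MusicBrainz data.
toy_graph = {
    "artist_a": ["artist_b"],
    "artist_b": ["artist_a", "artist_c"],
    "artist_c": ["artist_b", "target_artist"],
    "target_artist": ["artist_c"],
    "unconnected_artist": [],
}

print(degrees_of_separation(toy_graph, "artist_a", "target_artist"))
```

Breadth-first search visits artists in order of distance, so the first time the target is reached the distance is guaranteed to be the smallest one.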

May 2016 schema change release details

In about two months’ time we’ll have the next schema change release: May 16, 2016. Even though we skipped the fall schema change release, this release is going to have few changes that will impact our downstream users. Most of the tickets in this release will make minor improvements to database indexes and edit tables. If you are one of the few users of our edit data, then you should delve deeper into the list of tickets in this release. For everyone else, I will summarize the tickets with a greater impact.

In a previous blog post we also talked about upgrading the minimum required version of PostgreSQL. We received no real feedback asking us to upgrade to 9.4, but some people told us they would prefer 9.5, which is our preference as well. Based on that feedback, we’re going to make PostgreSQL 9.5 the minimum required version. If you’d like to run a MusicBrainz replicated instance via our Live Data Feed, you will need to run PostgreSQL 9.5!
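If you maintain a replicated instance and want a quick pre-upgrade check, something like the following sketch may help. It assumes you have already fetched the string returned by PostgreSQL’s `SELECT version()` by whatever means you prefer; the function names and the `MIN_VERSION` constant are made up for this example and are not part of MusicBrainz Server.

```python
import re

# Minimum version announced in this post (illustrative constant).
MIN_VERSION = (9, 5)


def parse_pg_version(version_string):
    """Extract (major, minor) from a string such as the output of SELECT version()."""
    match = re.search(r"PostgreSQL (\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"not a PostgreSQL version string: {version_string!r}")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return (major, minor)


def meets_minimum(version_string, minimum=MIN_VERSION):
    """Return True when the server version is at least `minimum`."""
    return parse_pg_version(version_string) >= minimum


print(meets_minimum("PostgreSQL 9.5.1 on x86_64-pc-linux-gnu"))
print(meets_minimum("PostgreSQL 9.3.11 on x86_64-pc-linux-gnu"))
```

Comparing tuples rather than strings avoids the classic mistake of treating "9.10" as smaller than "9.5".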

The official minimum supported Ubuntu release as of now is still Ubuntu 10.04 LTS (Lucid Lynx) which reached end-of-life a year ago. We will upgrade that to Ubuntu 14.04 LTS (Trusty Tahr) at the schema change release. In particular, this means that we might start using Perl 5.18 features in the MusicBrainz Server code (as opposed to Perl 5.10 currently).

We understand that this is potentially a lot of work for some of our users, but occasionally we need to upgrade our requirements. We try to limit these sorts of upgrades as much as possible, so please bear with us.

Finally, on to the details of the release. Please take a look at the list of issues that will be addressed in this release. The few tickets worth discussing in detail are:

  • MBS-8838 – “Add gids to all *_type tables”. This ticket adds MBIDs (GIDs in schema lingo) to all of our tables that define a type for some database element. Given that we recommend that external users never reference our data by row ids, we really need to provide proper permanent MBIDs to all elements of our database.
  • MBS-6024 – “Support more than one barcode on same release (SQL edition)”. This ticket adds the ability for the database to contain more than one barcode for a given release. However, this ticket does not include the user interface portions of this feature. The team will add the user interface/edit portions of this feature in a later, non schema change release.
  • MBS-4501 – “Alternative tracklists”. This ticket creates a new feature that would allow an alternative tracklist to be used for a given release. This is a better solution for handling conflicts between our style guidelines and how the data appears on the release. It is also a more elegant solution for translations of releases into different languages.

As usual, we will post final details about the release shortly before the release happens. If you have any questions about this release, feel free to ask specific questions in the tickets or general questions in the comments below.

(Edited 2016-03-16 at 12:55 UTC to add the upgraded Ubuntu requirement.)

Wary of the Web Sheriff

On September 29th, we received an email from a company called Web Sheriff, urgently requesting us to remove the name Adetayo Ayowale Onile-Ere from the Taio Cruz artist page. We investigated this request and quickly found that there were other resources on the net referencing both names. We also found other evidence of Web Sheriff working to change the name of this artist in other venues/sites.

One of the cornerstones of our music database, and our principles, is to keep the data as clean and accurate as possible. We aim to edit with due diligence.

We declined to make the change, and made the following request: If you can provide us with a birth certificate that shows Taio Cruz as the birth name, we’d be happy to make the change. Web Sheriff agreed to do that and on October 13th we received a copy of this document:

JTC - Birth Certificate

 

I inspected the document and quickly felt something was amiss. The document purports to be a Birth Certificate from the Chelsea and Westminster Hospital in London, UK and the father’s occupation is listed as “Lawyer”. While I am not a legal expert, my understanding is that the term for someone who practices law in the UK is not lawyer, but rather attorney, barrister, or solicitor. The use of “Lawyer” in this context seemed strange to me.

With this observation as my motivation, I rang up Her Majesty’s government to ask how I would go about verifying the validity of a birth certificate. I was told that the UK government could not verify the authenticity of a certificate, but that I could request a copy of the certificate myself since they are public record. For a fee of 9 pounds 25p and two weeks’ time I would receive a copy of the certificate in email. I thought that was a good way forward and I asked to order a copy. The lovely lady who helped me painstakingly endeavoured to ensure that all the details related to my request were communicated correctly. I have no doubt that I relayed the data I had accurately.

And with that, I waited. On November 8th, I received mail from Her Majesty’s government informing me that no such birth certificate could be found and that my payment will soon be refunded.

This strongly suggests that the document provided to us by Web Sheriff was not a legitimate copy of a birth certificate for Jacob Taio Cruz.

Being coerced to make changes to facts in our database based on false pretences is something I find despicable. We felt harassed and intimidated by the efforts of Web Sheriff, pressuring us to make this change.

I hope that this blog post will remind others to be wary and vigilant, and not give in easily to coercion. This is what I intend to do with Web Sheriff, and others, going forward and I hope you will do the same!

For more info on Web Sheriff see their Wikipedia page.

 

UPDATE (March 9, 2016): This post has been edited for clarity. These changes follow a specific request from Web Sheriff that we delete or amend the post:

we must insist that you withdraw your false allegation of forgery

UPDATE: Since posting this, sharp readers spotted more problems with the certificate:

  1. There is no “county of Westminster”. There is a city of Westminster.
  2. The hospital name is missing a “t”: Wesminster.
  3. This hospital may not have been called that in 1985. It seems to have existed in that form since about 1993.

Server update, 2016-03-07

This release fixes the web service to support all the new entity types that can be added to collections as of the last release, including support for submission, and introduces a few new browse requests (browsing collections by entity gid or editor name; browsing entities by collection gid). These are documented at https://musicbrainz.org/doc/Development/XML_Web_Service/Version_2.
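To give a rough idea of what the new browse requests look like, here is a small sketch that only builds the request URLs (it performs no network access). The parameter names (`editor`, `release`, `collection`) reflect my reading of the documentation linked above, so please verify them there; the `browse_url` helper, the editor name, and the all-zero MBID are placeholders invented for this example.

```python
from urllib.parse import urlencode

WS_ROOT = "https://musicbrainz.org/ws/2"

# Placeholder MBID used purely for illustration; it does not identify a real entity.
PLACEHOLDER_MBID = "00000000-0000-0000-0000-000000000000"


def browse_url(resource, params):
    """Build a /ws/2 browse URL for `resource` from a dict of query parameters."""
    return f"{WS_ROOT}/{resource}?{urlencode(params)}"


# Collections belonging to a given editor (hypothetical editor name).
print(browse_url("collection", {"editor": "example_editor"}))

# Collections that contain a given release.
print(browse_url("collection", {"release": PLACEHOLDER_MBID}))

# Releases contained in a given collection.
print(browse_url("release", {"collection": PLACEHOLDER_MBID}))
```

Private collections will presumably only show up for authenticated requests, so a real client would also need to handle authentication, which this sketch leaves out.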

Area pages now have a “Users” tab where you can see other editors from that area, provided they’ve filled out that info in their profile.

A “single sign-on” endpoint has been added for logging into our new community forum, to replace the previous OAuth2 method. The main benefit over OAuth2 is that it now keeps people’s usernames and email addresses in sync with their MusicBrainz account.
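For readers wondering how this kind of single sign-on works, the sketch below follows the generic scheme that Discourse documents for SSO providers, as I understand it: the forum sends a Base64-encoded payload plus an HMAC-SHA256 signature, and the provider answers with a signed payload of its own that carries the username and email address. This is not the MusicBrainz Server implementation; the function names, the shared secret, and the user record are all invented for illustration.

```python
import base64
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode


def sign(payload_b64, secret):
    """Hex HMAC-SHA256 of the Base64 payload, keyed with the shared secret."""
    return hmac.new(secret.encode(), payload_b64.encode(), hashlib.sha256).hexdigest()


def verify_request(sso, sig, secret):
    """Check the forum's signature and return the decoded payload fields."""
    if not hmac.compare_digest(sign(sso, secret), sig):
        raise ValueError("signature mismatch")
    decoded = base64.b64decode(sso).decode()
    return {key: values[0] for key, values in parse_qs(decoded).items()}


def build_response(nonce, user, secret):
    """Return (payload, signature) to send back to the forum for a logged-in user."""
    fields = {
        "nonce": nonce,  # echoed back so the forum can match the response to its request
        "external_id": user["id"],
        "username": user["name"],
        "email": user["email"],
    }
    payload_b64 = base64.b64encode(urlencode(fields).encode()).decode()
    return payload_b64, sign(payload_b64, secret)


# Hypothetical values for demonstration only.
secret = "hypothetical-shared-secret"
request_raw = "nonce=abc123&return_sso_url=https%3A%2F%2Fforum.example%2Fsession%2Fsso_login"
sso = base64.b64encode(request_raw.encode()).decode()

fields = verify_request(sso, sign(sso, secret), secret)
user = {"id": 42, "name": "example_editor", "email": "editor@example.org"}
payload, signature = build_response(fields["nonce"], user, secret)
print(fields["return_sso_url"], len(signature))
```

Because the provider sends the username and email on every login, the forum account can be refreshed each time, which is how the two stay in sync.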

Thanks to Roman Tsukanov (Gentlecat) and Frederik “Freso” S. Olesen for their work on today’s release. The git tag is v-2016-03-07 and the complete changelog is below.

Bug

  • [MBS-3125] – collection queries via the webservice are broken
  • [MBS-5323] – GET request for releases in a collection returns unordered results
  • [MBS-8459] – WS doesn’t have work collection endpoints
  • [MBS-8651] – collection add and delete endpoints
  • [MBS-8652] – PUT requests for invalid webservice collection URLs return the same content as a GET
  • [MBS-8825] – Links in the header force the default cursor via CSS

Improvement

  • [MBS-6120] – web service browse release where collection equals mbid

New Feature

  • [MBS-6152] – Provide a better way to list private collections in /ws/2/release output
  • [MBS-6511] – Show tab for “editors” on an area page when logged in
  • [MBS-8839] – Implement Discourse SSO endpoint

Task

AcousticBrainz Update

It’s been over a year since we last posted about AcousticBrainz, but a lot of work has been going on in the background. This post will give an overview of some of the things that we’ve achieved in the last year.

Data contributions

Our last blog post was neatly titled “What do 650,000 audio files look like, anyway?” Back then, we thought that this was a lot of submissions. Little did we know… I’m glad to report that we now have over 3.5 million submissions, of which almost 2 million are for unique MBIDs. This is a great contribution and we’d like to thank everyone who submitted data to us.

Dataset and model building

MusicBrainz coder Gentlecat returned to participate in Google Summer of Code last year and developed a new tool to let us create datasets and create new computational models. We’re really excited about how this can allow community members to help us increase the quality of the semantic information we provide in AcousticBrainz. We will make another blog post soon explaining how it works.

We presented an academic overview of AcousticBrainz (PDF) at the 16th International Society for Music Information Retrieval (ISMIR) conference in Malaga, Spain. The feedback from the academic community was very encouraging. Many people were interested in the data and wanted to know what they could do with it. We hope that there will be some new projects announced using the data at this year’s conference.

Integration with other data sources

MusicBrainz and AcousticBrainz don’t exist in a vacuum. One important thing that we need to do is interact with other researchers and products in the same field. To that end, we started AcousticBrainz Labs, a showcase of some of the experiments that we’re working on in AcousticBrainz. The first thing we have published is a mapping between AcousticBrainz and the Million Song Dataset, which we hope people will use to compare these two datasets.

Database upgrades and Data format changes

We’ve just upgraded to PostgreSQL 9.5 (from 9.3), which allows us to use the new jsonb datatype introduced in PostgreSQL 9.4. This change lets us store feature data more efficiently. We also made some changes to the database schema to let us start creating new data from datasets and computation models.

One result of this is that we are creating a new complete data dump, and stopping the old incremental dumps. We are also taking the opportunity to automate this incremental dump process, which is something that a number of people have asked for.

Another change is that the format of the high level JSON data is changing. This is to better reflect some of the complexities that exist in hosting such a large and varied dataset.

Contribute to AcousticBrainz development

We’re always interested in help from other people to contribute data, code, and ideas to AcousticBrainz. Once again, MetaBrainz is participating in Google’s Summer of Code, and AcousticBrainz is a possible project to work on. If you’re not a student you’re still welcome to work with us.

Write to us in a comment, in IRC, or in our new Discourse category and say hi.