Author Archives: reosarevok

Classical Clean Up #4: Hyperion

Hi all!

We almost got Dvořák fully cleaned, with only a page and a half of hard-to-fix recordings from compilations left. Which honestly is a great result (especially since most people won’t care that much about those compilations).

That said, I thought we could do something different this time, and hopefully avoid albums with no info available at all 🙂 The best way of doing that is to focus on a label instead of a composer: ideally, a label that offers (almost) all of their booklets for free so that everything can be checked without needing to have a copy of the release. One of the best examples of this is hyperion (official site), a British label that puts out all sorts of interesting stuff from medieval music to contemporary classical. So this February we’ll clean up hyperion (Hyperion? the logo is lowercase, anyway!) releases 🙂

Tools

  • Our user loujin has made a nice dashboard that shows our current hyperion (and related label) releases, and the ones from their complete catalog list that we seem to be missing. It matches by barcode, so if we’re missing the barcode, the release will still appear on the “we’re missing this” list – make sure we really are missing it before adding a new one! 🙂
  • A hyperion website importer has been written by loujin specifically for this cleanup.
  • My own Classical Editor’s Toolbox, especially if you’re a relatively new editor. You’ll definitely want to install most of the userscripts mentioned there.
  • The label website, of course.
  • Discogs pages for hyperion and helios. Usually, the label page will be better than these, but some old releases (especially vinyls) might be on Discogs and not in the official catalog.
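To see why a release we already have can still show up on the dashboard's "missing" list, here is a minimal sketch of barcode-first matching. It is not loujin's actual code: the function name `split_missing`, the dictionary shape with `catno` and `barcode` keys, and the catalog numbers and barcodes in the sample data are all made up for illustration.

```python
def split_missing(catalog, musicbrainz):
    """Split catalog entries that have no barcode match into two lists.

    Both arguments are lists of dicts with "catno" and "barcode" keys
    (a hypothetical shape chosen for this sketch); "barcode" may be None.
    """
    known_barcodes = {r["barcode"] for r in musicbrainz if r["barcode"]}
    known_catnos = {r["catno"] for r in musicbrainz if r["catno"]}

    check_first = []       # no barcode match, but the catalog number is already in MB
    probably_missing = []  # neither the barcode nor the catalog number was found
    for entry in catalog:
        if entry["barcode"] in known_barcodes:
            continue  # matched by barcode: not reported at all
        if entry["catno"] in known_catnos:
            check_first.append(entry["catno"])
        else:
            probably_missing.append(entry["catno"])
    return check_first, probably_missing


# Made-up sample data: the second release exists in MB but has no barcode yet.
catalog = [
    {"catno": "CDA00001", "barcode": "0000000000011"},
    {"catno": "CDA00002", "barcode": "0000000000022"},
    {"catno": "CDA00003", "barcode": "0000000000033"},
]
musicbrainz = [
    {"catno": "CDA00001", "barcode": "0000000000011"},
    {"catno": "CDA00002", "barcode": None},
]
check_first, probably_missing = split_missing(catalog, musicbrainz)
```

Anything that ends up in `check_first` is the kind of release where adding the barcode to the existing entry is the right fix, rather than adding a new release.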

How to use the Hyperion website

The website has a lot of info! Here’s an introduction, though I’m sure I’m missing some stuff.

  • Choose the right label! In general, you can look at the catalog number: CDA = hyperion, CDH = helios (other sublabels and distributed labels are more obviously different).
  • For full booklets, click “Digital booklet (PDF)” under the cover art. It might not always be there, but I can remember almost no cases where it wasn’t 🙂 All the booklets include a request not to upload them elsewhere, so let’s respect that: please do not upload the full booklets to the Cover Art Archive. Keep in mind that when something has been re-released on Helios, the Helios booklet will also be linked from the old Hyperion version. It’s generally safe to follow this booklet, but of course if you know something was printed differently on the old tracklist you should keep it like it actually was 🙂
  • For a big cover image, click on the cover and then right-click + open image in new tab. These are ok to add to the Cover Art Archive: please do upload them! 🙂
  • For the release date (up to the month) see the box on the top right side of the page.
  • Barcodes are often not available on the release page itself for some reason, but you can get them from loujin’s list, from the full catalogue itself or by searching Amazon for the catalog number and looking at the back cover.
  • Hyperion often re-releases stuff on the budget sublabel Helios, or as part of collections. If you see “Superseded by CDH12345” (or whatever the catalog number) to the right of the cover under the title area, you’re in luck! You can fix two releases instead of one with just a bit more effort 😉 (if one of the two is missing, just fix the existing one, then create the missing one based on the now-fixed version). Remember, CDA = hyperion, CDH = helios.
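The catalog number rule of thumb above (CDA = hyperion, CDH = helios) can be written down as a tiny lookup. This is only a sketch: the function name `label_for_catalog_number` is made up, and it deliberately returns `None` for any prefix not mentioned in this post, since other sublabels and distributed labels need checking by hand.

```python
# Prefix-to-label mapping taken from this post; other sublabels are not covered.
LABEL_BY_PREFIX = {"CDA": "hyperion", "CDH": "helios"}


def label_for_catalog_number(catalog_number):
    """Return the label suggested by a catalog number's prefix, or None."""
    normalized = catalog_number.strip().upper()
    for prefix, label in LABEL_BY_PREFIX.items():
        if normalized.startswith(prefix):
            return label
    return None  # unknown prefix: check the release page instead of guessing
```

For example, the placeholder number from the “Superseded by CDH12345” note maps to helios, while a number with an unlisted prefix gives `None`.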

Other hints

  • Remember you have the full liner notes. This is very helpful when trying to identify works! If in doubt, check what the liner notes say. If that still doesn’t help (say, you have one of Dowland’s “A Fancy” and no idea which one that is) just leave it unlinked, don’t guess the work.
  • The website will sometimes be more specific than the booklet about which performers perform on which works or work parts. If the booklet is not too clear, see if specific performers are printed under the track title on the website’s tracklist.
  • Recording dates are usually more exact on the booklet than the right-side box. Even if you see “Recording details” there, check the booklet first 🙂 Old booklets might have only recording dates but no locations – recent ones seem to include both pieces of info almost always.
  • When choosing release artists, I’d suggest following the cover, not the website (if the website says Johannes Brahms and the cover Brahms, use just “Brahms”).
  • The hyperion website entries can be linked either with “purchase for X” or “discography page” relationships. I’d suggest at least “discography page” (with the purchase ones on top if desired), but just linking it is already good – that’s the quickest route to booklets, after all! 🙂

What to work on

  • Take either loujin’s dashboard or the actual label pages in MusicBrainz, look at releases and see what seems to need work. An easy start is releases that still have the performers in the title rather than the artist field 🙂 You can also look at the data quality column: anything with “unset”, “low” or “normal” should be missing stuff (if not, go and change data quality to high!).
  • Add missing releases: in loujin’s dashboard you can see releases that haven’t matched to MB by barcode or catalog number. Make a quick check in case they’re in MusicBrainz but missing the info, but most of them are simply missing and need to be added!
  • If you’ve added all the info from the booklet (including engineers, copyright info and whatnot) and added the covers, please set the release data quality to high from the sidebar. That way, other people can see that and not check the data again 🙂 If something is terribly entered and you don’t have time to fix it, feel free to set it to low quality to point the mess out to others!

As always, if you have any doubts or questions or you just want to ask the community to help with something, you can post under this 🙂

PS: Thanks to Chhavi Shrivastava for the banner!

Classical Clean Up #3: Dvořák

Your favourite time of the year is at hand! No, I don’t mean Christmas, I (obviously) mean the Classical Community Clean Up. Debussy went very well, Mahler was fantastic, and it’s time for a third! Come join us in paying a little special attention to classical masters!

This time around the community has chosen (probably) the world’s only composer who is also a keyboard layout, the titan of Czech music Antonín Dvořák. We encourage you during this time to not only help the community clean up Dvořák’s metadata, but to learn more about Dvořák as well.

The clean up events officially last one month (but can be continued until they’re complete!) and are meant to utilize our community’s power to clean up our classical metadata. If you are new to MusicBrainz, to classical editing, or both, we have a whole tool box and plenty of advice, tips and tricks to share. We advise you bookmark the tool box—it’s quite helpful! Our team of classical music enthusiasts will also provide plenty of support on our forums, so come join us!

What we will work on:

  • Review the existing works to make sure there are no duplicates and the information looks correct, and add any missing works (keep in mind that while it is perfectly ok to add lost works, it’d be good to specify they’re lost so that people don’t accidentally use them on recordings).
  • Check the release list for anything that doesn’t follow the classical guidelines. Not only should those releases be fixed, but a release that ignores the guidelines is also a good sign that its recording and relationship info is incomplete.
  • Check the recording list. The only recordings that should be here by the end of the cleanup are of Dvořák himself as a performer (probably none, and in any case very few). Anything else being here should have performer relationships added to it if missing, then the artist credits for the recording should be changed to list the main performers (you can use the relevant script for that). Try to fix the whole release the recording is on, even if it’s not all by Dvořák! But in the case of a very large compilation, it’s always acceptable to fix only the Dvořák content on it.
  • Add missing Dvořák recordings! If you have enough info to add a Dvořák release we’re missing, that’s always useful. Just make sure to try to add as much info as possible from the get go, so we don’t have to clean that addition up as well. 🙂

We recently had 2995 recordings, 781 works and 862 releases under Dvořák, and we’re expecting to have many fewer wrongly listed recordings and many more Dvořák releases by the end of the month. Don’t know where to begin? Join us and ask, let us help you find a jumping in point! Here’s to another great month of Classical Clean Up with Dvořák!

By the way, you can get the above poster and a wallpaper version courtesy of Chhavi, in case you feel like having Dvořák himself staring at you will motivate you further! 😉

Classical Clean Up #2: Mahler. The Conclusions!

As we published at the start of October, during the last month we’ve been trying to clean up our data for Gustav Mahler. October is over now, and you might be wondering how that went. Well, no need to wonder anymore, because our users have done a fantastic job not just of cleaning Mahler’s data up, but of showing us how clean it is!

Our editor stupidname took statistical snaps at the start, the midpoint and the end of the project:

                 Oct 1st   Oct 18th         Nov 2nd
Recordings       2361      66 (-2295)       11 (-2350)
Tracks           11866     14094 (+2228)    15228 (+3362)
Releases         924       1192 (+268)      1363 (+439)
Release Groups   720       871 (+151)       986 (+266)

As we can see, the existing recordings were mostly cleaned up 18 days in, but a lot of new releases kept being added up until the end of the month.
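The bracketed figures in the table above are differences from the Oct 1st snapshot, not from the previous column. A short sketch of that arithmetic, using the numbers from the table (the names `snapshots` and `changes_since_start` are just for illustration):

```python
# (Oct 1st, Oct 18th, Nov 2nd) counts copied from the table above.
snapshots = {
    "Recordings": (2361, 66, 11),
    "Tracks": (11866, 14094, 15228),
    "Releases": (924, 1192, 1363),
    "Release Groups": (720, 871, 986),
}


def changes_since_start(counts):
    """Return each later snapshot minus the first one."""
    start = counts[0]
    return [value - start for value in counts[1:]]


for name, counts in snapshots.items():
    print(name, changes_since_start(counts))
```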

Additionally, stupidname also checked the number of recordings for some of the main works by Mahler to see the changes over time (specifically, due to the way our works… err.. work, the data is for one movement of each work rather than the main work itself):

                                  Oct 1st   Oct 18th     Nov 2nd
Symphony no. 1                    95        115 (+20)    120 (+25)
Symphony no. 2                    114       145 (+31)    149 (+35)
Symphony no. 3                    108       141 (+33)    144 (+36)
Symphony no. 4                    68        82 (+14)     85 (+17)
Symphony no. 5                    92        93 (+1)      98 (+6)
Symphony no. 6                    65        74 (+9)      87 (+11)
Symphony no. 7                    76        86 (+10)     96 (+20)
Symphony no. 8                    89        108 (+19)    106 (+17)
Symphony no. 9                    125       141 (+16)    176 (+51)
Das Lied von der Erde             47        53 (+6)      55 (+8)
Kindertotenlieder                 41        52 (+11)     62 (+21)
Lieder eines fahrenden Gesellen   54        63 (+9)      68 (+14)

This data is a bit less precise, because some of these recordings are partial (and the specific organization of Symphony no. 8 makes it especially tricky to count), but it is still a very nice view of how we’ve gotten extra recordings of basically everything!

Our editor loujin made graphs with the number of edits per editor during the cleanup. There are too many editors for the legend to show them all, but the graph shows that the two biggest contributors by far were ListMyCDs.com (green) and stupidname (light blue), with a bunch of other editors making several hundred edits as well.

And finally, also thanks to loujin, you can see how the cleanup affected the number of edits done on Mahler (no prizes for guessing which bar it is!):

Thanks to all this hard work, our entry on Mahler should be a particularly good example of the amount and quality of classical data you can get from MusicBrainz, and an inspiration for other composer pages! Thanks so much to everyone, and we’ll be back with more in December!

Mahler is impressed

Classical Community Cleanup #1: Debussy

The MetaBrainz Classical Music Enthusiasts Team is off to a strong start! If you are unaware of its formation and the tasks at hand, you can read more about it on the forums.

It’s clear from the number of discussions and engagements in the forum that a community effort on classical music was long overdue! It’s thrilling and we are eager for the first mission: after some discussion and voting we decided that the first community effort would be a clean-up of all our data for Claude Debussy.

As a composer with a huge influence on 20th century music, yet with a relatively small number of hard-to-edit compositions like operas, Debussy is a great first choice for the community of classical editors to start actively working together to improve the data. As such, if you’d like to help out, but are new to classical editing or not too active in the community yet, don’t hesitate to reach out and ask any questions. The classical community is active in its own forum category, and we’re hoping to see a lot of activity there with editors both asking and answering questions.

What will we be working on in this first classical cleanup project?

  • We will review the existing works and catalogues to make sure there are no duplicates and the info looks correct (several very active classical editors have already been working on this in preparation for this cleanup).
  • We will check the release list for anything that doesn’t follow the classical guidelines. Those should of course be fixed to follow the guidelines, and that’s usually a good sign of the recording and relationship info being incomplete as well.
  • We will work on the recording list. The only recordings that should be there by the end of the cleanup are of Debussy himself as a performer. Anything else currently there should have performer relationships added to it if missing, then the artist credits for the recording should be changed to list the main performers.
  • And we will add missing Debussy recordings! If you have enough info to add a release we’re missing that includes works by Debussy, that’s always useful. Just make sure to try to add as much info as possible from the get go, so we don’t have to clean that addition up as well!

Don’t know where to begin? Let us know and we can help find a starting point–or just jump in and help out! We can’t wait for Mr. Debussy to be a great example of how much information MusicBrainz can provide!

Simplification of the featured artist guideline

Hi everyone!

In the past couple years, we’ve steadily reduced the amount of standardisation we do, instead preferring to stick closer to the original release information (except for obvious errors) and let the end users do any standardisation as desired. The main rationale for this is that it’s generally easy to standardise automatically, but impossible to get the original information back: as such, standardisation is best done on demand over the original data. The last big step of this process is to simplify and soften the very restrictive featured artist guideline.

The guideline was originally written when we could only store one artist per track/release, as “add (feat. X, Y & Z) to the track title”. This allowed us to enter the featured artist information on the track title in a standard way in order to deal with it better in the future. And that worked fantastically: in fact, it allowed us to write a series of reports (for release groups, releases and recordings) that make it easier to find entities where the artists need to be moved to their rightful place in the artist field, now that we can!

When we started migrating the info to artist credits, we decided to basically keep the guideline as-is, but start adding the artists to the artist field. So we’d have “Song”, by “Artist feat. X, Y & Z”. This was a huge improvement, since the artists were now properly linked, but such strong standardisation was always a bit heavy-handed, and at the same time quite restricted (since it only applied to variations on the word “featuring”, but not to any other linking phrase).

As such, we’ve decided to stop standardising credits using variations of “featuring” as well. The new version of the guideline can be found as part of the artist credit guidelines, and reads:

Featured artists should always be entered in the artist credit, not in the titles. You should generally enter the credit as it appears on the release, omitting any separators (like parentheses) that are intended to separate it from the track title. For example, if the tracklist has “Artist 1 – Song Name (featuring Artist 2)”, enter “Song Name”, by “Artist 1 featuring Artist 2”.

It also gives a few examples:

This decision was taken a few weeks ago, but to avoid disruption we waited until we had a Picard plugin available for anyone who still wants to standardise “featuring” in their own tags. That’s now available (thanks to Sambhav Kothari), so we feel it’s the time to make the change official. If you want to keep all variations of “featuring” as “feat.”, go ahead and install that plugin, and things should remain basically as they were!
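The two halves of this change, entering the credit as printed and standardising only on demand, can be sketched in a few lines. This is a rough illustration only: it is not the Picard plugin's code, the function names are made up, and the list of “featuring” variants it recognises (“featuring”, “feat.” and “ft.”) is an assumption rather than an official list.

```python
import re

# A trailing "(featuring X)" or "[feat. X]" in a title. The variants matched
# here are an assumption for this sketch.
FEAT_IN_TITLE = re.compile(
    r"^(?P<title>.+?)\s*[\(\[](?P<join>featuring|feat\.|ft\.)\s+(?P<guests>.+?)[\)\]]\s*$",
    re.IGNORECASE,
)
# The same variants as a joining word inside an artist credit.
FEAT_WORD = re.compile(r"\b(?:featuring|feat\.|ft\.)(?=\s)", re.IGNORECASE)


def move_featured_artists(artist, title):
    """Move a trailing "(featuring X)" from the title into the artist credit.

    The joining word is kept exactly as printed, as the guideline asks.
    """
    match = FEAT_IN_TITLE.match(title)
    if match is None:
        return artist, title
    credit = f"{artist} {match['join']} {match['guests']}"
    return credit, match["title"]


def standardise_feat(credit):
    """Optional, on-demand step: rewrite the recognised variants as "feat."."""
    return FEAT_WORD.sub("feat.", credit)


credit, title = move_featured_artists("Artist 1", "Song Name (featuring Artist 2)")
```

Keeping the two steps separate mirrors the reasoning above: the stored credit stays close to the release, and anyone who prefers “feat.” everywhere can apply `standardise_feat` to their own tags without losing the original wording in the database.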

One last note: I strongly encourage people not to vote against any edit moving featured artist information from the titles to the artist field, even if it doesn’t exactly follow the wording used on the cover. As always, any clear improvements like this one should be accepted, even if they’re not perfect, and the data can be improved further afterwards.

Style update, 2015-09-08

Hi everyone! Here we are with another very, very late style update for the last couple months.

Apart from quite a few smaller changes (full list below), we split the translator relationship so that it’s its own relationship rather than lyricist + attribute (so if any of you were using translator data at all you’ll want to change the way you query for it). Similarly, we got rid of the “transliterated” vs. “translated” difference for alternate tracklists: it often wasn’t clearly one or the other, and the information wasn’t particularly useful in any case without checking the language and script of the release, so now there’s only one relationship type without attributes.

We also demoted the “Do Not Cluster” guideline – while it’s a good thing to keep in mind when creating relationship types, it’s not really something users should be worrying about (and the cases where they might have to are already covered in the specific relationship documentation).

If anyone has any question about these or any other changes, feel free to ask in the comments! And if you want to propose other changes or additions, remember you can always do it from the STYLE section of our bug tracker.

Improvement

  • [STYLE-315] – Remove the option to add publisher relationships to recordings
  • [STYLE-474] – Introduce the option to add phono rights relationships to recordings

New Feature

  • [STYLE-481] – Add artist-release group “has dedication” relationship
  • [STYLE-505] – Add “written at” relationship between work and place/area
  • [STYLE-538] – “Arranged at” Place relationship
  • [STYLE-542] – Allow soundcloud/mixcloud etc. links on event series (festivals)
  • [STYLE-548] – Event – Release Group rel: Performed

Task

  • [STYLE-392] – Make “Do not cluster” more sensible
  • [STYLE-420] – Drop “transliterated” attribute from the “transliterated/translated tracklist” relationship type
  • [STYLE-435] – Split “Translator” into its own relationship
  • [STYLE-460] – Revisit the [dialogue] guideline for NGS
  • [STYLE-482] – Deprecate the release-URL samples IMDb entry relationship type
  • [STYLE-494] – Add dorian keys to the list of keys for works
  • [STYLE-520] – Clarify the soundtrack guidelines stance on VA usage
  • [STYLE-528] – Add Turkish Makam work attributes
  • [STYLE-539] – Specify that tribute albums are cover albums

Style update, 2015-06-02

Ok, so the “have a report every two weeks” thing didn’t work out very well lately. But things have been happening anyway, so here’s all that has happened in the last few months!

Guidelines for artists have been updated for what is and is not a different artist (since now relationships can also have credits) and for areas (main artist area is still somewhat fuzzy because it is a fuzzy concept, but at least there’s something now).

Also new (although most parts were just moved into it from existing guidelines where they didn’t really fit) are the guidelines for artist credits.

The English guidelines got cleaned up a bit, but without major changes (only the addition of a section on “O’Clock”).

The alias guidelines got updated to take into consideration the “primary for locale” option, and got a section for sort names (based on the old guidelines for label sort names, which were removed since labels themselves no longer have sort names).

Some indications on barcodes were added to the release guidelines.

The ability for work-work relationships (like “part of” and “version of”) to have dates was removed. Dates should be on the appropriate artist-work relationships, in most cases (like “arranger” and “translator”), and works which get new parts added / removed should be counted as different parent works to begin with.

“Different bootleg recordings of the same concert” was added to the list of things that should be in the same release group.

The guidelines mandating expanding abbreviations like “Vol.” and “Pt.” have been removed (the special exception of “feat.” has not changed). Titles should, in general, follow the release/track title. The only standardisation left is for series: see the series numbering guidelines.

Some basic work guidelines have been added, both for when to set a type and for when (and when not) to add a disambiguation comment.

Apart from that, a fair amount of relationships and release formats have been added and some sites whitelisted. See the full list below for details.

Bug

  • [STYLE-427] – Style/Artist hasn’t been updated for areas
  • [STYLE-524] – "Personal label" has the wrong cardinality

Improvement

  • [STYLE-149] – Update "Artists with multiple names"
  • [STYLE-203] – Add a "In homage to" relationship to works
  • [STYLE-228] – Allow grouping of multiple "identical" bootlegs w/ different titles
  • [STYLE-431] – Add "marketed by" release-label relationship
  • [STYLE-436] – Add castalbums.org to the Other Databases whitelist
  • [STYLE-447] – Add treble/boy soprano as a vocal type
  • [STYLE-453] – Add "Pathé disc" format
  • [STYLE-472] – Add operadis-opera-discography.org.uk to the Other Databases whitelist
  • [STYLE-487] – Remove honorary titles from artist names
  • [STYLE-492] – Add "printed in" release-area relationship
  • [STYLE-508] – Clarify the barcode field guideline when the scanned value differs from the numerical value
  • [STYLE-511] – Add SMDB to "Other Databases" whitelist
  • [STYLE-515] – Reorganise and clean up Style/English
  • [STYLE-522] – Create a basic Artist Credits style page
  • [STYLE-523] – Update Style/Aliases for "primary for locale"
  • [STYLE-527] – Work-work relationships shouldn’t allow dates

New Feature

  • [STYLE-437] – Add VHD as a media format
  • [STYLE-439] – Add Capacitance Electronic Disc (CED) as a media format
  • [STYLE-442] – Add classicalarchives.com to Other DBs whitelist
  • [STYLE-446] – Work-Event rel: Premiered at
  • [STYLE-450] – Add Copy Control CD (CCCD) as a media format
  • [STYLE-451] – Add Videogam.in to the Other Database whitelist
  • [STYLE-475] – Add artist-place "organist" relationship
  • [STYLE-479] – Add artist-artist teacher relationship
  • [STYLE-483] – Add "incidental music" work type
  • [STYLE-498] – URL whitelist request : mvdbase
  • [STYLE-512] – Add an artist-artist "composer in residence" relationship

Task

  • [STYLE-393] – Drop "EP" attribute from the "single from" relationship type
  • [STYLE-404] – Update series (volume number, etc) guidelines now that we have series
  • [STYLE-414] – Add Spirit of Rock to the Other Databases whitelist
  • [STYLE-441] – Add guideline which covers artists in work disambiguation comments
  • [STYLE-444] – Add a relationship type for linking recordings to their corresponding music videos
  • [STYLE-445] – Add a relationship type for artists featuring in videos
  • [STYLE-491] – Add guideline about where to set a work type
  • [STYLE-496] – Other DB: Traditional Tune Archive (tunearch.org)
  • [STYLE-497] – Other DB: FolkWiki; folkwiki.se
  • [STYLE-500] – Ability to link to bandsintown.com
  • [STYLE-501] – Add "has BookBrainz entry" relationship
  • [STYLE-513] – Add SHM-SACD as a medium format
  • [STYLE-514] – Clarify how "o’clock" should be capitalised in English
  • [STYLE-516] – Add a relationship for linking artists to their tours
  • [STYLE-517] – Update Theatre guidelines to remove link to deprecated Opera guidelines
  • [STYLE-518] – Approve MusixMatch as a lyrics source
  • [STYLE-521] – Label sortname guideline needs updating