UPDATE: This is clearly going to be a major hassle, so we’ll spend the extra time coding a program that will sanitize the data before it goes into Splunk.
Last week Google’s Summer of Code program started, and my student Dániel Bali is ready to start combing through our massive logs to see what sorts of information he can mine from them.
We only have one minor problem: our logs contain the IP addresses of our users, and some requests contain the user name of the person making the request. Removing this private information from the logs before Dániel sees them is quite a pain to do well.
I would like to propose that we:
- Consider Dániel part of our core team for the summer and allow him to see IP addresses and all the requests in full.
- Have Dániel sign a short statement stating that he will not divulge any private information.
- Fail him in his GSoC project if he does divulge any private information.
If this is not acceptable to you, please speak up soon. I would like to make this happen early next week so Dániel can continue his GSoC work.
UPDATE: The final output of Dániel’s work will not contain any private information. If we end up using any private data as input, we will sanitize it and remove private information before we publish the output.
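To give an idea of what such a sanitizing step might look like, here is a minimal Python sketch. It is not the program we will actually run: the `user=`/`username=` query parameter, the `ip-` prefix and the salt handling are all assumptions made for illustration, and IPv6 addresses are not handled.

```python
import hashlib
import re

# Matches dotted-quad IPv4 addresses; IPv6 is not handled in this sketch.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Hypothetical: assumes user names show up as a "user=" or "username="
# query parameter. Real request formats may differ.
USER_RE = re.compile(r"\b(user(?:name)?=)[^&\s\"]+")


def pseudonymize_ip(ip: str, salt: str) -> str:
    """Replace an IP with a short salted hash so repeat visitors stay distinguishable."""
    digest = hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()
    return "ip-" + digest[:12]


def sanitize_line(line: str, salt: str) -> str:
    """Return a copy of one log line with IPs hashed and user names redacted."""
    line = IPV4_RE.sub(lambda m: pseudonymize_ip(m.group(0), salt), line)
    return USER_RE.sub(r"\1REDACTED", line)
```

Hashing with a secret salt, rather than deleting the address outright, keeps requests from the same visitor grouped together, which is useful for log mining while still hiding the address itself.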
Our forums were compromised a while ago and we had to undergo massive yak shaving in order to set up a new home for them. Hosting many different types of software on one server makes that server hard to administer, so we felt that the proper solution was to create a virtual server host and give each type of software its own Linux instance to live in. We started with the forums, but we’re going to be moving a lot more stuff over to this virtual host in the coming weeks/months. Hopefully we can update our blog and our wiki in this process.
In any case, the forums are back online, running the latest version of PunBB with all the old posts. In the move we lost our customized MusicBrainz theme for PunBB. If someone feels strongly about having the theme, please take a look at the PunBB docs and create a new theme. I’ll be glad to install that new theme on our server.
Thanks and sorry the forums were offline for so long!
We’ve just finished pushing an update to our web servers. This is our first release since the schema change, and we’ve tried to address the problem of artist landing pages. As a temporary solution, we’ve split the page up by type a bit more, which we hope is a step in the right direction. We’re currently discussing this at User:Reosarevok/Overview Options on the wiki, and your feedback is important. If you feel strongly about this, please read through the ideas on that page and feel free to comment or add your own.
This release features work from Ian McEwen, nikki, Joachim LeBlanc, Nicolás Tamargo and the rest of the MusicBrainz team. Thanks everyone for your hard work!
- [MBS-1121] – Disabled submit buttons have no distinctive style
- [MBS-3861] – IMDb links not fully normalized
- [MBS-4278] – Copy changes to recordings edits the wrong recordings if recording associations were changed
- [MBS-4336] – Release editor > removing a track resets manually changed track positions
- [MBS-4520] – Deleting tracks does not update track numbers
- [MBS-4569] – Full page titles aren’t shown in /doc
- [MBS-4621] – Don’t use empty <img> tags on pages with no images
- [MBS-4649] – YouTube channel autoselect is broken
- [MBS-4656] – “Video” option for “can be streamed for free at” relationship is listed under the “License” subheading.
- [MBS-4662] – ISE: Can’t edit a relationship attribute
- [MBS-4664] – Use of uninitialized value in sprintf at lib/DBDefs.pm line 314.
- [MBS-4681] – Edit marked as Applied, but it isn’t
- [MBS-4687] – Link to create new relationship types doesn’t work
- [MBS-4697] – Approve votes missing from edit search
- [MBS-4720] – Audiobook is a primary release-group type
- [MBS-4733] – Merging release groups fails if release groups have secondary types
- [MBS-4752] – Musicbrainz webservice <iswc-list> changes break compatibility with existing applications.
- [MBS-4757] – Edit artist alias (sort name) is auto-edit
- [MBS-4760] – Release groups with secondary types cannot be deleted
- [MBS-4765] – DB_SCHEMA_SEQUENCE hasn’t been updated to 15 in DBDefs.pm.default
- [MBS-4767] – Cannot accept edit release group edits that change the primary type to a type that no longer exists
- [MBS-4769] – Bad description row in statistic_event
- [MBS-4770] – ISE: Error when requesting non-existent relationship types on /relationships
- [MBS-4772] – ModBot cannot apply old edit work edits that add ISWCs
- [MBS-684] – TOC lookup displays too little release info
- [MBS-1874] – Search for the documentation
- [MBS-3748] – Adding new instruments is a pain
- [MBS-4298] – Confusing text when merging releases+recordings
- [MBS-4561] – Make disambiguation on tracklist credits smaller
- [MBS-4568] – Add <bdi> tags to help with rendering of RTL text
- [MBS-4657] – Add cover art to timeline
- [MBS-4700] – Fix inline buttons
- [MBS-4742] – Add a mention to the “Split Into Separate Artists” page that aliases prevent a split (or automatic removal)
- [MBS-4762] – Cover art statistics should be displayed on the tabular pages.
- [MBS-4686] – Add wikisource.org to the lyrics whitelist
In other news, Oliver will be on holiday for one week (yipee!). He is reachable via email if need be.
The commit sha for this release is
, a Git tag will follow when Rob is back tomorrow. and the git tag is
Yesterday we found a bug that prevents the import of a data set created after the schema change. We’ve pushed out a fix for this and tagged it with:
If you’re planning on importing a new data set, make sure to check out this tag, rather than the tag mentioned in this entry.
In case you haven’t gotten enough of release announcements, we have another one for you. Yesterday, alongside the main release, we also released a new search server to match the main server release. Many thanks to Paul Taylor for timing this release perfectly!
UPDATE: The search server and the MMD schema repositories have been tagged with this tag:
- [SEARCH-198] – The artist is getting a lowered score on MBS
- [SEARCH-199] – Search includes empty annotations
- [SEARCH-200] – Search on release giving to much boost to matches on CatalogNo
- [SEARCH-201] – explain option doesnt work if search results contain non ISO-8859-1 characters
- [SEARCH-216] – Null pointer exception when building freedb
- [SEARCH-157] – Be able to search for a track by its metadata OR its puid
- [SEARCH-186] – Search Server has hard coded redirect URL
- [SEARCH-187] – Update Junit Test from 3 to 4
- [SEARCH-202] – Allow searching for RGs based on their releases’ status
- [SEARCH-204] – Upgrade codebase to Lucene 3.6
- [SEARCH-214] – Add release group ID to the web service indexed search results for recordings
- [SEARCH-205] – Search server should return multiple ISWCs for works
- [SEARCH-207] – Changes due to introduction of ISO-3 language code
- [SEARCH-208] – Changes due to Split release group attributes into two types Schema Change
- [SEARCH-209] – Support for Multiple IPI Artists
- [SEARCH-211] – Support for new Track ‘Number’ field in a track
- [SEARCH-212] – Add ability to index, display and search works by lyrics language as part of schema change
- [SEARCH-213] – Changes due to MBS-1385:Support unknown end dates
Regrettably, a couple of errors were found close to the release of v4.0.2 and v5.0.0. I have just released v4.0.3 and v5.0.1 with the following changes:
– Fix LMB-32 – Correctly ignore unrecognised nodes
– Don’t compile using -Werror when building from tarball
The releases are available:
(MD5 checksum: 19b43a543d338751e9dc524f6236892b)
(MD5 checksum: a0406b94c341c2b52ec0fe98f57cadf3)
Documentation for the new version is available under
Apologies to all for the need to make this release so soon after the last one.
Nearly one year after we released NGS, we have another schema change update with lots of new features!
This release contains 9 new features and improvements that take advantage of the new schema. These are:
- More social user profiles, which can now have Gravatars, languages (and the user’s proficiency in each), age and country.
- More expressive aliases for artists, labels and works. Aliases can now have types and sort names, and multiple aliases may be used per locale, along with the ability to mark one alias as ‘primary’ for that locale.
- Release group types have been separated into primary and secondary types. A release group now has 1 primary type and may have multiple secondary types. This allows us to have ‘remix compilation albums’, for example.
- Works may have multiple ISWCs
- Artists, labels and relationships may be marked as ‘ended’ to indicate that they have ended, but the exact date is not known
- Vinyl style/free text track numbers are now supported.
- Works may have a lyrics language associated with them
- Artists and labels may have multiple IPIs
- We have moved to use ISO 639-3 for our language table. While not all languages are exposed at the moment, this gives us a lot more flexibility going forward.
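As a rough illustration of the primary/secondary split for release group types, here is a small Python sketch. It is not our actual schema or server code: the class, its field names and the `describe` helper are hypothetical, and the type names are just examples.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ReleaseGroup:
    """Hypothetical model: exactly one primary type, any number of secondary types."""
    name: str
    primary_type: str
    secondary_types: Set[str] = field(default_factory=set)

    def describe(self) -> str:
        # Secondary types (sorted for a stable order) followed by the primary type.
        parts = sorted(self.secondary_types) + [self.primary_type]
        return " ".join(parts).lower()


# Example type names only; the group name is made up.
rg = ReleaseGroup("Some Collection", "Album", {"Remix", "Compilation"})
```

Keeping the secondary types in a set means a release group can combine several of them without us having to define a separate type for every combination.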
Many thanks to nikki for going way beyond our expectations for testing (and patience!); to Ian McEwen for his continued work on statistics; and to the MusicBrainz team for making this all happen.
If you have a replicated instance of MusicBrainz, please follow these instructions to get your server running on the new schema:
- Take down the web server running MusicBrainz, if you’re running a web server.
- Turn off cron jobs if you are automatically updating the database via cron jobs.
- Make sure your REPLICATION_TYPE setting is RT_SLAVE
- Switch to the new code with git fetch origin followed by git checkout v-2012-05-15-schema-change
- Run carton install --deployment. If you have not switched your installation to using carton, please read INSTALL.md on how to do this.
- Run carton exec -- ./upgrade.sh from the top of the source directory.
- Set DB_SCHEMA_SEQUENCE to 15 in lib/DBDefs.pm
- Turn cron jobs back on, if needed.
- Restart the MusicBrainz web server, if needed.
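If you would rather script the command-line part of the steps above, something along these lines could work. This is only a sketch: the function names and the dry-run default are our own invention for illustration, and it does not stop your web server, touch cron jobs or edit lib/DBDefs.pm for you.

```python
import shlex
import subprocess
from typing import List


def upgrade_commands(tag: str = "v-2012-05-15-schema-change") -> List[List[str]]:
    """The commands from the instructions above, in the order they should run."""
    return [
        ["git", "fetch", "origin"],
        ["git", "checkout", tag],
        ["carton", "install", "--deployment"],
        ["carton", "exec", "--", "./upgrade.sh"],
    ]


def run_upgrade(source_dir: str, dry_run: bool = True) -> List[str]:
    """Print (dry run) or execute each command from the top of the source directory."""
    handled = []
    for argv in upgrade_commands():
        printable = " ".join(shlex.quote(arg) for arg in argv)
        handled.append(printable)
        if dry_run:
            print("would run:", printable)
        else:
            # check=True stops at the first failing command instead of carrying on.
            subprocess.run(argv, cwd=source_dir, check=True)
    return handled
```

Calling `run_upgrade(".")` only prints the commands; pass `dry_run=False` once you are happy with what it would do.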
If you are running an mbslave mirror, check out the latest code and read the upgrade instructions in the README file.
- [MBS-3189] – Remove unused ref_count column and related functions
- [MBS-4616] – Add work language statistics
- [MBS-4629] – /cover-art page shows no collections
- [MBS-4637] – Timeline graph won’t graph anything without an entry in statistics/view.js
- [MBS-4640] – Clicking cover art opens box with “����” (4 U+FFFD)
- [MBS-4642] – Thickbox CSS interferes with MB CSS
- [MBS-4647] – Cover art page allows submitting edit with no cover art when JS is off
- [MBS-4648] – Changing cover art type from “other” to unset causes Internal Server Error
- [MBS-4678] – upgrade.sh is not ready for testing
- [MBS-4679] – Internal server error adding secondary types to a release
- [MBS-1485] – Alias types
- [MBS-1798] – Lyrics language for works
- [MBS-1799] – Add ISO 639-3 language codes to the database
- [MBS-1981] – Add blog feed to the home page
- [MBS-2240] – Aliases: certain locale can be used only once in the list of aliases
- [MBS-2532] – Allow more than one IPI per artist
- [MBS-2851] – Timeline graph events should be in the database
- [MBS-2885] – Allow more than one ISWC per work
- [MBS-3646] – Split release group attributes into two types
- [MBS-3788] – Alias improvements
- [MBS-4625] – Improve wording of cover art tab when cover art comes from relationships
- [MBS-4676] – Do not allow people entering deprecated relationships
- [MBS-842] – Allow vinyl style track numbers and sides
- [MBS-1385] – Support unknown end dates
- [MBS-3704] – Allow adding sort names to artist aliases
- [MBS-4337] – Make user profile more social: add (optional) fields avatar, gender, birth year, country