Category Archives: Server

Schema change release, 2015-05-18 (including upgrade instructions)

Our previously mentioned schema change release is finished! Below will be upgrade instructions, including configuration updates for replication access tokens.

This release does not include UI for several of the schema change patches, which will (hopefully) happen for next release on June 1. The incomplete patches are MBS-7489 (credits for artists in relationships), MBS-4145 (tag upvote/downvote), and MBS-8004 (collections for additional entity types). These patches have had their schema change components finished, but the UI was incomplete or needed more work.

Schema Change Upgrade Instructions

These are largely as previous upgrade instructions, using the tag v-2015-05-18-schema-change. The primary difference is the inclusion of configuring an access token for replication.

  1. Make sure your REPLICATION_TYPE setting is RT_SLAVE and your DB_SCHEMA_SEQUENCE is set to 21 in lib/DBDefs.pm. If you’re running a standalone server, you can run the upgrade, but it may be easier to just import a new data dump!
  2. Ensure you’ve replicated up to the most recent replication packet available with the old schema. (if you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you; if you’re ready to update, it should say “Mismatched schema sequence, 21 (database) vs 22 (replication packet)”).
  3. Take down the web server running MusicBrainz, if you’re running a web server.
  4. Turn off cron jobs if you are automatically updating the database via cron jobs.
  5. Switch to the new code with git fetch origin followed by git checkout v-2015-05-18-schema-change
  6. Run ./upgrade.sh (or carton exec -Ilib -- ./upgrade.sh if you’re using carton, with very old setups).
  7. Run cpanm --installdeps --notest . to ensure your perl-based dependencies are up to date. This release adds a dependency on LWP::Protocol::https, for fetching replication packets from the new server; many systems may already have this installed, but it should be verified.
  8. Set DB_SCHEMA_SEQUENCE to 22 in lib/DBDefs.pm as instructed by the output of ./upgrade.sh
  9. Assuming you have been updating your server with replication, it will now be necessary to configure an access token:
    1. Go to https://metabrainz.org/supporters/account-type and choose your account type as applicable. If you’re an individual, non-commercial user of the data, choose “non-commercial”; if not, choose an applicable tier in the “commercial” section. If you’re not sure of the appropriate tier, make your best guess; it can be adjusted if necessary.
    2. Then, from https://metabrainz.org/profile, create an access token, which should be a 40-character random alphanumeric string provided by the site.
    3. Finally, add this token to lib/DBDefs.pm under the REPLICATION_ACCESS_TOKEN configuration option. The final configuration section should look something like sub REPLICATION_ACCESS_TOKEN { "ck3UpgwgOXhWC6SpFcd99rZOTjzfrei3gQlgZZ9z" }.
    4. Don’t reveal your access token! If you do, inadvertently, you can use the MetaBrainz site to generate a new token, invalidating the old one. (The one in the example above is one I created for myself and then invalidated — don’t get any ideas, it won’t work!)
  10. Turn cron jobs back on, if applicable.
  11. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart memcached.

Finally, the list of bugs closed this release:

Bug

  • [MBS-4436] – Medium titles cannot be longer than 255 charaters

Improvement

  • [MBS-1347] – Implement aliases for release groups, releases and recordings
  • [MBS-7906] – maybe don’t show “”≠null diff. in edit pages
  • [MBS-8279] – Remove empty_artists etc. database functions

New Feature

  • [MBS-8302] – Add Live Data Feed access token support

Task

  • [MBS-8266] – Make medium titles VARCHAR NOT NULL
  • [MBS-8278] – Update DB_SCHEMA_SEQUENCE in DbDefs.pm.sample
  • [MBS-8283] – Remove DB constraint that disallows empty event names

Not included in this list but also relevant is MBS-8349, which while fixed for a previous release, in this release is also applied to old slave servers, which may help performance for some queries.

Server update, 2015-04-27

Here’s the final server update before our 2015-05-11 schema change. This posting comes late, which I apologize for, since the release coincided with switching over to our new load balancers, and I had trouble deploying things at first.

Included in this update are new “Guess feat. artists” buttons, seen on recording, release, and release group edit pages. Their purpose is to help move featured artists from titles to artist credits. To find them, look for icons with “ft.” next to the older guess-case ones.

We also now have the possibility for ordering relationships other than series ones (say, work parts), though as of this writing, they still need to be enabled as a style matter.

Thanks to chirlu, nikki, Ujjwal Wahi, and the MetaBrainz team for contributing to today’s release. The git tag is v-2015-04-27 and the complete changelog is below.

Bug

  • [MBS-8284] – Contact user form should require a verified email address
  • [MBS-8321] – Readding same information in the relationship editor gives unexpected results
  • [MBS-8349] – Unique (medium, position), (track, position) indexes no longer created on slave databases

Improvement

  • [MBS-3375] – Ability to specify sort order for sub-works that are part of an aggregate work.
  • [MBS-3799] – Show more information for releases in the inline search
  • [MBS-8332] – Make removing URLs an auto-edit for auto-editors

New Feature

  • [MBS-3388] – “Guess artist credits” – automation support for RFC-327 and moving feats etc to artist-credits

Task

  • [MBS-8344] – Remove abbreviation expanding from guess case

Server update, 2015-04-06

Another release out today, this time with a couple new features for editors to take note of:

  • It’s now possible to add releases without any mediums attached. This is useful for cases where the release is known to exist but the tracklist is unknown. Currently this feature is a bit hidden: you have to remove the default blank medium to expose a checkbox that lets you confirm your intention to submit things that way. Based on community feedback, we can possibly make this more visible in the future. Note that it’s not (yet) possible to add mediums without tracks, only releases without mediums.
  • There’s now a minimum voting period length for destructive edits like merges and removals. Even if such edits get three yes votes, they’ll remain open for a minimum of 48 hours.

Thanks to chirlu (especially for the latter feature) and the MetaBrainz team for their work on today’s release. The git tag is v-2015-04-06 and the complete changelog is below.

Bug

  • [MBS-8137] – Release editor does not prevent duplicate label/catno pairs
  • [MBS-8162] – “Add work” from artist sidebar includes useless artist parameter
  • [MBS-8163] – Merging labels can produce duplicate label/catno pairs
  • [MBS-8300] – Medium title “0” not displayed
  • [MBS-8313] – Can’t remove last attribute from a relationship via the /relationship-editor endpoint

Improvement

  • [MBS-8234] – Enforce a minimum voting period for destructive edits
  • [MBS-8327] – Show an icon for CDBaby links

New Feature

  • [MBS-3235] – Allow adding release stubs (without tracklist)

Server update, 2015-03-23

Today’s release focuses on fixing various editing bugs, particularly in the release editor. For those who use the track parser often, it should be a lot better at retaining track MBIDs after you reuse recordings now.

The entire list of changes for today is below. Thanks to chirlu and the MetaBrainz team for working on these tickets.

The git tag is v-2015-03-23.

Bug

  • [MBS-5108] – Setting end-date is no longer auto-edit
  • [MBS-7719] – “Reuse previous recordings” button not available after using the track parser
  • [MBS-7983] – After adding “translator” relationship, “translated” flag gets stuck for subsequent relationship
  • [MBS-8212] – “Copy the release artist credit to the release group” on the release page is broken
  • [MBS-8272] – Web service doesn’t output language or script for releases that are relationship targets
  • [MBS-8285] – Unable to edit release with deprecated external links
  • [MBS-8286] – Seeding URLs on Add Release is broken
  • [MBS-8296] – It’s now impossible to re-order Release Groups in a Manual Series

Improvement

  • [MBS-7071] – Make it possible to use the Track Parser to fix mistakes without creating new Track MBIDs
  • [MBS-8231] – Use https URLs in emails
  • [MBS-8281] – Remove the knockout-postbox plugin

Postgres troubles

(Regular readers of this blog, please ignore this post. We’re casting a wide net to try and find help for our problems.)

UPDATE: This problem has been resolved and all of our services are returning to their normally dubious service levels. For a technical explanation of what went wrong, see here.

Dear Postgres gurus:

We at MusicBrainz have been very happy postgres users for over a decade now and Postgres is something that gives is very few headaches compared to all the other things that we run. But last week we started having some really vexing issues with our server. Here is some back-story:

http://blog.musicbrainz.org/2015/03/14/hosting-issues-downtime-tonight/

When our load spiked, we did the normal set of things that you do:

  • Check for missing indexes, made some new ones, no change. (see details below)
  • Examined for new traffic; none of our web front end servers showed an increase in traffic.
  • Eliminated non-mission-critical uses of the DB server: stop building indexes for search, turn off lower priority sites. No change.
  • Review the performance settings of the server. Debate each setting as a team and tune. shared_buffers and work_mem tuning has made the server more resilient to recover from spikes, but still, we get massive periodic spikes.

From a restart, everything is happy and working well. Postgres will use all available ram for a while, but stay out of swap, exactly what we want it to do. But then, it tips the scales and digs into swap and everything goes to hell. We’ve studied this post for quite some time and ran queries to understand how Posgres manages its ram:

http://www.depesz.com/2012/06/09/how-much-ram-is-postgresql-using/

And sure enough ram usage just keeps increasing and once we go beyond physical ram, it goes into swap. Not rocket science. We’ve noticed that our back ends keep growing in size. According to top, once we start having processes that are 10+% of ram, we’re nearly on the cusp of entering swap. It happens predictably time and time again. Selective use of pg_terminate_backend() of these large back ends can keep us out of swap. A new, smaller backend gets created, RAM usage goes down. However, this is hardly a viable solution.

We’re now on Postgres 9.1.15, and we have a lot of downstream users who also need to upgrade when we do, so this is something that we need to coordinate months in advance. Going to 9.4 is out in the short term. :( Ideally we can figure out what might be going wrong so we can fix it post-haste. MusicBrainz has been barely usable for the past few days. :(

One final thought: We have an several tables from a previous version of the DB sitting in the public schema not being used at all. We keep meaning to drop those tables, but haven’t gotten around to it yet. They tables are not being used at all, so we assume that they should not impact the performance of Postgres. Might this be a problem?

So, any tips or words of advice you have for us, would be deeply appreciated. And now for way too much information about our setup:

Postgres:

9.1.15 (from ubuntu packages)

Host:

  • Linux totoro 3.2.0-57-generic #87-Ubuntu SMP Tue Nov 12 21:35:10 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
  • 48GB ram
  • Raid 1,0 disks
  • PgBouncer in use.
  • Running postgres is its only task

postgresql.conf:

archive_command = '/bin/true'
archive_mode = 'on'
autovacuum = 'on'
checkpoint_segments = '128'
datestyle = 'iso, mdy'
default_statistics_target = '300'
default_text_search_config = 'pg_catalog.english'
data_directory = '/home/postgres/postgres9'
effective_cache_size = '30GB'
hot_standby = 'on'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
listen_addresses = '*'
log_destination = 'syslog'
log_line_prefix = '<%r %a %p>'
log_lock_waits = 'on'
log_min_duration_statement = '1000'
maintenance_work_mem = '64MB'
max_connections = '500'
max_prepared_transactions = '25'
max_wal_senders = '3'
custom_variable_classes = 'pg_stat_statements'
pg_stat_statements.max = '1000'
pg_stat_statements.save = 'off'
pg_stat_statements.track = 'top'
pg_stat_statements.track_utility = 'off'
shared_preload_libraries = 'pg_stat_statements,pg_amqp'
shared_buffers = '12GB'
silent_mode = 'on'
temp_buffers = '8MB'
track_activities = 'on'
track_counts = 'on'
wal_buffers = '16MB'
wal_keep_segments = '128'
wal_level = 'hot_standby'
wal_sync_method = 'fdatasync'
work_mem = '64MB'

pgbouncer.ini:

[databases]
musicbrainz_db_20110516 = host=127.0.0.1 dbname=musicbrainz_db_20110516

[pgbouncer]
pidfile=/home/postgres/postgres9/pgbouncer.pid
listen_addr=*
listen_port=6899
user=postgres
auth_file=/etc/pgbouncer/userlist.txt
auth_type=trust
pool_mode=session
min_pool_size=10
default_pool_size=320
reserve_pool_size=10
reserve_pool_timeout=1.0
idle_transaction_timeout=0
max_client_conn=400
log_connections=0
log_disconnections=0
stats_period=3600
stats_users=postgres
admin_users=musicbrainz_user

Monitoring:

To see these, enter these anti-spam passwords: User: “musicbrainz” passwd: “musicbrainz”

Load: http://stats.musicbrainz.org/mrtg/drraw/drraw.cgi?Mode=view;Template=1196205794.8081;Base=%2Fvar%2Fwww%2Fmrtg%2F%2Ftotoro_load.rrd

Disk IO: http://stats.musicbrainz.org/mrtg/drraw/drraw.cgi?Mode=view;Template=1196376086.1393;Base=%2Fvar%2Fwww%2Fmrtg%2F%2Ftotoro_diskstats-sda-count.rrd

RAM Use: http://stats.musicbrainz.org/mrtg/drraw/drraw.cgi?Mode=view;Template=1196204920.6439;Base=%2Fvar%2Fwww%2Fmrtg%2F%2Ftotoro_disk-physicalmemory.rrd

Swap use: http://stats.musicbrainz.org/mrtg/drraw/drraw.cgi?Mode=view;Template=1196204920.6439;Base=%2Fvar%2Fwww%2Fmrtg%2F%2Ftotoro_disk-swapspace.rrd

Processes: http://stats.musicbrainz.org/mrtg/drraw/drraw.cgi?Mode=view;Template=1196376477.1968;Base=%2Fvar%2Fwww%2Fmrtg%2F%2Ftotoro_processes.rrd

Indexes:

We ran the query from this suggestion to identify possible missing indexes:

http://stackoverflow.com/questions/3318727/postgresql-index-usage-analysis

this is our result:

https://gist.github.com/mayhem/423b084043235fb78642

Most of these tables are tiny and kept in ram. Postgres opts to not use any indexes we create on the DB, so no change.

UPDATES:

  • Five months ago we double the RAM from 24GB to 48GB, but our traffic has not increased.
  • We’ve set kernel.swapiness to 0 with no real change.
  • free -m:
             total       used       free     shared    buffers     cached
Mem:         48295      31673      16622          0          5      12670
-/+ buffers/cache:      18997      29298
Swap:        22852       2382      20470

Server update, 2015-03-09

This one’s mostly a small bug-fix release, since improvements to sitemaps and other big projects have been concurrently in the works. One thing of note is that the “external links editor” on entity edit pages was rewritten to be more maintainable in the long run; hopefully no bugs have creeped in with the new code, but if any have then please report them on our issue tracker!

Thanks to reosarevok and the MetaBrainz team for working on the changes below. The git tag is v-2015-03-09.

Bug

  • [MBS-8055] – Series-Series relationships are blocked by JS
  • [MBS-8151] – Filename not being shown in remove cover art edits
  • [MBS-8221] – Work pages don’t show event rels
  • [MBS-8270] – Bottom display of work relationships is inaccurate for works appearing across multiple tracks

Improvement

  • [MBS-7913] – Allow seeding of non-URL ARs when creating non-Release entities via URL parameters
  • [MBS-8258] – Rewrite the external links editor in React

Server update, 2015-02-23

We have another server release out today (a little over a day late, as given away by the title). The most exciting change this time around is CritiqueBrainz integration: on release group pages, the most popular and most recent reviews from CritiqueBrainz are now displayed, along with links to write your own. Hopefully people find these reviews useful and interesting and are encouraged to start contributing to CritiqueBrainz themselves!

As usual, the release contains a variety of other bug fixes and improvements, detailed in the changelog below. Thanks very much to chirlu, Freso, reosarevok, and the MetaBrainz team for their work on today’s release. The git tag is v-2015-02-23.

Bug

  • [MBS-4928] – It is possible for non-auto-editor users to submit ‘approve’ votes
  • [MBS-5959] – _uniq indices for editor_subscribe_{artist,editor,label} not unique
  • [MBS-6763] – Entity merges don’t save Ended if there’s no date
  • [MBS-8200] – ws/js/edit does not validate release event dates
  • [MBS-8201] – Dates with year == 0 are invisible
  • [MBS-8203] – Spurious spaces in relationship target lists
  • [MBS-8224] – Events can’t be rated via /ws/2/ratings

Improvement

  • [MBS-6164] – Filter out duplicate annotations when merging
  • [MBS-6885] – hard limit height of wikipedia excerpts the same as annotations
  • [MBS-7477] – Add a unique constraint on track (medium, position)
  • [MBS-7872] – Show CritiqueBrainz reviews on the release group pages
  • [MBS-8205] – Remove entity browse pages
  • [MBS-8220] – Align Artist columns in discographies

Task

  • [MBS-3998] – Editor may vote / may add edit note checks should also be done at the Data level.
  • [MBS-6451] – Update cleanup code for Amazon India and Brazil
  • [MBS-8107] – Add autoselect for Sina Weibo URLs
  • [MBS-8176] – Add autoselect for CD Japan URLs
  • [MBS-8192] – Add autoselect for work-level IMDb rels
  • [MBS-8199] – Unset the date for release events that have 0000-00-00
  • [MBS-8226] – Add ClassicalArchives.com to Other DBs whitelist