Because the servers seem to be under an ever-increasing load, I decided to disable TRM statistics updates. So far it seems to have had the desired effect – the server load is down, and responsiveness is up.
There are four TRM tables in the database: trm and trm_join (which store information about which TRMs are on what tracks); trmjoin_stat and trm_stat (which store information about how often the trm and trm_join rows are used.
trmjoin_stat is basically totally unused. It contains one row, which was a test row we inserted when the table went live. So, let’s not worry ourselves about this one.
trm_stat on the other hand accounts for 14% (by byte count) of the size of the whole database, is updated every time the tagger uses a TRM, and because it has a “month” column, it’s always growing. It’s constantly updated (because people are constantly using the Tagger). And – here’s the good bit – the server never uses the data in this table. So, turning off updates of trm_stat seems like a no-brainer.
There’s also the column “trm.lookupcount”, which is like trm_stat but it’s a running total instead of a per-month value. The lookupcount is used by the server, at least in the sense that it’s visible on the “track info” page. However it too is very expensive to update, so updates of this value are disabled now too.
So far the server load looks quite a bit better.
On a related matter it’s still proving necessary to “prune” (i.e. massively and mostly randomly cull) TRMs every 5-6 weeks. Removing TRMs from the TRM signature server itself is a reasonably quick process (I think it takes about 10 minutes). However removing those same TRMs from the MusicBrainz database takes much longer – often over 24 hours. Again this is in large part due to the trm_stat table.
All in all I’m very much tempted to drop (or at least empty) the trm_stat table. It’s causing headaches, and (as far as I can tell) has never achieved any of its aims.