TRM stats disabled

Because the servers seem to be under an ever-increasing load, I decided to disable TRM statistics updates. So far it seems to have had the desired effect – the server load is down, and responsiveness is up.


There are four TRM tables in the database: trm and trm_join (which store information about which TRMs are on what tracks); trmjoin_stat and trm_stat (which store information about how often the trm and trm_join rows are used.

trmjoin_stat is basically totally unused. It contains one row, which was a test row we inserted when the table went live. So, let’s not worry ourselves about this one.

trm_stat on the other hand accounts for 14% (by byte count) of the size of the whole database, is updated every time the tagger uses a TRM, and because it has a “month” column, it’s always growing. It’s constantly updated (because people are constantly using the Tagger). And – here’s the good bit – the server never uses the data in this table. So, turning off updates of trm_stat seems like a no-brainer.

There’s also the column “trm.lookupcount”, which is like trm_stat but it’s a running total instead of a per-month value. The lookupcount is used by the server, at least in the sense that it’s visible on the “track info” page. However it too is very expensive to update, so updates of this value are disabled now too.

So far the server load looks quite a bit better.

On a related matter it’s still proving necessary to “prune” (i.e. massively and mostly randomly cull) TRMs every 5-6 weeks. Removing TRMs from the TRM signature server itself is a reasonably quick process (I think it takes about 10 minutes). However removing those same TRMs from the MusicBrainz database takes much longer – often over 24 hours. Again this is in large part due to the trm_stat table.

All in all I’m very much tempted to drop (or at least empty) the trm_stat table. It’s causing headaches, and (as far as I can tell) has never achieved any of its aims.

12 thoughts on “TRM stats disabled

  1. teleGUISE

    Guess it was a good thing I accidentally clicked
    the ‘Brain Box’ area when trying to click a tab above it.
    I would have never known why all of a sudden things
    were flowing much smoother.

    So at this point all trm counters are frozen then, correct?
    You say the server never used trm_stat was that
    info stored on the Signature Server then and thats
    why never visible?

    So is the reason the removal of trms from the server
    takes longer because it has to also remove the
    related stat info for that trm?

    ‘What exactly was the original aim’ of the trm_stat
    table and the never incrementing ‘Use count’er?

    On a semi-related note from conception of FUCH’s
    trm dupe report (about a week ago) to date zout &
    I have removed some 4,000+ TRM’s!!

  2. Dave Evans

    The trm counters are, indeed, frozen. The original intention, I think, of the trm_stat table was to be able to analyse trends in TRM usage, so for each TRM you could see a month-by-month hit count. However although that data was in the main Postgresql database (and was included in the data dumps, available for download), no user interface was ever made to read or otherwise use the data. In fact, nothing accessed the data at all apart from the background process to update the stats, and the backup/export scripts.

    It seems to have been a bad idea generally, and overall it’s clear to me that the cost of losing those stats is far outweighed by the benefits. It ate up an awful lot of the servers’ power keeping those stats up to date, and as you say, when TRMs were pruned the stats had to be pruned too.

  3. zout

    I would like to see that no trms attached to non-album tracks are pruned. It’s often difficult to determine whether they’re really non-album or not; always keeping the trms will give at least some clue.

  4. teleGUISE

    Thanks Dave.. It would have been interesting to see had the hit count worked per track. Having a total
    from the combination of any track attached makes it kind of useless and is better off disabled.

    I can’t see how any can be pruned until the ‘trm.lookupcount’ stat is re-enabled.
    I thought that was the criteria never knew there was a ‘random’ variable added to that though.

  5. dupuy

    The only problem I see with freezing the TRM stats is that (as I understand it) while the use count may not used by the server per se, it is used by the pruning script (this is what teleGUISE is referring to, I think). As it stands now, *all* TRMs that are newly submitted will be pruned, as their use count will never go above 0. Previously, there was at least a chance that somebody other than the original submitter would use a TRM to tag a track in the 5-6 week window between prunings, and thus the TRM would be kept.

    If you are really going to leave the stats counters frozen, you might as well disable the Add TRM operation entirely, as they have no permanent effect on the MB DB. (Note, I am not actually supporting doing this; I think TRMs are useful and would rather see some kind of change to the stats that reduces the DB load – perhaps by eliminating the Month column?)

    @alex

  6. Dave Evans

    I think you over-estimate the level of sophistication involved in TRM pruning. There is no “script” – it’s me, performing the same manual steps, each time.

    You wrote (sort of) “there was a chance that somebody would use a TRM and thus the TRM would be kept”. Last time I pruned the TRM database, only TRMs with a lookupcount >= 10 were kept. Thus, looking up info on a TRM once would not be enough to save it from oblivion.

    And yes, disabling “Add TRM” (i.e. turning “submit TRMs” into a no-op) isn’t such a bad idea.

  7. teleGUISE

    Yes, Alex described exactly what I meant by the counter never accruing. (always 0)

    Wow…the ‘little machine’ is Dave. Why is that?
    Seems a whole lot easier to let a script check if use count <=X than Dave=check many trm’s.

    I’d really like to see my suggestions of auto dupe pruning be done.
    One run should put things in check & give manual editing a chance to cope.
    Run both and the database should float.

  8. Dave Evans

    I think you underestimate the complexity of performing a TRM prune. Suffice to say it crosses all three servers and I haven’t found it worthwhile to try to automate it yet. Also I think you overestimate the effects of removing a few TRMs; I routinely slash millions of TRMs from the system, but the effect is only temporary – they (or at least, other TRMs) are back to replace them all too soon.

  9. teleGUISE

    Well granted I don’t know the behind the scenes action and just see the web front end for removal.
    You say you routinely ‘manually’ slash millions of trm’s ??
    (I must be missing something because that seems, well, a little far fatched).
    It’s taken about 2 weeks and I’ve only managed 5,000 (dupes).

    And speaking of dupes the effect of slashing them
    should, I hope be much longer than temporary.
    It may not free up as many resources due to many
    still being linked however it will clean up many
    bad indexes & the cause of more mistags. Which
    means a much happier user experience :).

    Over 40,000 crosslinks have accrued in a few years .
    Its going to take a little automation to get those numbers in check to allow manual maintenance.

  10. Dave Evans

    Far-fetched? Yes, if I were to use the web interface. But I don’t.

    This graph [http://musicbrainz.org/mrtg/bender.cgi?log=bender-datsize] shows the disk space occupied by the TRM database. Divide by 564 bytes per TRM to get the approximate number of TRMs. So for example 1.32Gb / 564 bytes per TRM = approx 2.35 million TRMs. When that graph reaches 2.0Gb, it’s time to “prune”.

  11. teleGUISE

    …I didn’t mean to imply you were making it up 😉
    but its hard to fathom millions when I am pecking my butt off and only managed a mere few thousand.
    So I kind of figured there was some ‘secret’ hence beneath the foo foo web interface & direct to the db.
    I gather indexing of which track(s) the trm belongs is several bytes per & part of the 1.32Gb then, correct?

    So how ’bout wiping those crosslinks? It couldn’t hurt & bound to knock a few points off the total database size. It won’t catch the many tracks mislinked having track times far from the calc’d
    trm but a start.

  12. Eamon Nerbonne

    Disclaimer: I have no idea about musicbrainz’s internal structure. And I also don’t know how to insert line breaks in this message – Oh well.

    ==========================================================================

    Why is the total TRM count a problem? I image 2.0 gigs is either a limit imposed by the system memory amount, or the VM size (if it’s a 2/2 32bit kernel/user split). But why would you need to store the entire TRM in the first place in such a high speed, high cost medium?

    ==========================================================================

    You could for example make due with the CRC-64 of the TRM for all intents and purposes: According to my methodologically imprecise back of the hand calculations; the chance of a single collision would be at most approx 1 in 10 million. The data isn’t that good; this is definitely an option which would hardly impact data quality. (Methodology: 2^64/(2000000^2/2), which is a slight overestimation but as n! isn’t exactly calculable here, and I’ve forgotten the continous equiv. atm). If you’re still worried about collisions, just up the number of bits in your hash… with 128 bits the chance of a collision are so minuscule you really should start worrying about world peace instead :-).

    ==========================================================================

    You could also choose to go with a disambiguation scheme; work with the CRC-64 normally but use the TRM if that hash is marked as colliding: that check path will be so infrequent that it’s not an issue if it’s gotta check the HDD. You can even work with a CRC32 check only by default without too much problem – though then you’ll need to use a disabiguation scheme.

    ==========================================================================

    Frankly, it’s gotta be possible to get this to run efficiently on a much smaller setup, with even more meta-information, rather than less :-).

Comments are closed.