NGS: From here to there

[ Before reading this post, make sure to read the previous NGS-related post ]

The question that is on my mind right now is how to build a coherent roadmap that gets us from the mb_server codebase that we’re running today to the NGS codebase, complete with a new edit system. The factors that play into this are:

  1. mb_server codebase: This is the codebase that we’re running today. We’re updating it one more time this year and then early next year we hope to move to the Template Toolkit work.
  2. Template Toolkit: This is Oliver Charles’ work to clean up our codebase. Template Toolkit is available for Perl and looks like it will be available for Python soon. Our hope is to clean up the codebase so that we’re ready to take on more developers to help with the development, especially as we move closer to NGS.
  3. NGS playground: See the previous post for details on this.
  4. NGS proper: This is the finished NGS that we roll out onto the MusicBrainz servers.

Finally, the BBC has been keen on getting what they are calling Cultural Identifiers. This name is a bit of a misnomer: essentially it would be the release-related portions of NGS, namely release groupings that allow us a more product-centric approach to managing releases. Right now we list and identify releases with different track layouts as totally separate releases, even though they ought to be properly related. The BBC wishes this work to happen sooner rather than later and has indicated that they would be willing to sponsor this work.

That’s awesome, right??

Well, yes. But there is one problem. In the last post we concluded that we should move to NGS in one fell swoop. And now the BBC would like us to take an intermediate step? As much as we agreed on moving to NGS in one step, I think we must work with our most visible partner. Since we are severely resource constrained (we have just enough money to hire a part-time university student right now), I feel compelled to find a way to get the BBC what they want as soon as possible while accepting money from them to boost our development funds. Taking money from the BBC may allow us to accelerate our development schedule towards NGS. But at the same time, the intermediate step itself may slow us down getting to NGS.

I’m very much looking for feedback on how to best make this happen and how to best accomplish all these goals. Do you think that adding an intermediary step in exchange for funds from the BBC is an acceptable compromise?

16 thoughts on “NGS: From here to there”

  1. danBloo, they want to use these “cultural identifiers” to help develop their own applications using MusicBrainz, I believe – and because we are lacking this crucial feature, they cannot move forward.

  2. I think this would be a good compromise. I think it would also be a good sign to future partners that MusicBrainz is willing to listen to their wishes.
    Of course, it has to be feasible from a technical point of view.

  3. In general I’m in favour of incremental improvements. They allow you to get feedback on changes earlier, etc… So I’m definitely in favour of MusicBrainz working on cultural identifiers first.

    Also keep in mind that the BBC aren’t the only people who want that feature. For example, I think it will also make niklas’ collection code more useful. (At least, I assume niklas’ code doesn’t do any release grouping currently.)

  4. >Do you think that adding an intermediary step in exchange for funds from the BBC is an acceptable compromise?

    Yes. And in any case, I think a lot of editors from the UK would be more than happy for more of their TV License fees to make their way to MB! 😀

  5. A little hello from the BBC!

    In terms of why we’re keen on ‘cultural identifiers’ (you don’t like the name!?) Oliver and warp pretty much have it.

    In order to sustain our involvement in MusicBrainz we need to keep shipping the features our management have been requesting for our MBz-powered site at http://www.bbc.co.uk/music/beta. One of these is ‘a page per (cultural) release,’ which puts cultural identifiers right in the middle of our critical path. We’ve been nagging Robert about them for some time now, and we are lucky he is so very patient.

    Also – practically – it’s much easier for us to secure funds for smaller, specific functional increments than for a less well-defined epic like NGS.

    Thanks to Robert for being so open about this issue – I look forward to trying to help work out what’s the right thing to do here!

  6. Just a quick thought:
    We are abusing the ARs with the ‘part of a set’ AR now for the kind of thing we want to have with NGS. If it helps the BBC, what about creating one for the release grouping as well somehow? It’s dirty yes, but the info can also be used to partially automate the switch to NGS. That is, what we group now with an AR doesn’t have to be grouped again after the switch. We’d have to determine which release to use as a base and link to the other releases (only the first disc if it’s also ‘part of a set’ by AR) from there.

  7. Prodoc:

    Part of the Cultural Identifiers feature is to assign MBIDs to groups of albums. ARs aren’t capable of doing that right now. If it weren’t for that, your suggestion might have been a brilliant intermediary step!

  8. There are two realistic solutions, as far as I can see:

    1) Add a “release group” entity, for simplicity I’d call it just “album”, and an AR to connect releases to albums. This is so simple that it could be hacked in maybe one weekend, but seems like a step sideways, not forward, to me. Mostly because for NGS I planned to have what used to be called AlbumId, now ReleaseId, be called AlbumId again. The reason is that the current “ReleaseId” points to objects which are not that interesting from an NGS point of view, and it would be a shame to waste the stable IDs in such a way.

    2) Another option is to implement the release grouping part of NGS (semi-)properly. I *think* I could implement it in a fairly short time, but obviously it’s not as simple as the first option. The advantage is that it is definitely a step forward, even if only from the database point of view, since the code will have to be rewritten later.
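
To make the idea of a grouping entity concrete, here is a minimal sketch of a release group that carries its own stable ID, separate from the IDs of the releases it contains. The class names, field names, and example titles are hypothetical and chosen only for illustration; this is not the actual mb_server or NGS schema.

```python
import uuid
from dataclasses import dataclass, field


def new_id() -> str:
    """Generate a random UUID string, standing in for an MBID."""
    return str(uuid.uuid4())


@dataclass
class Release:
    """One concrete release (a specific track layout), with its own ID."""
    title: str
    mbid: str = field(default_factory=new_id)


@dataclass
class ReleaseGroup:
    """Hypothetical grouping entity: one stable ID shared by all variants."""
    title: str
    mbid: str = field(default_factory=new_id)
    releases: list = field(default_factory=list)

    def add(self, release: Release) -> None:
        """Attach a release to this group without changing the release's ID."""
        self.releases.append(release)


# Illustrative data only: two variants of one album under a single group ID.
group = ReleaseGroup("Example Album")
group.add(Release("Example Album (original)"))
group.add(Release("Example Album (with bonus tracks)"))
```

The point of the sketch is that a data user can link to `group.mbid` as the single identifier for the album while each release keeps its own ID, which a plain relationship between two releases cannot offer.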

  9. Sounds like a reasonable compromise to me. We’ve been waiting for/talking about NGS for many years now, so what’s another 5 or 6 months? 🙂

    Having said that, I guess I wonder/worry whether the rest of our existing edit and voting system can cope with a more “proper” or “NGS-like” release group implementation without the quantity of work required that originally led to the “let’s do it in one fell swoop” conclusion.

    Do the BBC need a “proper” implementation with its own separate ID or would having a hack AR between releases linking back to the “original” release (without bonus tracks, on an agreed media etc) suffice? Does the same “track” on multiple releases need to have a single ID from a “cultural” release’s perspective; like a song/work entity?

  10. Couldn’t we make better use of the ‘is Earliest Release’ relationship between releases, so that this relationship is created whenever there are multiple versions of a release? The BBC could then just use the earliest release, with that release’s ID serving as the cultural ID for the ‘Release’. Or are they interested in the different versions of the release? In that case they can access the other releases by traversing the relationship between them.

  11. ijabz:

    I’m more in favor of doing this work right so it can cleanly be migrated to NGS when we finish that. One big part of that is the creation of the new PackageID or GroupingID. With Lukáš’ second suggestion we force only one change for our customers/users as far as IDs are concerned. With the first, we would change the semantics of the AlbumID and then come up with a totally new ID later.

  12. voice:

    >I wonder/worry whether the rest of our existing edit and voting system can cope with a more “proper” or “NGS-like” release group implementation without the quantity of work required that originally led to the “let’s do it in one fell swoop” conclusion.

    I think that should be fine; it’s not that complicated a change. Once you get to the track level changes and “Works” in particular, it gets hairy fast.

    I believe the BBC needs a proper implementation. I for one am not keen on the hacked up implementation either. Also see my note in my comment above about what changing the IDs means for our data users and customers.

  13. I have a concern about this from an editor’s perspective. Getting a new way of grouping releases is great, but once it’s out it takes time for editors to scour the huge database and group the albums that need it. From a time perspective, no matter how it’s implemented, most of the time will be spent by editors making the changes.

    I’m amazed at how fast people are editing in the new “Part of a set” AR, but even that still has huge holes in some major artists. This isn’t a fault of the code or system; things like this just take time. I’d like to help the BBC out as quickly as possible, partially because then we can go on to finishing NGS, but also because it makes MusicBrainz look better when it’s able to work more efficiently for its customers.

    My proposal is that we knock out the new AR over the weekend. But this is NOT the final solution for the BBC; it is so that the editors can get to work toward the solution in parallel with Lukáš developing the ‘cultural identifiers’. This way, instead of one person working while thousands wait, we can all do something useful now.

    Meanwhile Lukáš finishes the proper implementation of the cultural identifiers, then migrates the waiting ARs the editors have been working on for the past few months. It may take some extra time to build a method to import the waiting ARs into the finished solution, but the time saved in editing afterwards should make up for that.
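
One way such an import could work, sketched under the assumption that each grouping AR can be read as a plain pair of release IDs: treat the pairs as edges and merge them into connected components, so every component becomes one candidate group. The function name and the `r1`-style IDs are made up for this example and do not reflect how ARs are actually stored.

```python
from collections import defaultdict


def group_releases(links):
    """Merge pairwise release links into groups (connected components)."""
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union the two components

    groups = defaultdict(set)
    for release in list(parent):
        groups[find(release)].add(release)
    # Sort for a deterministic result regardless of input order.
    return sorted(sorted(members) for members in groups.values())


# Hypothetical links entered by editors: r1-r2 and r2-r3 chain together.
links = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
candidate_groups = group_releases(links)
```

Each resulting list would still need a decision about which release serves as the base and what ID the group receives; the sketch only covers collecting releases that editors have already linked.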

  14. Another little hello from the beeb in answer to voiceinsideyou. We really do need a ‘proper’ implementation rather than a bastardised AR, i.e. one with IDs for ‘cultural releases’. We need to be able to point to one canonical resource for, e.g., Rubber Soul, and we don’t want to invent our own identifiers to do that, then have to map them back to brainz when the full NGS happens.

    PS: ‘cultural identifiers’ is indeed a misnomer for this work. It’s how we refer to identifiers for cultural releases, works, parts, performances, sessions etc etc. So NGS in full would give us ‘cultural identifiers’. Slicing out this part just gives us ‘cultural releases’. I’ll try to clarify the language at our end : )
