Making graphs is easy. Making intuitive, easy-to-understand graphs? It’s harder than most people think. At the Rochester Institute of Technology, the ISTE-260 (Designing the User Experience) course teaches the language of design to IT students. For an introductory exercise in the class, students are tasked to visualize any set of data they desire. Students David Kim, Jathan Anandham, Justin W. Flory, and Scott Tinker used the MusicBrainz database to look at how five different Queen songs went mainstream in different ways.
I am Chhavi. I have mostly been helping around with all things design in MetaBrainz. I recently graduated from IIT Guwahati, India and started contributing to MusicBrainz after attending the summit last year, around the same time.
As a Google Summer of Code student, my project was to build a design system with React UI components for the upcoming overhaul of MusicBrainz’s website. It surely was a really interesting journey, right from when I heard about the community and I would like to share some snippets of it with you!
May 2017: I hear about Picard, and how a bunch of really cool people who meet online are building it. I was intrigued.
Around August 2017: I pop in the IRC channel #metabrainz, and after much overthinking, I drop a “Hi”. What followed was a really warm welcome from people I would soon call friends, and a lot of developer-y jargon I had no clue about.
September 2017: I attend the annual MusicBrainz developer summit in Barcelona. And boy oh boy, I am now part of the family. Over the few days there, I have immense fun interacting and learning from the community.
November 2017: We set up our JIRA ticket system for design issues and start working on the mockups for the redesign. The entire community comes together on JIRA tickets and Discourse posts to talk about where we want to go with this overhaul.
January 2018: Community members encourage me to try my hand at front-end development. One is really lucky to find people who encourage you to grow out of your comfort zone and help you cross that wall. In MetaBrainz, there is no shortage of such people.
March 2018: With little confidence and lots of hopes, I apply for the Google Summer of Code programme. I start learning the ropes of development, with the help of online tutorials and, obviously, our community. We also met for a mini-summit in Delhi to discuss ListenBrainz and spicy food.
April 2018: Hence began my full-fledged journey of learning and spending a summer of coding. It wasn’t easy, but I learned a lot in the process.
We set up the initial design system using react-bootstrap and react-storybook. I then started importing UI components into the system, followed by its documentation. I wrote up a more detailed description of the process too.
August 2018: As of now, we have the design system in place. The future plan is to continue adding components to it as well as to focus on having well-thought-out contributing guidelines. I will also continue working on designing the mockups for the user interface for various entities.
Google Summer of Code was just another milestone in my journey with MetaBrainz. My time here has been a time of both personal and professional growth. I now feel more comfortable in a development environment, the ongoing chats on IRC make more sense to me and I feel less inhibited to put my thoughts out there. I completed my college, moved cities, traveled… all while having a set of these amazing people I call family.
A special shout out to Rob for keeping me going, bitmap for being ever so patient and understanding, samj1912 for introducing me to MetaBrainz, CatQuest, iliekcomputers, Suyash, Freso, reo and zas for being amazing friends through it all.
The thing I like about our community is that we have seasoned developers as well as newbies like me, all working together to create amazing stuff. Hoping to continue being an involved and colorful part of this community,
You will obviously keep hearing from me in the coming days,
After 2-ish months of summer vacation I decided to do an “After Summer Special” (named as a reference to After School Specials).
By the end of summer I had been working hard… on Debussy instead of instruments! Oops!
Once September arrived I realised that I needed to get things back in order, so I set about finishing this version:
- [INST-545] – Reverse parentage of classical/acoustic guitars
- [INST-239] – Tubulum
- [INST-293] – Nyabinghi Drums (Funde, Thunder, Repeater/Keteh)
- [INST-437] – Mijwiz
- [INST-478] – boobam
- [INST-533] – Oval spinet
- [INST-538] – Tarota
- [INST-540] – Tenora
- [INST-543] – Gemshorn
- [INST-546] – tible
- [INST-549] – kannel
- [INST-568] – Irish flute
- [INST-570] – octoban
- [INST-575] – Rubab
- [INST-584] – vuvuzela
- [INST-430] – akete is missing proper description (along with some improvements)
- [INST-477] – table steel guitar alias
- [INST-501] – Please add a couple of Dutch aliases to bullroarer
- [INST-518] – rebec relations
- [INST-554] – Improvements to hardingfele aliases
- [INST-583] – add aliases to handclaps
I also decided to close the tambouras (strings) version, as after much delay there wasn’t really much progress on it. I had started work on these way back in January, adding the Tamburica instruments (see Instruments part two).
However, by the time ASP came to an end it had become clear to me that I had to move on from dwelling on this, to feel like there was progress and avoid burning out.
Now, after over half the INST tickets have been closed, I had a bit more experience and I could finally finish work on this + some random stragglers left over from the “next” version.
Anyway, after all that, I decided that the thing we needed to do, was to, Get Serious
It’s time for Instruments again!
Now that I was allowed to create fix versions I set about creating such fix versions! (And also, retroactively (re)creating previous batches where these seemed logical.) This also meant creating JIRA Tickets for instruments I had already added without tickets, like the Taonga pūoro instruments.
By now, I was also starting to see the limits of the relationships between instruments we already had, so I was already thinking about what relationships could be added to improve the way we link things. As such, I created a topic to discuss some ideas I had (Reader beware! This is an early and now outdated idea-thread! (More on this in a later post!))
next: 2017 March to May
- [INST-442] – Alias of three-hole pipe has erroneous end date
- [INST-465] – Typos in names, descriptions and aliases of concertinas
- [INST-59] – Anglo concertina
- [INST-61] – English concertina
- [INST-73] – Djoze
- [INST-81] – Duggi
- [INST-83] – Dulcitone
- [INST-102] – Flageolet
- [INST-262] – Blaster Beam
- [INST-332] – Vibrandoneon
- [INST-378] – Kagurabue
- [INST-402] – Trikitixa
- [INST-418] – txistu
- [INST-420] – Flabiol
- [INST-429] – Electronium
- [INST-464] – Orphica
- [INST-468] – Tonette
- [INST-469] – Video game console
- [INST-451] – Confusion in n'goni/ngɔni instruments
- [INST-452] – Fix ngɔni relationship to banjo and picture
- [INST-409] – Doussn'Gouni
- [INST-453] – Fix donso ngɔni description and its relationship to kamalen ngɔni (and add relationships as improvement)
- [INST-133] – Kamale n'goni
- [INST-454] – Fix kamalen ngɔni description, check invention relationship (and add relationships as improvement)
- [INST-450] – jeli ngɔni
- [INST-476] – Conclusion of the n'goni/ngɔni instruments
- [INST-125] – add alias to zither (was: Harpeleik)
- [INST-379] – Add aliases to the Melodica instrument
- [INST-398] – add aliases to three hole pipe (was Flauta de 3 agujeros)
- [INST-438] – Missing search hint aliases for sax instruments
- [INST-441] – Typos in descriptions of pipe and tabor
- [INST-443] – Add aliases to three-hole pipe and tabor
Quite a lot of tickets!
Eventually I came to recreating the Taonga pūoro tickets (previously I just added the ones in the Wikipedia page). It became clear to me that this was a huge task and deserved its own version:
Taonga puortwo: 2017 May 9th to 25th
- [INST-388] – adding "Taonga Puoro" to instrument tree (while technically these were all added in the first batch, pretty much every single one of them was updated and expanded.)
- [INST-491] – Koauau ponga ihu
- [INST-492] – Hue Puruhau
- [INST-493] – Hue Puruwai
- [INST-494] – rōria
- [INST-495] – te kū
- [INST-497] – pūpakapaka
- [INST-498] – porotiti
- [INST-509] – tumutumu
- [INST-515] – pahū
- [INST-520] – tōkere
- [INST-521] – Poi
- [INST-522] – pākuru
After this I was a bit tired and summer was finally here, so I decided to Go On Summer Vacation… so hold on for the fourth part!
Hi, I’m Leo and I spent my summer building and training SpamBrainz, our new solution to fighting spam in MusicBrainz. If you haven’t heard of SpamBrainz before it’s probably because it did not exist before this year’s Summer of Code.
For quite a while now, the amount of spam in MusicBrainz has been turning into a serious problem. Often this means editors are automatically created with descriptions that look not unlike the spam emails most of us get every day, promoting other websites and services.
During last year’s MetaBrainz Summit we discussed possible solutions to this and came up with the Spam Ninja system. Essentially this means that Soon™ there will be a group of editors that receive spam reports and have the ability to delete editors and entities that are nothing but spam.
Now with MusicBrainz having almost two million registered editors, could we really expect the Spam Ninjas to manually check every single one of them in addition to all the new registrations? Obviously not, and this is where SpamBrainz comes in.
SpamBrainz is a machine learning system that looks at all editors and decides whether or not it thinks they are spammers. If it thinks they are, it automatically notifies the spam ninjas who then decide whether or not SpamBrainz was correct.
What’s great about this system is that a human is guaranteed to look at any report and at no point does a computer decide that you’re a spammer and should be banned, because no one wants machines to run the world, right?
While most GSoC projects involve adding features to existing systems, SpamBrainz is something entirely new and I had not built anything on this scale before so I started out by doing tons of research.
When building a machine learning project you should always start by doing some good old statistics first and trying to figure out what matters about your data and how the system could use it. I wrote a couple Jupyter notebooks (which are great for working with data) to do this.
Next I built a pretty boring Flask-based API that would allow MusicBrainz to queue up editor analysis and training. Quite a few different MetaBrainz projects use Python and need to access the MusicBrainz database so a long time ago someone wise decided to move commonly used code into a repository called brainzutils-python. All I had to do was to add some code for accessing editor data through it.
But before I could build my Keras model I had to decide on a final set of input features and write code for preprocessing the data. Only then could I finally get started building and testing models.
The current SpamBrainz state-of-the-art model is Lodbrok, which actually turned out to work really well, reaching 99% accuracy in detecting spam while only misclassifying 0.2% of real users as spammers. Obviously the latter won’t be a problem because after all a Spam Ninja will still check these reports.
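To make the two figures concrete, here is a minimal sketch of how rates like these are usually derived from a confusion matrix. The counts are invented for illustration and are not Lodbrok’s actual evaluation data. Reading the 99% as the share of spam editors that get caught is also an assumption, since the post does not define the metric precisely.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Outcome counts for a binary spam classifier (hypothetical numbers)."""
    true_positives: int   # spammers flagged as spam
    false_negatives: int  # spammers that slipped through
    false_positives: int  # real users flagged as spam
    true_negatives: int   # real users left alone


def detection_rate(c: ConfusionCounts) -> float:
    """Share of actual spammers that the model flags (recall)."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def false_positive_rate(c: ConfusionCounts) -> float:
    """Share of real users that the model wrongly flags."""
    return c.false_positives / (c.false_positives + c.true_negatives)


# Invented counts, chosen only to mirror the figures quoted above.
counts = ConfusionCounts(true_positives=990, false_negatives=10,
                         false_positives=2, true_negatives=998)

print(f"detection rate: {detection_rate(counts):.1%}")
print(f"false positive rate: {false_positive_rate(counts):.1%}")
```

The false positive rate is the number that matters for the Spam Ninjas, because every wrongly flagged editor is a report a human has to dismiss.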
Now that GSoC is over I could just disappear with all the money and leave SpamBrainz in its current state but obviously that’s not what I am planning to do.
I would like to work with zas on getting it deployed along with the Spam Ninja system, improve the code documentation and try to tackle the remaining problem that is online learning (which as it turns out, isn’t as easy as I had thought).
With spam always evolving and spammers already moving to more sophisticated methods than just using editor biographies, I’d also look into building separate models for other entities.
After all SpamBrainz is just getting started and I’m very much looking forward to continuing our journey towards reducing the spam we all have to endure on MusicBrainz and other MetaBrainz projects.
A fantastic summer has come to an end, and it is time to wrap up my GSoC project, which I have been working on for the last 3 months (the official GSoC coding period).
I am Rashi Sah, an undergraduate student at the National Institute of Technology, Hamirpur, India. I have been working on a really cool AcousticBrainz project for MetaBrainz Foundation Inc. as a participant in Google Summer of Code ‘18. It has been an amazing experience and I’ve learned a lot over the summer, spending countless days and nights to successfully take the project to the stage of completion. I decided to contribute to MetaBrainz in late December, then spent some time understanding the codebase of the project and then began creating pull requests and pushing commits for many features, tasks and fixing bugs since January 2018. This blog post consists of my GSoC experience as a student and the work I’ve done for the program so far.
Before the GSoC program started, I looked for some good-first-bugs and found some tickets to work on. Then I talked to the AcousticBrainz community members and started contributing. I created some big PRs, mostly for adding new features to AcousticBrainz. I also worked on many bug fixes which are already merged into the AcousticBrainz codebase. New feature PRs include AB-21, AB-98 and AB-298. In mid-February, I started looking for a suitable idea to work on for the GSoC program and to create a proposal for it. As March approached, I discussed my proposal a lot with MetaBrainz community members, especially with Alastair, the AcousticBrainz project lead, who helped me a lot by reviewing my proposal and guiding me to improve it. In late April, my proposal for a more detailed integration of AcousticBrainz with MusicBrainz got accepted. In the community bonding period, I mostly continued the work I had already been doing for the past 3–4 months.
Getting entity information from the MusicBrainz database
The first thing I worked on when the official GSoC coding period began was adding a way to directly access the MusicBrainz database for different entities to the MusicBrainz database module in BrainzUtils (a Python utility for all of our MetaBrainz projects). I worked on getting artist and release entity information from the MusicBrainz database via a direct connection. (See PRs BU-13 and BU-14.) Later, I worked on setting up the MusicBrainz server by adding a service in AcousticBrainz’s docker-compose files, allowing us to easily read data directly from the MusicBrainz database in AcousticBrainz (PR AB-334). The major aim of the project was to implement both methods of MusicBrainz database access in AcousticBrainz, especially importing the MusicBrainz database into AcousticBrainz from scratch, and then to decide which method works better when implementing a particular functionality in AcousticBrainz using MusicBrainz data.
Import the MusicBrainz data in AcousticBrainz database
MusicBrainz’s database contains a huge number of tables, but I analysed the use case of MB data in AB and made a list of those tables that we would actually require in our AcousticBrainz integrations. Then I made a PR (AB-338) for creating new tables in the AB database under the MusicBrainz schema. Later, I worked on a big PR (AB-340) which imports MB data corresponding to each and every recording present in AcousticBrainz’s database and writes the data into the tables of the MusicBrainz schema in AB. This PR was really huge and I had to take care of a lot of integrity constraints and foreign key dependencies.
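One common way to respect foreign key dependencies during an import like this is to fill tables in dependency order. The sketch below uses Python’s standard `graphlib` module for that. The table names and their dependencies are a simplified, hypothetical subset, and this is not necessarily how AB-340 orders its inserts.

```python
from graphlib import TopologicalSorter

# Hypothetical, simplified dependency map: each table lists the tables
# it references through foreign keys, which must be filled first.
FOREIGN_KEY_DEPS = {
    "artist": set(),
    "artist_credit": set(),
    "artist_credit_name": {"artist_credit", "artist"},
    "recording": {"artist_credit"},
    "release": {"artist_credit"},
}


def import_order(deps):
    """Return table names so that every table comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = import_order(FOREIGN_KEY_DEPS)
print(order)
```

Writing rows in this order means a referenced row always exists before the row that points at it, so the constraints never have to be disabled.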
Update MB data in AB for every new recording added to AB
Another feature I worked on after importing the MB data was updating the MB data present in AB whenever a new recording is added to the AcousticBrainz database (see PR AB-346), by importing the data from MB’s database via the direct connection. While working on a few bug fixes, my mentor Param and I realized that the MB data import was taking a lot more time than expected when I ran the MusicBrainz importer script on full MB data dumps (of around 2.8 GB). So, I then worked on making the MusicBrainz importer more efficient and was able to import the data for a few recordings within seconds (see PR AB-348). I had to figure out a lot for each table import and to detect the parts of the code which were making things slower.
To reduce the load on the processor, I included a sleep schedule of 5 seconds in the MusicBrainz importer module to wait before importing data for any new recording (see PR AB-354). During my GSoC period, I learned how important it is to write tests and make them run fast. I wrote tests for almost every script inside the db module. Later, I worked on writing tests for the MusicBrainz importer script (AB-352).
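The pause between imports can be sketched roughly as follows. The function and parameter names are hypothetical and not taken from the AcousticBrainz code. The `sleep` argument is injectable only so that tests do not have to wait for real.

```python
import time


def import_with_pause(recording_ids, import_one, pause_seconds=5.0, sleep=time.sleep):
    """Import recordings one at a time, pausing before each new one.

    import_one is a caller-supplied function (hypothetical here) that
    imports the MusicBrainz data for a single recording.
    """
    imported = []
    for index, recording_id in enumerate(recording_ids):
        if index > 0:
            # Give the processor a break before starting the next recording.
            sleep(pause_seconds)
        import_one(recording_id)
        imported.append(recording_id)
    return imported


# Demo with a zero-length pause so the example finishes immediately.
done = import_with_pause(["rec-1", "rec-2"], import_one=lambda r: None, pause_seconds=0)
print(done)
```

Passing the sleep function in also helps with the point about keeping tests fast: a test can record the requested pauses instead of actually sleeping.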
Apply replication packets to keep MB data in AB updated with the actual MusicBrainz database
Then came another tricky part of this project: updating the MusicBrainz schema data in AB whenever there is a change in the actual MusicBrainz database, whether it is an update or a deletion. MusicBrainz provides hourly replication packets which describe the changes to the database in a specific period. Replication packets are .tar.bz2 archives with a collection of files in them, which can be downloaded via the MetaBrainz API. Lukas Lalinsky, a long-time contributor to MetaBrainz projects, the founder of AcoustID and maintainer of the mbdata Python module, had worked on applying replication packets to MB data. I made a lot of modifications to his script so that it applies replication packets to the MusicBrainz schema data, up to its most recent update, for the recordings present in AcousticBrainz (see AB-350).
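Since a packet is just a .tar.bz2 archive, its contents can be inspected with Python’s standard `tarfile` module. This is only a sketch of opening such an archive. The file names inside the demo archive are made up, and the code does not download anything or apply any changes.

```python
import io
import tarfile


def list_packet_files(packet_bytes):
    """Return (name, size) for each regular file in a .tar.bz2 archive held in memory."""
    with tarfile.open(fileobj=io.BytesIO(packet_bytes), mode="r:bz2") as archive:
        return [(m.name, m.size) for m in archive.getmembers() if m.isfile()]


def build_demo_packet(files):
    """Build a small .tar.bz2 archive in memory (stand-in for a downloaded packet)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:bz2") as archive:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


# Made-up member names; a real packet has its own layout.
demo = build_demo_packet({"SEQUENCE": b"1\n", "data/changes": b"example row\n"})
print(list_packet_files(demo))
```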
Integration with MB database: Use MBID redirect information to get original entity
After working on the direct connection and importing the MusicBrainz data, and keeping it updated by all means, it was time to start writing evaluation scripts to decide the better method for any integration we apply in AcousticBrainz. I wrote a script to implement an integration in AB with the MB database which uses the redirect information of an entity and returns the original entity corresponding to the MBID provided (see PR AB-356).
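The idea of following a redirect to the original entity can be sketched with a plain dictionary standing in for the redirect information. In the real integration that information lives in the MusicBrainz database, so the mapping, the placeholder identifiers and the function name below are all assumptions made for illustration.

```python
def resolve_mbid(mbid, redirects, max_hops=10):
    """Follow redirect entries until reaching an MBID that is not redirected."""
    seen = set()
    current = mbid
    while current in redirects:
        if current in seen or len(seen) >= max_hops:
            raise ValueError(f"redirect loop or chain too long for {mbid}")
        seen.add(current)
        current = redirects[current]
    return current


# Placeholder identifiers, not real MBIDs.
demo_redirects = {"old-mbid-1": "current-mbid", "old-mbid-2": "old-mbid-1"}

print(resolve_mbid("old-mbid-2", demo_redirects))
print(resolve_mbid("current-mbid", demo_redirects))
```

An MBID without a redirect entry is returned unchanged, so callers can pass every identifier through the same function.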
Evaluate both methods of MusicBrainz database access in AcousticBrainz
Now for the last, and most important, piece of work of my GSoC period. After working on both methods, we needed to evaluate them to test which one is more efficient for any specific integration with the MB database. I first wrote an evaluation script which fetches data from the recording and low-level tables. For this case, the difference between the time taken by the two methods comes out to be really large (approx. 70 seconds for around 250+ recordings). So whenever we have to get data from both local AB tables and MB tables, we would go for the import database method, as it turns out to be faster than the other one. Next I tested the MBID redirect integration, where I didn’t find much difference between the two methods (PR AB-357). However, I ran these tests locally, so the tests in production may yield different results.
All in all, it has been an exciting summer. By this time I am familiar with a very good part of the AcousticBrainz codebase. I really look forward to working on adding a lot more integrations with MB data in AcousticBrainz and plan to completely remove AB’s dependency on the web service for using the MusicBrainz database, which would be very useful for the users.
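A comparison like this boils down to running both access methods on the same input and timing them. Below is a minimal sketch of such a harness. The two fetch functions are hypothetical stand-ins that touch no database, so the printed timings say nothing about the real methods.

```python
import time


def time_call(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over several runs of func."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def compare_methods(methods, *args, repeats=3):
    """Time each named callable on the same input and return {name: seconds}."""
    return {name: time_call(func, *args, repeats=repeats)
            for name, func in methods.items()}


# Hypothetical stand-ins for the two access methods.
def fetch_via_direct_connection(mbids):
    return [m.upper() for m in mbids]


def fetch_via_imported_tables(mbids):
    return list(mbids)


timings = compare_methods(
    {"direct": fetch_via_direct_connection, "imported": fetch_via_imported_tables},
    ["mbid-a", "mbid-b"],
)
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f}s")
```

Taking the best of several runs reduces noise from a busy local machine, which matters given that local results may not match production.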
Details of contributions made
By the end of the GSoC coding period, I have opened a total of 39 PRs: 35 to the AcousticBrainz server, 3 to BrainzUtils and 1 to the AcousticBrainz client. I have made a total of 135 commits (109 in AB, 9 in BU, 3 in AC and 14 in AB master). Of these, the pull requests created and merged during the official GSoC coding period are the PRs to the AcousticBrainz server and the PRs to BrainzUtils.
These last three months were full of thrill, excitement and much frustration as well. And this doesn’t end here: I’d love to contribute in the future and act as a maintainer for the AcousticBrainz project. I believe people should try to contribute to open source organizations, as it helps you learn and gain a lot of experience in a short period of time, especially when working through a great platform like Google Summer of Code.
I am really happy working with the awesome MetaBrainz community, and the people here are fantastic. I’d love to stay a part of MetaBrainz in the future as well. So in the end, a big thanks to my mentor Param Singh, without whose help and support throughout the program it wouldn’t have been possible for me to reach the end phase of GSoC, and to my organization admin Robert Kaye, AcousticBrainz project lead Alastair Porter and all of the MetaBrainz Foundation community members for choosing me as a GSoC student, for providing me with such a great opportunity, and for being very kind and helpful throughout the program. And I want to thank Google for making this all possible. Hope I get a chance to work with you all again!!
Note – There are no changes for Linux users, so they can safely skip this release if they want.
Given the massive feedback about the shortcomings of the Windows and macOS versions of Picard, we decided to do a minor release addressing some of the issues with our executables.
As usual, you can find the latest downloads on Picard’s Website.
The change-log is as follows –
- [PICARD-1283] – Fingerprinting not working on macOS in Picard 2.0
- [PICARD-1286] – Error creating SSL context on Windows
- [PICARD-1290] – Improve slow start up times by moving to a non single file exe
- [PICARD-1291] – Use an installer for Picard 2.x windows exe
Basically, the Windows executable is now a proper installer and some missing SSL dependencies are bundled with it.
The macOS builds also include the missing AcoustID fingerprinting binary.
The startup time for both the Windows and macOS versions has been improved as well.
Have fun tagging your files!
samj1912 signing off o/