It would be a fantastic hack if they were using some cool algorithms or machine learning to separate the sounds, analyze their timing, and then dynamically generate the animation from that. I would be genuinely interested in discussing that.
Unfortunately, that's not what this is. It seems to be a simple (albeit well-done) CG video set to some music.
Here's a Snopes article about it:
http://www.snopes.com/photos/arts/musicmachine.asp