An image is a lot more complicated than a pair of ids and a rating. Counting the number of rows in the training database is misleading. I can build a reasonable dataset for a prediction task from a set of 100M rows from a database that I maintain in my spare time (http://councilroom.com , predict player actions given partial game states).
Don't get me wrong, the Netflix prize was cool.
What's cool about this is that Google hasn't given the learning system a high level task. They basically say, figure out a lossy compression for these 10 million images. And then when they examine that compression method, they find that it can effectively generate human faces and cats.
"An image is a lot more complicated than a pair of ids and a rating."
Predicting someone's reaction to a given movie is a lot more complicated than a pair of IDs and a rating, too, it turns out.
Let's take the speculation out of this.
You can get features of an image with simple large-blob detection; four restricted Boltzmann machines with half a dozen wires each can find the corners of a nose-bounding trapezoid quite easily. They'll get the job done in less than the 1/30 sec screen frame on the limited Z80 knockoff in the original dot-matrix Game Boy. You'll get better than 99% prediction accuracy. It takes about two hours to write the code, and you can train it with 20 or 30 examples, unsupervised. I know, because I've done it.
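To make the blob-detection claim concrete, here's a minimal sketch (my own toy illustration, not the code described above): threshold the dark pixels of a tiny grayscale grid and take their extremes as a bounding box.

```python
# Toy large-blob bounding-box detection on a grayscale "image" (a 2D list).
# Threshold the dark pixels, then take the min/max row and column of the blob.

def blob_bbox(img, threshold=128):
    """Return (top, left, bottom, right) of pixels darker than threshold."""
    hits = [(r, c) for r, row in enumerate(img)
                   for c, v in enumerate(row) if v < threshold]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))

# A 5x5 toy image with a dark 2x2 blob in the middle.
img = [[255] * 5 for _ in range(5)]
for r in (2, 3):
    for c in (1, 2):
        img[r][c] = 10

print(blob_bbox(img))  # (2, 1, 3, 2)
```

A real detector would smooth and size-filter the blobs first, but the point stands: the geometry is cheap.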
On the other hand, getting 90% prediction accuracy from movie rating results takes teams of professional researchers years of work.
.
"I can build a reasonable dataset for a prediction task from a set of 100M rows from a database that I maintain in my spare time"
And you won't get anywhere near the prediction accuracy I will with noses. That's the key point here.
It's not enough to say "you can do the job." If you want to say one is harder than the other, you actually have to compare the quality of the results.
There is no meaningful discussion of difficulty without discussion of success rates.
I mean, I can "detect" noses in anything just by returning 0, if you ignore accuracy.
.
"What's cool about this is that Google hasn't given the learning system a high level task."
Yes it has. Feature detection is a high level task.
.
"They basically say, figure out a lossy compression for these 10 million images."
I have never heard a compelling explanation of the claim that locating a bounding box is a form of lossy compression. It is my opinion that this is a piece of false wisdom that people believe because they've heard it often and have never really thought it over.
Typically, someone bumbles out phrases like "information theory" and then completely fails to show any form of the single important characteristic of lossy compression: reconstructibility.
Which, again, is wholly defined by error rate.
Which, again, is what you are casually ignoring while making the claim that finding bounding boxes is harder than predicting human preferences.
Which is false.
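To make the reconstructibility point concrete, here's a toy sketch of an actual lossy compressor (a 2-bit quantizer, my own illustrative example). It is judged precisely by its reconstruction error, which is the measure a bare bounding box cannot offer.

```python
# Lossy compression is judged by how well you can reconstruct the signal.
# Here the "compressor" quantizes 8-bit values into 4 bins (2 bits each);
# the quality metric is the mean absolute reconstruction error.

def compress(signal, levels=4):
    step = 256 // levels
    return [v // step for v in signal]            # lossy: keeps 2 bits per value

def reconstruct(codes, levels=4):
    step = 256 // levels
    return [c * step + step // 2 for c in codes]  # midpoint of each bin

signal = [0, 17, 64, 130, 200, 255]
codes = compress(signal)
recon = reconstruct(codes)
err = sum(abs(a - b) for a, b in zip(signal, recon)) / len(signal)
print(codes, recon, round(err, 1))
```

The codes alone let you rebuild an approximation of the input; a bounding box alone lets you rebuild nothing, which is the whole objection.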
.
"they find that it can effectively generate human faces and cats."
Filling in bounding boxes isn't generation. It's just paint-by-numbers geometry. This is roughly equivalent to using a point detector to find the largest error against a mesh, using those points to select Voronoi regions, filling each region with the color at its point, and then suggesting that this is also a form of compression, and that drawing the resulting dataset is generation.
And it isn't, because it isn't signal reductive.
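The Voronoi analogy can be sketched directly (hypothetical toy code, purely illustrative): assign every pixel to its nearest seed point and paint it with that seed's color.

```python
# Toy Voronoi fill: pick seed points with colors, assign every pixel to its
# nearest seed, and paint the region with the seed's color. The result is a
# crude redrawing of the image, not signal-reductive generation.

def voronoi_fill(width, height, seeds):
    """seeds: list of ((x, y), color). Returns a row-major grid of colors."""
    def nearest(x, y):
        return min(seeds, key=lambda s: (s[0][0] - x) ** 2 + (s[0][1] - y) ** 2)[1]
    return [[nearest(x, y) for x in range(width)] for y in range(height)]

grid = voronoi_fill(4, 2, [((0, 0), "red"), ((3, 1), "blue")])
print(grid)  # [['red', 'red', 'blue', 'blue'], ['red', 'red', 'blue', 'blue']]
```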
Here, I made one for you, so you could see the difference. Those are my friends Jeff and Joelle. Say hi. The code is double-sloppy, but it makes the point.
The person who invented Boltzmann machines *is* the inventor of this technique. He invented Boltzmann machines in the '80s and spent over 20 years trying to get them to actually work on difficult tasks.
Your rant about this not being compression or whatever you're trying to say is completely off the mark. You don't seem to understand what this work is about.
The Netflix challenge is a supervised learning challenge: you have lots of labeled data. This technique is about using unlabeled data.
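A minimal sketch of that distinction, with made-up data standing in for ratings and images (not the actual datasets):

```python
# Supervised learning consumes (input, label) pairs; unsupervised pretraining
# consumes inputs alone and models their structure.

labeled = [([5.0, 1.0], "liked"), ([1.0, 4.5], "disliked")]  # Netflix-style
unlabeled = [[5.1, 0.9], [0.8, 4.7], [4.9, 1.2]]             # image-style

# Supervised: every example carries a target to predict.
xs = [x for x, _ in labeled]
ys = [y for _, y in labeled]

# Unsupervised: learn structure from inputs only -- here just per-feature
# means, a stand-in for the statistics an RBM would actually model.
means = [sum(col) / len(unlabeled) for col in zip(*unlabeled)]
print(ys, [round(m, 2) for m in means])
```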
(Side note: at one point, Geoff Hinton and his group, using this technique, had the best result in the Netflix challenge, but were beaten out by ensembles of algorithms.)
Cyc has nothing to do with this, and it is a huge failure at AI.
tl;dr: Having read your comments, you don't seem to know what you're talking about, and you readily discount some of the most prominent machine learning researchers in the world today. You're obscuring important results that newcomers might have found interesting to follow up on.