
The problem is not the theoretical peak teraflops. The problem is actually achieving those teraflops with useful work. Because of architectural differences, that is easier on a CPU than on a GPU, so you can't directly compare peak teraflops and conclude that GPUs are superior. Getting something to run fast on a GPU is very difficult.

And actually, the one that does 4.5 teraflops in single precision does only 95 gigaflops in double precision per GPU. A good x86 CPU does ~100 gigaflops in double precision as well, and you're much more likely to actually achieve that number on an x86. That said, another one on the page you linked to theoretically does 665 gigaflops in double precision.
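To make the arithmetic concrete, here is a rough sketch using only the peak figures quoted above. The utilization fraction passed to the helper is a made-up illustration, not a measurement, and the names are hypothetical.

```python
# Peak figures quoted in the comment above (theoretical, not measured).
GPU_SP_GFLOPS = 4500.0        # 4.5 teraflops, single precision
GPU_DP_GFLOPS = 95.0          # same part, double precision, per GPU
CPU_DP_GFLOPS = 100.0         # "good x86 CPU", double precision
OTHER_GPU_DP_GFLOPS = 665.0   # the other part on the linked page


def breakeven_utilization(gpu_peak, cpu_peak, cpu_utilization):
    """Fraction of GPU peak needed to match the CPU's achieved throughput.

    cpu_utilization is an assumed fraction of CPU peak actually reached.
    """
    return cpu_peak * cpu_utilization / gpu_peak


# How much the first GPU slows down going from single to double precision.
sp_to_dp_ratio = GPU_SP_GFLOPS / GPU_DP_GFLOPS

# Assumption for illustration only: the CPU code reaches 80% of its peak.
needed = breakeven_utilization(OTHER_GPU_DP_GFLOPS, CPU_DP_GFLOPS, 0.8)

print(f"single/double peak ratio: {sp_to_dp_ratio:.1f}x")
print(f"665-gigaflop GPU must reach {needed:.1%} of peak to match the CPU")
```

The point is only that the comparison depends on the fraction of peak each side achieves, which the peak numbers themselves do not tell you.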



Single precision is probably fine for a neural network. Neural networks are somewhat insensitive to noise and failure, and single precision adds very little noise.
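For a sense of how little noise that is, here is a small sketch that rounds a few values to single precision and measures the relative error. The sample weights are arbitrary illustrative values, and this only shows the rounding of stored numbers, not how errors accumulate through a whole network.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def relative_rounding_error(x):
    """Relative error introduced by storing x in single precision."""
    return abs(to_float32(x) - x) / abs(x)


# Arbitrary example values standing in for network weights (assumption).
weights = [0.1, -0.37, 1.2345678, 3.14159265, 42.0]

worst = max(relative_rounding_error(w) for w in weights)

# Unit roundoff for IEEE 754 single precision (24-bit significand).
FLOAT32_UNIT_ROUNDOFF = 2.0 ** -24

print(f"worst relative error: {worst:.3e}")
print(f"float32 unit roundoff: {FLOAT32_UNIT_ROUNDOFF:.3e}")
```

For normal-range values the relative error from rounding to single precision stays at or below 2^-24, roughly 6e-8.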



