Very nice! I wasn't aware of this algorithm at all, but it seems to be an early application of the lifting scheme to the DCT, eight years before lifting was introduced in the more general context of perfect-reconstruction filters.
Interestingly enough, Sweldens does not seem to cite Loeffler's paper, so it's likely that they came up with the same method independently of one another.
Doing the DCT the naive way, the formula is as easy as the DFT's. But computing the DCT quickly (in O(n log n) time) is much harder to understand than the FFT.
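For reference, here is what the naive approach looks like: a direct O(n^2) evaluation of the DCT-II sum. Normalization conventions vary between libraries, so this sketch uses the unnormalized form.

```python
import math

def naive_dct2(x):
    """Unnormalized DCT-II: X[k] = sum_j x[j] * cos(pi/n * (j + 0.5) * k). O(n^2)."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi / n * (j + 0.5) * k) for j in range(n))
            for k in range(n)]

# A constant input concentrates all energy in the DC (k = 0) coefficient:
print(naive_dct2([1.0, 1.0, 1.0, 1.0]))  # ~[4.0, 0.0, 0.0, 0.0]
```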
I read a couple of papers and implemented some of the fast DCT algorithms. The Arai–Agui–Nakajima 8-point DCT uses 13 multiplications. Lee's DCT algorithm is recursive and works on any power-of-2 length. https://www.nayuki.io/page/fast-discrete-cosine-transform-al...
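To give a flavor of the recursive splitting, here is a sketch of a Lee-style power-of-two DCT in the unnormalized DCT-II convention. The structure follows the standard description of Lee's decomposition; the exact scaling convention is an assumption here, chosen to match the plain DCT-II sum.

```python
import math

def lee_dct(x):
    """Recursive unnormalized DCT-II for power-of-two lengths, O(n log n)."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # Butterfly: mirrored sums feed the even-index outputs,
    # scaled mirrored differences feed the odd-index outputs.
    alpha = [x[i] + x[n - 1 - i] for i in range(half)]
    beta = [(x[i] - x[n - 1 - i]) / (2.0 * math.cos((i + 0.5) * math.pi / n))
            for i in range(half)]
    alpha = lee_dct(alpha)
    beta = lee_dct(beta)
    # Interleave: even outputs come from alpha, odd outputs from adjacent beta sums.
    result = []
    for i in range(half - 1):
        result.append(alpha[i])
        result.append(beta[i] + beta[i + 1])
    result.append(alpha[half - 1])
    result.append(beta[half - 1])
    return result
```

Cross-checking the output against the direct O(n^2) sum on a length-8 vector is a quick way to convince yourself the recursion is right.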
I'm always sad that scipy and numpy do not use good FFT implementations due to ridiculous political concerns. However, in this case, there was a very happy outcome with this beautiful article!
I Googled around a bit and didn't find anything on political concerns other than the GPL thing (which would cover linking, not a reimplementation of the math). Is that the political concern in question, or is there also something else?
Yes, I was referring to that. They refuse to use the standard and state-of-the-art FFTW3 library because it's distributed under the GPL. (Actually, it's dual-licensed GPL/commercial).
There's a great expository paper about these kinds of filter implementations by Daubechies and Sweldens in the general case: https://9p.io/who/wim/papers/factor/factor.pdf
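The key fact underlying that line of work, for anyone curious: every 2x2 plane rotation (the butterflies in a DCT flowgraph) factors into three shear ("lifting") steps, each trivially invertible even under rounding, which is what enables integer-to-integer transforms. A quick numerical check of that factorization (my own illustration, not code from the paper):

```python
import math

def rotate(theta, x, y):
    """Direct plane rotation by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x - s * y, s * x + c * y

def lifted_rotate(theta, x, y):
    """Same rotation as three lifting (shear) steps:
    [[c,-s],[s,c]] = [[1,(c-1)/s],[0,1]] [[1,0],[s,1]] [[1,(c-1)/s],[0,1]]."""
    c, s = math.cos(theta), math.sin(theta)
    t = (c - 1.0) / s  # shear coefficient (assumes sin(theta) != 0)
    x = x + t * y      # first lifting step
    y = y + s * x      # second lifting step
    x = x + t * y      # third lifting step
    return x, y

print(rotate(0.7, 2.0, 3.0))
print(lifted_rotate(0.7, 2.0, 3.0))  # agrees to floating-point precision
```

In an integer transform, each lifting step's multiply-and-add is rounded before the addition; since each step only adds a function of one variable to the other, the rounding can be undone exactly on inversion.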