jPSXdec is a cross-platform PlayStation 1 media decoder/converter.
Get the latest version here.

Saturday, March 13, 2010

IDCT Demystified (a little)

The inverse discrete cosine transform is a very mysterious and intimidating equation.

(apologies if I messed up the notation)
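
For reference, the 8x8 2D IDCT is commonly written in the form below. The symbol names are assumptions taken from the usual textbook convention: f for the output pixels, F for the input coefficients, and C for the scale factor.

```latex
f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7}
         C(u)\, C(v)\, F(u,v)\,
         \cos\frac{(2x+1)u\pi}{16}\,
         \cos\frac{(2y+1)v\pi}{16},
\qquad
C(k) = \begin{cases} 1/\sqrt{2} & k = 0 \\ 1 & k > 0 \end{cases}
```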

For the longest time I let the IDCT remain a black box. I found a handful of Java IDCT implementations, plugged them in, and crossed my fingers.

I know what the 2D DCT does: it pushes all the image data to the top left corner of the block, while the IDCT undoes that magic. I'm not sure how it does this, but just knowing what it does is enough for me.

But recently I finally discovered that the IDCT is simply a couple of matrix multiplications.

idct_matrix^T . coefficients . idct_matrix
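
A minimal Java sketch of that product, for illustration only. The class and method names (MatrixIdct, idctMatrix, idct) are made up for this example and are not taken from jPSXdec, and the matrix entries assume the usual 8x8 definition c(u)/2 * cos((2x+1)*u*PI/16) with c(0) = 1/sqrt(2).

```java
// Sketch: the 8x8 IDCT written as transpose(M) . coefficients . M.
// All names here are hypothetical; nothing is optimized.
final class MatrixIdct {
    static final int N = 8;

    /** Builds M[u][x] = c(u)/2 * cos((2x+1)*u*PI/16), with c(0) = 1/sqrt(2), c(u) = 1 otherwise. */
    static double[][] idctMatrix() {
        double[][] m = new double[N][N];
        for (int u = 0; u < N; u++) {
            double c = (u == 0) ? Math.sqrt(0.5) : 1.0;
            for (int x = 0; x < N; x++) {
                m[u][x] = c / 2.0 * Math.cos((2 * x + 1) * u * Math.PI / 16.0);
            }
        }
        return m;
    }

    static double[][] transpose(double[][] a) {
        double[][] t = new double[N][N];
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                t[j][i] = a[i][j];
            }
        }
        return t;
    }

    /** Plain triple-loop matrix product of two 8x8 matrices. */
    static double[][] multiply(double[][] a, double[][] b) {
        double[][] out = new double[N][N];
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                double sum = 0;
                for (int k = 0; k < N; k++) {
                    sum += a[i][k] * b[k][j];
                }
                out[i][j] = sum;
            }
        }
        return out;
    }

    /** The whole 2D IDCT: two matrix multiplications. */
    static double[][] idct(double[][] coefficients) {
        double[][] m = idctMatrix();
        return multiply(multiply(transpose(m), coefficients), m);
    }

    public static void main(String[] args) {
        double[][] coefficients = new double[N][N];
        coefficients[0][0] = 8.0; // only the top-left (DC) coefficient is set
        double[][] pixels = idct(coefficients);
        // A DC-only block should decode to a flat block; every value is approximately 1.0.
        System.out.println(pixels[0][0] + " " + pixels[7][7]);
    }
}
```

A block with only the top-left coefficient set decodes to a flat block, which is a handy sanity check when wiring this up.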

The IDCT equation doesn't really suggest that to the casual mathematician. Of course, if you have taken a class or paid for a book on the subject, this may be old news to you.

For those uninformed like me, let's take a closer look at this IDCT matrix.
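
In case it helps, here is how the matrix entries are usually defined, and how the IDCT double sum regroups into a matrix product. The symbols M, F, f, and C are assumed textbook names, not notation from any particular implementation.

```latex
M_{u,x} = \frac{C(u)}{2} \cos\frac{(2x+1)u\pi}{16},
\qquad
C(0) = \frac{1}{\sqrt{2}}, \quad C(u) = 1 \ \text{for } u > 0

f(x,y) = \sum_{u=0}^{7} \sum_{v=0}^{7} M_{u,x}\, F(u,v)\, M_{v,y}
       = \left( M^{T} F\, M \right)_{x,y}
```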

Theoretically we could throw a bunch of trigonometric identities at this matrix to simplify it, but it turns out to be so much easier to just calculate it and see which decimal values are the same. In the end, ignoring sign, there turn out to be only 7 unique values (listed here in varying forms).

1/sqrt(8)       =  cos(  PI/ 4)/2
cos(1*PI/16)/2 = cos( PI/16)/2 = sqrt(2+sqrt(2+sqrt(2)))/4
cos(2*PI/16)/2 = cos( PI/ 8)/2 = sqrt(2+sqrt(2))/4
cos(3*PI/16)/2 = cos(3*PI/16)/2 = sqrt(2+sqrt(2-sqrt(2)))/4
cos(5*PI/16)/2 = cos(5*PI/16)/2 = sqrt(2-sqrt(2-sqrt(2)))/4
cos(6*PI/16)/2 = cos(3*PI/ 8)/2 = sqrt(2-sqrt(2))/4
cos(7*PI/16)/2 = cos(7*PI/16)/2 = sqrt(2-sqrt(2+sqrt(2)))/4
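
The "just calculate it" approach can be sketched in a few lines of Java. This is an illustration rather than the exact procedure used for the list above: the class name UniqueIdctValues is made up, the matrix entries assume the usual c(u)/2 * cos((2x+1)*u*PI/16) definition, and values are compared by absolute value after rounding to 9 decimal places so that floating-point noise does not split equal values.

```java
import java.util.TreeSet;

// Sketch: compute all 64 matrix entries and collect the distinct magnitudes.
final class UniqueIdctValues {
    static final int N = 8;

    /** Returns the distinct |entry| values, each scaled by 1e9 and rounded to a long. */
    static TreeSet<Long> uniqueMagnitudes() {
        TreeSet<Long> seen = new TreeSet<>();
        for (int u = 0; u < N; u++) {
            double c = (u == 0) ? Math.sqrt(0.5) : 1.0;
            for (int x = 0; x < N; x++) {
                double value = c / 2.0 * Math.cos((2 * x + 1) * u * Math.PI / 16.0);
                seen.add(Math.round(Math.abs(value) * 1e9));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        TreeSet<Long> magnitudes = uniqueMagnitudes();
        System.out.println(magnitudes.size() + " unique magnitudes");
        for (long scaled : magnitudes) {
            System.out.println(scaled / 1e9);
        }
    }
}
```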

Now the IDCT matrix can be simplified to this:

Taking things a step further, let's multiply out the two matrix products symbolically (Maxima is awesome). After a lot of trigonometric simplification, the result is a massive matrix. This tiny portion below resembles what the entire matrix looks like.

You can download the full 30,000 pixel wide image if you dare.

All those additions/subtractions help to explain why fast IDCT implementations consist of so many sums and only occasional multiplications.
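
As a small taste of how sums can replace multiplications, here is a Java sketch of a single 1D pass that uses just one symmetry of the matrix: M[u][7-x] equals M[u][x] for even u and -M[u][x] for odd u. It is not any particular published fast IDCT, and the names (SymmetricIdct1D, direct, viaSymmetry) are made up for this example. Real fast implementations push the same idea much further.

```java
// Sketch: one 8-point IDCT pass, computed directly and via the even/odd symmetry.
final class SymmetricIdct1D {
    static final int N = 8;
    static final double[][] M = buildMatrix();

    /** M[u][x] = c(u)/2 * cos((2x+1)*u*PI/16), with c(0) = 1/sqrt(2), c(u) = 1 otherwise. */
    static double[][] buildMatrix() {
        double[][] m = new double[N][N];
        for (int u = 0; u < N; u++) {
            double c = (u == 0) ? Math.sqrt(0.5) : 1.0;
            for (int x = 0; x < N; x++) {
                m[u][x] = c / 2.0 * Math.cos((2 * x + 1) * u * Math.PI / 16.0);
            }
        }
        return m;
    }

    /** Direct evaluation: 64 multiplications. */
    static double[] direct(double[] in) {
        double[] out = new double[N];
        for (int x = 0; x < N; x++) {
            for (int u = 0; u < N; u++) {
                out[x] += M[u][x] * in[u];
            }
        }
        return out;
    }

    /** Same result with 32 multiplications: outputs x and 7-x share their partial sums. */
    static double[] viaSymmetry(double[] in) {
        double[] out = new double[N];
        for (int x = 0; x < N / 2; x++) {
            double even = 0; // contribution of coefficients 0, 2, 4, 6
            double odd = 0;  // contribution of coefficients 1, 3, 5, 7
            for (int u = 0; u < N; u += 2) {
                even += M[u][x] * in[u];
                odd += M[u + 1][x] * in[u + 1];
            }
            out[x] = even + odd;         // left half of the output
            out[N - 1 - x] = even - odd; // mirrored right half, no new multiplications
        }
        return out;
    }

    public static void main(String[] args) {
        double[] coefficients = {10, -3, 2, 0.5, 0, 1, -1, 4};
        double[] a = direct(coefficients);
        double[] b = viaSymmetry(coefficients);
        for (int x = 0; x < N; x++) {
            System.out.println(a[x] + " vs " + b[x]);
        }
    }
}
```

The two methods should agree up to floating-point rounding; the second one simply trades half of the multiplications for one extra addition and one extra subtraction per output pair.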

The bare math still makes it difficult to identify patterns, so I took things to the extreme and visualized it a bit.
