jPSXdec is a cross-platform PlayStation 1 media decoder/converter.
Get the latest version here.

Thursday, August 19, 2010

Immaculate Decoding

Just writing a straightforward PlayStation 1 video decoder has been a lot of work. However, for the most impeccable quality possible, there is so much more that can be considered in the process.

Upsampling

When PlayStation videos are created, the pixels are broken up into luma (brightness) components and chroma (color) components. Like with JPEG and MPEG formats, 3/4 of the chroma information is thrown away because the human eye can't really tell (this is an example of lossy compression).
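To make the 3/4 figure concrete, here is a minimal Python sketch of that kind of chroma subsampling. It assumes a chroma plane stored as a list of rows with even dimensions. The function name and the choice to average each 2x2 block are illustrative assumptions, not necessarily what a PlayStation encoder does.

```python
def subsample_chroma(chroma):
    """Reduce a chroma plane to one sample per 2x2 block (assumed: plain averaging)."""
    height, width = len(chroma), len(chroma[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (chroma[y][x] + chroma[y][x + 1] +
                     chroma[y + 1][x] + chroma[y + 1][x + 1])
            row.append(total / 4.0)
        out.append(row)
    return out


# A 4x4 plane (16 samples) shrinks to 2x2 (4 samples): 3/4 of the samples are gone.
plane = [[10, 20, 30, 40],
         [10, 20, 30, 40],
         [50, 60, 70, 80],
         [50, 60, 70, 80]]
small = subsample_chroma(plane)
```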

When decoding, that lost chroma information needs to be recreated somehow to convert the pixels back into RGB. Unfortunately there is no one 'right' way to do it, because there's really no way to get that lost information back. All you can do is 'guess' by filling in the blanks based on the information around the pixels using some kind of interpolation. Some of the most well known kinds of interpolation are: nearest neighbor, bilinear, bicubic, and Lanczos. I've read about more advanced chroma upsampling approaches that also take into account the luma component. This works because there is often a lot of correlation between the luma and chroma components--when the luma changes, the chroma probably will also. I'd like to try to find the best one, but I haven't had much luck finding good resources about them all.
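As a rough illustration of the two simplest options, the sketch below doubles a chroma plane with nearest neighbor and with bilinear interpolation. It assumes each chroma sample sits in the center of its 2x2 group of pixels and that edges are handled by clamping. Both are assumptions made for the example, and the function names are hypothetical.

```python
def upsample_nearest(chroma):
    """Repeat every chroma sample over a 2x2 block of pixels."""
    out = []
    for row in chroma:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def upsample_bilinear(chroma):
    """Double the plane size, blending the nearest chroma samples by distance."""
    height, width = len(chroma), len(chroma[0])

    def sample(cy, cx):
        # Clamp to the plane so edge pixels reuse the outermost samples.
        cy = min(max(cy, 0.0), height - 1.0)
        cx = min(max(cx, 0.0), width - 1.0)
        y0, x0 = int(cy), int(cx)
        y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
        fy, fx = cy - y0, cx - x0
        top = chroma[y0][x0] * (1 - fx) + chroma[y0][x1] * fx
        bottom = chroma[y1][x0] * (1 - fx) + chroma[y1][x1] * fx
        return top * (1 - fy) + bottom * fy

    # Pixel (x, y) maps to chroma coordinate (x / 2 - 0.25, y / 2 - 0.25)
    # when chroma samples are centered in their 2x2 pixel group.
    return [[sample(y / 2 - 0.25, x / 2 - 0.25) for x in range(2 * width)]
            for y in range(2 * height)]


hard = upsample_nearest([[0, 100]])    # keeps the hard step between samples
smooth = upsample_bilinear([[0, 100]])  # ramps between the two samples
```

Nearest neighbor leaves a blocky step, while bilinear produces a ramp. Neither recovers the discarded information. They only differ in how they guess.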

Now, because this is essentially just scaling of a 2D image, I've been worried about this article that points out a nasty little gremlin called gamma correction. It seems nearly everyone has been doing image scaling wrong since the popularization of the sRGB gamma corrected color space. I'm assuming video isn't immune to the same problem, yet I've never seen anyone mention it.
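The gamma issue is easiest to see by averaging two pixels. The sketch below uses the standard sRGB transfer functions on 8-bit values. Whether decoded PlayStation video should be treated exactly like sRGB is an assumption here, and the helper names are made up for the example.

```python
def srgb_to_linear(value):
    """Convert an 8-bit sRGB value to linear light in the range 0..1."""
    c = value / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(linear):
    """Convert linear light in the range 0..1 back to an sRGB value in 0..255."""
    if linear <= 0.0031308:
        c = linear * 12.92
    else:
        c = 1.055 * linear ** (1 / 2.4) - 0.055
    return c * 255.0


def naive_average(a, b):
    """Average the stored (gamma encoded) values directly."""
    return (a + b) / 2.0


def gamma_aware_average(a, b):
    """Average in linear light, then re-encode."""
    return linear_to_srgb((srgb_to_linear(a) + srgb_to_linear(b)) / 2.0)


naive = naive_average(0, 255)
aware = gamma_aware_average(0, 255)
```

Averaging black and white directly gives 127.5, while averaging in linear light gives a value of roughly 187.5, which is noticeably brighter. Any interpolation built on averaging neighboring samples inherits the same difference.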

Deblocking

Assuming we find the upsampling method of choice, there are still ways the image can be improved. Most video codecs break the frames down into 'blocks', then encode each block separately--again losing information along the way. When everything is reconstructed, that lost information can often be seen as visible distortions between blocks. This problem has been addressed in more recent video codecs such as H.264, but is still a problem with the older MPEG2. I believe nearly all DVD players do some deblocking before showing the final frame.
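To give a feel for the idea, here is a deliberately simple deblocking sketch for one row of pixels. It is not the filter from H.264 or from any DVD player. The block size of 8, the threshold, and the quarter-step smoothing are all assumptions picked for illustration.

```python
def deblock_row(row, block=8, threshold=12):
    """Soften small steps at block boundaries and leave strong edges alone."""
    out = list(row)
    for edge in range(block, len(row), block):
        left, right = row[edge - 1], row[edge]
        step = right - left
        # A small jump exactly on a block boundary is probably a coding
        # artifact; a large jump is more likely a real edge in the picture.
        if abs(step) < threshold:
            out[edge - 1] = left + step / 4.0
            out[edge] = right - step / 4.0
    return out


blocky = [100] * 8 + [108] * 8
softened = deblock_row(blocky)

real_edge = [0] * 8 + [200] * 8
untouched = deblock_row(real_edge)
```

The threshold is the crux of the design: set it too low and the artifacts stay, set it too high and real detail gets blurred.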

Even though MPEG2 has been around a long time and deblocking is so common, I've had the darndest time trying to find much mention of what deblocking algorithms are in use today. UnBlock and this page on JPEG Post-Processing are the best I've come by. I think I've read somewhere that some advanced deblockers can even make use of the original MPEG2 data to improve the deblocking.

I still consider myself a multimedia novice, so there are probably more post-processing methods that would really make the output shine. A big bummer among all research in that area is that if you can think it, you can pretty much count on it having been patented.


Given how difficult all this stuff is, I really really wish I could just pass that problem off to the big players in the field, such as ffmpeg (i.e. libavcodec). I've even considered writing a PSX video to MPEG2 video translator so the MPEG2 video can be fed into ffmpeg. Unfortunately there are some big reasons why doing this still makes me uneasy.

IDCT

The PlayStation uses its own particular IDCT approach that I've never seen anywhere else. It is important that the IDCT used to decode the video matches the DCT used to encode it, and no existing decoder reproduces the PlayStation's version (except jPSXdec of course).
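For reference, this is the textbook floating-point 8x8 IDCT that decoders approximate in one way or another. It is not the PlayStation's variant, which is not described here. It only shows the baseline that any particular implementation's rounding would be measured against.

```python
import math


def idct_8x8(coeffs):
    """Textbook 8x8 inverse DCT (slow, floating point, for reference only)."""

    def scale(k):
        # The zero-frequency term gets an extra 1/sqrt(2) factor.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            total = 0.0
            for v in range(8):
                for u in range(8):
                    total += (scale(u) * scale(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[y][x] = total / 4
    return out


# A block with only a DC coefficient decodes to a flat block of DC / 8.
dc_only = [[0.0] * 8 for _ in range(8)]
dc_only[0][0] = 80.0
flat = idct_8x8(dc_only)
```

Two decoders that round the intermediate values differently will produce slightly different pixels from the same coefficients, which is why the exact IDCT matters.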

Differences in YCbCr

Another worry is that a really good quality MPEG2 decoder will spatially position the chroma components in the proper location (horizontally co-sited with every other luma sample, and vertically in between luma rows) as opposed to how I believe PSX does it (the MPEG1 way: centered in between luma samples in both directions).
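The difference between the two conventions comes down to a half-pixel horizontal offset. The sketch below returns where chroma sample (i, j) sits on the luma grid under each convention, using the usual MPEG1 and MPEG2 4:2:0 siting. The function name and the string labels are hypothetical.

```python
def chroma_position(i, j, siting):
    """Return the (x, y) position of chroma sample (i, j) in luma coordinates."""
    if siting == "mpeg1":
        # Centered among the four luma samples it covers.
        return (2 * i + 0.5, 2 * j + 0.5)
    if siting == "mpeg2":
        # Same column as the left luma samples, halfway between the two rows.
        return (2 * i, 2 * j + 0.5)
    raise ValueError("unknown siting: " + siting)


mpeg1_pos = chroma_position(1, 1, "mpeg1")
mpeg2_pos = chroma_position(1, 1, "mpeg2")
```

Feeding MPEG1-sited chroma to a decoder that assumes MPEG2 siting would shift the color half a luma pixel sideways relative to the brightness.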

To make things a bit more complicated, MPEG2 uses the proper Rec.601 Y'CbCr color space with [16-235] luma, and [16-240] chroma range. PSX, on the other hand, uses the full [0-255] range for color information. Many video converters don't handle that discrepancy very well. Related to that, pretty much all converters store the data as integers, so any fractional information is lost after every conversion. In contrast, jPSXdec maintains all that fractional information until the very end.
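The cost of rounding to integers between conversions can be shown with the luma range alone. This sketch scales full-range [0-255] luma into [16-235] and back, once keeping fractions and once rounding at each step. The function names are made up for the example, and it says nothing about how jPSXdec or any other converter is actually implemented.

```python
def full_to_limited_luma(y):
    """Scale full-range luma [0-255] into the Rec.601 range [16-235]."""
    return 16 + y * 219 / 255


def limited_to_full_luma(y):
    """Scale Rec.601 luma [16-235] back to full range [0-255]."""
    return (y - 16) * 255 / 219


# Keeping fractions: every value survives the round trip.
float_errors = [abs(limited_to_full_luma(full_to_limited_luma(y)) - y)
                for y in range(256)]

# Rounding to integers at each step: 256 inputs are squeezed into
# 220 codes, so some inputs can no longer be told apart.
lost = [y for y in range(256)
        if round(limited_to_full_luma(round(full_to_limited_luma(y)))) != y]
```

Because 256 input levels have to share 220 integer codes, at least 36 of them cannot come back unchanged, and that is after a single round trip.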

In general though, I have not been impressed with ffmpeg's quality, so I can't suggest people use it when looking for good quality.


One advantage that comes when incorporating all these enhancements in jPSXdec is it provides a much nicer user experience. No need to be hopping between multiple tools to get the best results.

So if I were to actually implement all these features, where would I get the information I lack? Perhaps the doom9.org forums could help. If any multimedia gurus happen to pass by this post, please, if you could, toss some wisdom my way.