I'm still learning about audio in PlayStation games (Nocash Playstation Specifications is amazing!). The XA ADPCM audio format, frequently seen with STR videos, is what I'm most familiar with. But the PlayStation also has a Sound Processing Unit (SPU) that is used to play all other audio you hear in the game.
XA audio is easy to identify and convert. SPU audio, often referred to as "VAG" ("Very Audio Good"), isn't so easy. The easiest clips to identify are simple sound effects that are played and then end. It gets a little more difficult with audio clips that need to loop. Where it gets impossible to identify is when multiple audio clips are combined to form unique sounds in real-time. I believe this is called "SEQ" and is how a lot of background music is done in games. Each instrument is actually little sound clips being played at different frequencies. This brings up another challenge with all SPU audio: clips can be played at any frequency, and there isn't any way to know what it is.
One case where a game used instrumental audio along with STR video is the Valkyrie Profile opening FMV. jPSXdec can only identify the video clip, but has no way to recreate the music. Thankfully, diligent people have put a lot of effort into extracting these instrumental kinds of audio. These are stored in what's known as PSF files. Lo and behold, someone has taken the time extract the Valkyrie Profile instrumental music into PSF files.
With my growing knowledge around PlayStation audio, I thought it would be fun to create a very high quality conversion of the Valkyrie Profile opening. I assume using a PSF converter could produce better quality audio than what you can get on hardware or emulators. jPSXdec can extract the video with the best possible quality which can be made even better with other tools.
- Extracted Valkyrie Profile opening video with jPSXdec in avi:jyuv format. This is YUV using the [0-255] component range.
- Downloaded Valkyrie Profile psf audio clips.
- Used Audio Overload to convert opening video PSF to wav.
- Used VirtualDub to mux the video and audio into a single avi, with a 1 second audio delay to sync them up correctly.
Created an Avisynth script to upscale the video to HD quality and convert to RGB. DGMPGDec plugin was used for deblocking and nnedi3 plugin for scaling.
Used ffmepg and this script to convert to an uncompressed/RGB/DIB AVI (5 GB file!):
That looked good, so then compressed to an almost lossless mp4