Saints Row 2: Extracting and Replacing audio

I believe I've figured out what's up with the Saints Row 2 PC audio. Since it's mastered in 'CD quality', I was playing around and found a spec called ISO 908 de-emphasis. I used SoX to apply it to the cutscenes, and the output was *much* closer to the expected console audio. Unfortunately, the base files I used were from the Linux version of the PC port, so the quality is still a bit subpar overall even though it's way better, and many files are corrupted, but it's a starting point. I'm going to keep working on this over the weekend and see what I can accomplish. I'll post a video later if anyone's interested.

EDIT: Some more explanation. The reason I brought up CD mastering is that some CDs were mastered with the treble boosted (pre-emphasis) to try to compensate for the limits of 44.1kHz CD audio: boost the highs on the disc, cut them again on playback, and the high-frequency noise gets cut along with them. A lot of audio players are designed to recognize this and EQ the treble back down to the expected level (de-emphasis), but it appears the audio player used by Saints Row 2 does not. It would probably be better for the overall sound quality to implement the spec in the player than to convert all of the audio, but since I don't believe I can do the former, I'll focus on the latter.
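
To give a rough idea of how much treble the de-emphasis step removes, here is a minimal sketch of the curve. It assumes the standard CD emphasis time constants of 50 µs and 15 µs and models the ideal analog first-order shelf, not SoX's actual digital filter, so treat the numbers as ballpark only. The function name is made up for illustration.

```python
import math

# Assumed CD emphasis time constants (seconds).
T1 = 50e-6  # lower corner of the shelf, about 3.18 kHz
T2 = 15e-6  # upper corner of the shelf, about 10.6 kHz


def deemphasis_gain_db(freq_hz):
    """Gain in dB of the ideal analog de-emphasis shelf at freq_hz.

    Magnitude of H(s) = (1 + s*T2) / (1 + s*T1) evaluated at s = j*2*pi*f.
    """
    w = 2 * math.pi * freq_hz
    magnitude = math.sqrt(1 + (w * T2) ** 2) / math.sqrt(1 + (w * T1) ** 2)
    return 20 * math.log10(magnitude)


def print_curve():
    # Print the attenuation at a few frequencies across the audible range.
    for f in (100, 1000, 5000, 10000, 20000):
        print(f"{f:>6} Hz: {deemphasis_gain_db(f):6.2f} dB")
```

Calling `print_curve()` shows the shelf leaving the bass and mids nearly untouched and pulling the top end down by several dB. Audio that was pre-emphasized and never de-emphasized is too bright by the same amount, which would fit the harsh sound of the PC port.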

This information may also be useful to the port team, assuming they haven't yet figured out what's wrong with the audio and assuming that's even still happening.
 
Here's a video example featuring 3 cutscenes: the first cutscene, the courtroom cutscene, and the cutscene where we meet Maero. Still not perfect, but I think it's better. You can watch the video before checking the description if you want to decide for yourself. Worth noting that the difference is a little more obvious ingame, since the video compression reduced the audio quality somewhat.


I'm aiming to do an experimental release of music4.vpp_pc in Mods In Progress over the weekend, now that I've worked out the audio corruption issue. Turns out I'm just an idiot who screwed up when stitching the .xwb together with a hex editor. (All versions of the XACT build tool that make files compatible with SR2 have a memory leak, which I worked around by building two partial wavebanks and manually stitching them together.)

music4 contains the main cutscenes, gameplay player voice clips, and the ambient music loops (like the ones that play in stores and clubs). I want to get any potential kinks in the process worked out before I tackle the other audio packfiles, especially the main audio.vpp_pc, as that contains over 200 wavebanks and doing even one wavebank is still fairly time-consuming. I'm considering whether the process can be automated, but there's still one manual step in my way -- updating the XACT project file the build tool reads. It's XML, but I'm not in the mood to write a parser script to update it via CLI, so for now I'm still using the GUI tool provided in the OP.

UPDATED: The biggest stumbling block I'm encountering is that somewhere in the process of extracting and converting the Xbox audio, there's a rounding error in computing the sample rate. Files range from 44098Hz to 44101Hz. The problem is that SoX's ISO 908 de-emphasis option will reject anything that isn't 16-bit 44100Hz. I can re-encode the files to get around that, but that adds another layer of loss to an already hopelessly lossy situation. I've written a script for my hex editor that replaces bytes 24-27 (the uint32 in the header that defines the sample rate) with 44100Hz - everything else be damned - and it seems to work well without introducing noticeable loss. That said, the script is, frankly, slow as fucking balls. It takes about 5 seconds per file, and I'm not looking forward to using it on audiobanks like voc_sp that have over 8000 wavs in them. There has to be a faster way to bulk change the same couple of bytes to the same value across multiple files. If anyone has a suggestion it'd be very welcome, preferably CLI tools.
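
One possible way to do the bulk byte patch without a hex editor is a small Python script. This is only a sketch, and it assumes every file has the plain RIFF/WAVE layout where the `fmt ` chunk comes first, so the sample rate really does sit at bytes 24-27; anything that doesn't match is skipped rather than touched. It also rewrites the byte-rate field at bytes 28-31 to stay consistent, which goes beyond what the hex editor script described above does. The function names and the 5Hz tolerance are made up for illustration.

```python
import struct
from pathlib import Path

TARGET_RATE = 44100


def force_sample_rate(path, rate=TARGET_RATE, tolerance=5):
    """Patch the header sample rate in place. Returns True if the file changed."""
    with open(path, "r+b") as f:
        header = f.read(36)
        # Only handle the simple layout: RIFF....WAVEfmt <16-byte PCM format>.
        if (len(header) < 36 or header[0:4] != b"RIFF"
                or header[8:12] != b"WAVE" or header[12:16] != b"fmt "):
            return False
        (old_rate,) = struct.unpack_from("<I", header, 24)     # bytes 24-27
        (block_align,) = struct.unpack_from("<H", header, 32)  # bytes 32-33
        # Skip files that are already correct or nowhere near the target
        # (for example the 24000Hz files).
        if old_rate == rate or abs(old_rate - rate) > tolerance:
            return False
        f.seek(24)
        f.write(struct.pack("<I", rate))                # sample rate
        f.write(struct.pack("<I", rate * block_align))  # byte rate, bytes 28-31
        return True


def patch_tree(root, rate=TARGET_RATE):
    """Patch every .wav under root (recursively). Returns how many were changed."""
    return sum(force_sample_rate(p, rate) for p in Path(root).rglob("*.wav"))
```

Something like `patch_tree("extracted/voc_sp")` (a hypothetical folder name) would walk the whole bank in one go. Each file only gets a 36-byte read and an 8-byte write, so it should be far quicker than a few seconds per file, but test it on copies first.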

UPDATE 2: As N69 found, most of the audio is sampled at 24000Hz. I'm upsampling it to 48000Hz to apply the de-emphasis shelving filter, then resampling it back down. We'll see how that works. On the plus side it means I don't have to use my awful hex editor script as much.
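
The upsample, de-emphasize, downsample round trip can be done in a single SoX call by chaining effects. Here is a minimal sketch that builds that command from Python so it can be looped over a whole folder. It assumes `sox` is on the PATH and that its `deemph` effect accepts 48000Hz as well as 44100Hz; the function names are made up for illustration.

```python
import subprocess


def deemph_command(src, dst, original_rate=24000, work_rate=48000):
    """Build the sox argv: resample up, apply de-emphasis, resample back down."""
    if work_rate not in (44100, 48000):
        # Assumption: these are the only rates the deemph effect will accept.
        raise ValueError("work_rate must be 44100 or 48000")
    return [
        "sox", str(src), str(dst),
        "rate", str(work_rate),      # upsample to a rate deemph can work at
        "deemph",                    # CD de-emphasis shelving filter
        "rate", str(original_rate),  # back down to the original rate
    ]


def run_deemph(src, dst, **kwargs):
    """Run the command; raises CalledProcessError if sox exits with an error."""
    subprocess.run(deemph_command(src, dst, **kwargs), check=True)
```

With the defaults this is equivalent to typing `sox in.wav out.wav rate 48000 deemph rate 24000`. Doing both resamples in one effects chain avoids writing an intermediate file between the steps.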
 