Tuesday, 9 May 2023

Real-time audio mixing using the Sound Blaster

A few weeks ago, I managed to program a real-time audio mixer that can play up to 4 voices at once through the Sound Blaster! This is also a rare occasion where a theory in my head actually works out. The steps to reach this goal are pretty hilarious, and may be useful to fellow masochists like me.

Immediately, we have a huge advantage in the form of the Sound Blaster's DMA. This allows audio to be sent to the Sound Blaster and played independently, leaving the computer to do its own thing. The only way I could think to approach this was to create a fixed-size buffer and tie the mixing routine to the vertical retrace, so that one buffer's worth of audio is mixed per retrace. There's probably a better way, but I'm not known for that, and it works well enough!
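
For anyone wondering what "tying it to the vertical retrace" might look like, here's a minimal sketch in C. It's not my actual code, just one common way to do it: poll bit 3 of the VGA Input Status Register 1 (port 0x3DA). The `port_read_fn` callback and the `fake_port` simulator are made up for this example so it compiles anywhere; on a real DOS compiler you'd pass in a wrapper around something like `inp()`.

```c
#include <stdint.h>

/* VGA Input Status Register 1; bit 3 is set during vertical retrace. */
#define VGA_STATUS_PORT  0x3DA
#define VGA_VRETRACE_BIT 0x08

/* Hypothetical port-read callback, so the sketch is not tied to one
   compiler's port I/O functions. */
typedef uint8_t (*port_read_fn)(uint16_t port);

/* Block until the start of the next vertical retrace. */
void wait_for_retrace(port_read_fn read_port)
{
    /* If we are already inside a retrace, wait for it to finish... */
    while (read_port(VGA_STATUS_PORT) & VGA_VRETRACE_BIT) { }
    /* ...then wait for the next one to begin. */
    while (!(read_port(VGA_STATUS_PORT) & VGA_VRETRACE_BIT)) { }
}

/* Simulated status port for trying the sketch away from real hardware:
   reports "in retrace" for 2 reads out of every 10. */
static unsigned fake_reads = 0;

uint8_t fake_port(uint16_t port)
{
    (void)port;
    return (fake_reads++ % 10) >= 8 ? VGA_VRETRACE_BIT : 0;
}
```

The main loop would then call `wait_for_retrace()` once per frame, mix one buffer, and hand it to the Sound Blaster.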

So, the first thing to do was measure how many samples fit between one retrace and the next. I did this by playing a click on every retrace, and recording the output from DOSBox-X to a wave file, which I could then analyze in Audacity. The period turned out to be 625 samples. Because it recorded at 44100Hz, and my target sample rate was 11025Hz, I divided the result by 4 (156.25, rounded down) to get the final buffer size of 156 samples (plus one to reduce clicking!).
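
The arithmetic above can be written as a tiny helper. This is only a sketch; the function name and parameters are invented for illustration, and the numbers are the ones measured above.

```c
/* Scale a retrace period measured at one sample rate down to the target
   sample rate (rounding down), then add a few padding samples. */
unsigned buffer_size(unsigned period_samples, unsigned measured_rate,
                     unsigned target_rate, unsigned padding)
{
    /* unsigned long is at least 32 bits, so 625 * 11025 fits easily. */
    unsigned long scaled =
        (unsigned long)period_samples * target_rate / measured_rate;
    return (unsigned)scaled + padding;
}
```

With the values from this post, `buffer_size(625, 44100, 11025, 1)` gives 157: the 156 samples per retrace, plus the one extra.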

The next step was to do the exciting stuff, actually mixing the sounds together, dumping the result into the buffer, and then sending it to the Sound Blaster. My plan was to have a bunch of states for each voice, which determined how it would be played. These included:

  • Is the sample playing?
  • Memory offset to the sample (where the sample pointer starts)
  • Current position in the sample (added to the memory offset, to get the current byte)
  • Length of the sample (so the sample can end)
  • Is the sample looping?
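
Written out in C, those states could look something like the struct below. Treat it as a sketch: the field names, the types and the `start_voice` helper are my own picks for this example, not a fixed layout.

```c
#include <stdint.h>
#include <stdbool.h>

#define NUM_VOICES 4

typedef struct {
    bool           playing;   /* is the sample playing?             */
    const uint8_t *data;      /* where the sample starts in memory  */
    uint32_t       position;  /* current offset into the sample     */
    uint32_t       length;    /* sample length in bytes             */
    bool           looping;   /* restart at 0 when the end is hit?  */
} Voice;

/* Hypothetical helper: point a voice at a sample and start it. */
void start_voice(Voice *v, const uint8_t *data, uint32_t length, bool looping)
{
    v->data     = data;
    v->length   = length;
    v->position = 0;
    v->looping  = looping;
    v->playing  = (length > 0);  /* an empty sample never plays */
}
```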

With these in mind, I got to programming! Using 4 voices has many advantages, mainly speed: it's a small number of voices, and because 4 is a power of two, the divide can be done with a bit shift, which is much quicker than division. The whole mixing routine is an outer loop that steps through every byte of the buffer. Inside that is another loop, which goes through each of the 4 voices.

First, we clear the current buffer byte, ready to be filled. Then we need to check if the current voice is even playing. If not, add a null byte (127) and check the next voice. Otherwise, get the sample byte, shift it to the right (by 2 in this case), and add it to the current buffer byte. Then we increase the sample position, and check if we've reached the end of the sample, using our length variable. If it's reached the end, check if it's looping. If it's not looping, stop the sample from playing; otherwise, reset the sample position back to 0. Once all voices have been processed, go to the next buffer byte (bufferbyte bufferbyte bufferbyte).
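
Here's a rough sketch of that loop in C, for anyone who prefers code to paragraphs. The names are made up for this example. One assumption to flag loudly: a silent voice adds the null byte shifted the same way as a real sample (127 >> 2), so that four voices can never add up to more than 255.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_VOICES  4
#define VOICE_SHIFT 2     /* shifting right by 2 divides by 4 voices */
#define SILENCE     127   /* the "null byte" */

typedef struct {
    bool           playing;
    const uint8_t *data;
    uint32_t       position;
    uint32_t       length;
    bool           looping;
} Voice;

void mix_buffer(Voice voices[NUM_VOICES], uint8_t *buffer, size_t buffer_len)
{
    for (size_t i = 0; i < buffer_len; i++) {
        unsigned sum = 0;                       /* clear the buffer byte */

        for (int v = 0; v < NUM_VOICES; v++) {
            Voice *voice = &voices[v];

            if (!voice->playing) {
                /* Assumption: silence is scaled like any other sample. */
                sum += SILENCE >> VOICE_SHIFT;
                continue;
            }

            sum += voice->data[voice->position] >> VOICE_SHIFT;
            voice->position++;

            if (voice->position >= voice->length) {
                if (voice->looping)
                    voice->position = 0;        /* wrap back to the start */
                else
                    voice->playing = false;     /* sample has finished */
            }
        }

        buffer[i] = (uint8_t)sum;               /* at most 4 * 63 = 252 */
    }
}
```

With this scaling, four silent voices produce 4 * 31 = 124 rather than exactly 127, which is one of the small costs of shifting each voice before adding.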

You're probably wondering why we need to do some of that stuff, so let me explain!

Firstly, summing audio is very easy, but it comes with a catch. To add 2 audio files together, you could just step through each byte and add them together, but it's not that simple. We're using 8-bit samples here, so a number can only range from 0 to 255. If you add 2 numbers together, the result could be over 255, which no longer fits in a byte, and you'd hear that as nasty distortion. That's why we need to divide (or bit shift) by the number of samples we're adding together. The volume will be lower, but all sounds will play at once. I used this principle to make a replayer for my music format (Unrefined Sequencing Format), which isn't real-time, but has way more features. But that's a topic for another day!

As for 127, that's the centre point when it comes to amplitude. Each byte is a different point in a waveform, with 0 at one extreme and 255 at the other, so 127 sits in the middle and represents silence. Using 127 prevents any clicking when a sample stops!
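
To see the overflow problem with actual numbers, here's a small C sketch (the function names are invented for this example). Note that in C, storing an oversized sum back into a byte wraps it around instead of capping it at 255.

```c
#include <stdint.h>

/* Naive mix: add two unsigned 8-bit samples and store the result in a
   byte. Anything above 255 wraps around. */
uint8_t mix_naive(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Scaled mix: shift each sample down first, so the sum of 2^shift
   samples always fits in a byte. */
uint8_t mix_scaled(uint8_t a, uint8_t b, unsigned shift)
{
    return (uint8_t)((a >> shift) + (b >> shift));
}
```

For example, `mix_naive(200, 100)` should be 300 but comes out as 44, while `mix_scaled(200, 100, 1)` gives 150: quieter, but intact.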

Once all that is done, we hand the whole buffer to the Sound Blaster via DMA, and start playback. At this point, the computer can do whatever it wants, without interfering with sample playback, because the Sound Blaster is handling it all!

While this method does work well, it has a few caveats. The most prominent issue is the fact it's tied to the vertical retrace. This means that if your code slows down and misses a retrace, the sound stutters, because the buffer runs out before the next one is ready. Secondly, the retrace period isn't an exact number of samples, so you'll get some slight clicking. I think that's impossible to avoid when using this method. I could try tying it to the system timer instead, but by default that only runs at 18.2Hz, and results in severe time travel (the system clock goes haywire) if you mess with it.

That's a lot of explaining, but hopefully that shows you the basics! I won't cover streaming here, because I already made a post about that. This method can also be applied to any kind of sound driver, since it's just adding bytes together!
