Audio Compression and Normalization

Have you ever been listening to a sermon or audiobook in the car when the speaker started to talk really quietly, so you turned the volume up, only to be blown away by the next passage? “And Moses’ anger waxed hot, and he cast the tables out of his hands, and brake them beneath the mount.”

Audio segments that are way too loud or way too quiet can be really obnoxious. They’re distracting and confusing to the listener, and they always detract from the content.

Today we’re going to talk about two ways to address audio that is too loud or too quiet: Compression and Normalization.

Typically, when you’re working with audio, a very quiet input, like a speaker whispering, gives you a very quiet output, and a very loud input, like a speaker shouting, gives you a very loud output.

The distance between the loudest sound and the quietest sound is called the dynamic range.
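To put a number on that, here is a minimal Python sketch that measures dynamic range in decibels. The function names and the example amplitudes are assumptions made up for illustration; they don't come from any particular audio editor.

```python
import math

def to_db(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to decibels (dBFS)."""
    return 20.0 * math.log10(amplitude)

def dynamic_range_db(loudest, quietest):
    """Distance in dB between the loudest and quietest amplitudes."""
    return to_db(loudest) - to_db(quietest)

# Hypothetical levels: a shout at full scale and a whisper at 1% of full scale.
shout = 1.0
whisper = 0.01
print(round(dynamic_range_db(shout, whisper), 1), "dB of dynamic range")
```

A bigger number means a bigger jump between the quiet parts and the loud parts, which is exactly what has the listener reaching for the volume knob.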

Audio compression works by taking those very loud input parts, and making them not-quite-as-loud in the output, thus compressing the dynamic range of the audio.

If you looked at an audio waveform, you’d see that the output waveform has those highest parts compressed by bringing them lower.

By bringing those loud parts down, we can then make the difference between the loud parts and the quiet parts smaller, so the listener doesn’t have to adjust the volume when listening because they’re not being blown away.

If you’re using common audio editing software like Adobe Audition or Audacity, there are 5 main settings that you’ll see when compressing audio:

First is the threshold. The threshold is how loud the input must be before the compressor kicks in at all. If you put the threshold very low, at say -35 dB, you’ll be compressing nearly all the input to some degree. If you put it rather high, at say -6 dB, you’ll only be touching the parts of the audio that are quite loud to begin with. You’ll need to listen to your source material to find the best threshold, but for sermons something like -24 dB is a good place to start.

Next is the noise floor. The noise floor tells the compressor what to completely ignore. By setting a noise floor you prevent sounds below that level from being amplified in any way. You’ll often know that the noise floor is too low if you start hearing loud hissing during pauses.

Third, and most important, is the compression ratio. The compression ratio determines just how compressed the audio is. A ratio of 3:1 means that for every 3 dB the input rises above the threshold, the output will only go up 1 dB. A ratio of 5:1 means that the input will have to rise 5 dB above the threshold before the output goes up 1 dB. For spoken audio a ratio of 3:1 is usually pretty good, though you can adjust this if you think you’re getting better results at a different ratio.
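If it helps to see the threshold and ratio as arithmetic, here is a rough Python sketch of the level math. It is a simplified model, not what Audition or Audacity actually run internally, and the function name and defaults (the -24 dB and 3:1 starting points from above) are only illustrative.

```python
def compress_level_db(input_db, threshold_db=-24.0, ratio=3.0):
    """Return the output level in dB for a given input level in dB.

    Below the threshold the level passes through untouched. Above it,
    only 1/ratio of the overshoot makes it to the output.
    """
    if input_db <= threshold_db:
        return input_db
    overshoot_db = input_db - threshold_db
    return threshold_db + overshoot_db / ratio

print(compress_level_db(-30.0))  # -30.0: below the threshold, unchanged
print(compress_level_db(-12.0))  # -20.0: 12 dB over the threshold becomes 4 dB over
print(compress_level_db(-6.0))   # -18.0: 18 dB over the threshold becomes 6 dB over
```

Notice that the shout at -6 dB and the normal voice at -12 dB started 6 dB apart and end up only 2 dB apart. That is the dynamic range being squeezed.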

Attack time, our fourth parameter, is the amount of time a sound has to be loud before the compressor will kick in. If you set this very fast it will be very obvious to the listener that you’re compressing the audio (awkwardly so), but if you leave it too long then the listener can still get blasted with sudden shouts. Start with about 0.3s and adjust from there.

Decay time, our last parameter, is the amount of time the sound must be quiet again before the compressor kicks out. If this is too short you’ll start to hear staccato-like hissing as the background comes back. If you leave it too long you’ll miss words that are quiet immediately following loud ones. Start with about 0.8s and go from there.
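One common way to model attack and decay is to ease the compressor's gain reduction in and out instead of switching it instantly. The Python sketch below does that with simple exponential smoothing. The function name, the 100-values-per-second control rate, and the example numbers are all assumptions for illustration, and real editors may shape these curves differently.

```python
import math

def smooth_gain_reduction(targets_db, rate_hz=100, attack_s=0.3, decay_s=0.8):
    """Ease toward each requested gain reduction (in dB, 0 = no reduction).

    Rising reduction follows the attack time; falling reduction follows
    the decay time. After one attack time the compressor has applied
    roughly 63% of the reduction it was asked for.
    """
    attack_coeff = math.exp(-1.0 / (attack_s * rate_hz))
    decay_coeff = math.exp(-1.0 / (decay_s * rate_hz))
    current = 0.0
    smoothed = []
    for target in targets_db:
        coeff = attack_coeff if target > current else decay_coeff
        current = coeff * current + (1.0 - coeff) * target
        smoothed.append(current)
    return smoothed

# Hypothetical input: one second of shouting that calls for 12 dB of
# reduction, followed by one second of normal speech that calls for none.
requested = [12.0] * 100 + [0.0] * 100
applied = smooth_gain_reduction(requested)
print(round(applied[29], 2), "dB applied after 0.3 s of shouting")
print(round(applied[-1], 2), "dB still applied 1 s after the shout ends")
```

Shorter times make the curve snap faster, which is where the obvious "pumping" and staccato hiss come from. Longer times let the start of a shout through and keep quiet words turned down afterward.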

There are a few other things you might run into as you’re working with audio compression. One is often called something like “Make up gain after compressing” or “normalize after compressing.” Now, after compressing those loud parts, the highest peaks of the audio will be quieter than they used to be. What this setting does is bring up all of the audio so that it’s a bit louder overall. But it doesn’t change the compression, which has narrowed the range from the quietest sounds to the loudest sounds. To provide your listeners with a consistent experience from your audio, I highly recommend you enable this, or manually normalize your audio after compressing.

Another setting you might see is “compress based on peaks.” Now, I’m not going to get into the difference between peak and RMS values right now, but for most applications you’re not going to want to compress based on peaks.

Whether you choose to normalize to 0 dB, -1 dB or -3 dB, just be consistent with your choice from recording to recording.
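Peak normalization itself is simple math: find the loudest sample and scale everything so that sample lands on your target. Here is a minimal Python sketch, assuming samples are plain floats where 1.0 is full scale (0 dB). The function name and sample values are made up for illustration.

```python
def normalize_peak(samples, target_db=-1.0):
    """Scale samples so the loudest one sits at target_db (0 dB = full scale)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    target_amplitude = 10.0 ** (target_db / 20.0)
    gain = target_amplitude / peak
    return [s * gain for s in samples]

# Hypothetical compressed audio whose loudest peak is only at 25% of full scale.
quiet_recording = [0.05, -0.25, 0.1, -0.02]
louder = normalize_peak(quiet_recording, target_db=-3.0)
print(round(max(abs(s) for s in louder), 3), "is the new peak amplitude")
```

Because every sample gets the same gain, the gap between the loud and quiet parts stays exactly where the compressor left it. Only the overall level moves.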

One of the great things about Sermons.io is that we do all of this compression and normalization for you automatically. You just upload your recording and we take it from there.

If you have any other tips for compression or normalization please share them in the comments.

As always, please subscribe so you don’t miss any of our future tips, and if you have any feedback check out the links in the description below.

Oh, and if you haven’t tried Sermons.io yet, sign up for your free 40-day trial now!
