Introduction to Audio Compression

The purpose of audio compression

Before the advent of audio compression, high-quality digital audio data took a lot of hard disk space to store. Let us go through a short example:

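Let's say you want to sample a 1-minute song and store it on the hard disk. Because you want CD quality, you sample at 44.1 kHz, stereo, with 16 bits (2 bytes) per sample. That requires:

44100 samples/s × 2 channels × 2 bytes/sample × 60 s = 10584000 bytes ≈ 10 MB

of storage space.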

If you wanted to download that over the internet with an average 56 kbit/s modem, which typically connects at 44 kbit/s, it would take you:

(10584000 bytes × 8 bits/byte) ÷ 44000 bits/s ≈ 1924 seconds ≈ 32 minutes

just to download one minute of music! Digital audio compression is the art of minimising storage space (or channel bandwidth) requirements for audio data. Modern perceptual audio coding techniques (like MPEG Layer III) exploit the properties of the human ear (the perception of sound) to achieve a size reduction by a factor of 11 with little or no perceptible loss of quality.

Therefore, such schemes are the key technology for high quality, low bitrate applications, like soundtracks for CD-ROM games, solid-state sound memories, Internet audio, digital audio broadcasting systems, etc.
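The arithmetic in the example above can be checked with a short script. This is only a sketch: the function and constant names are made up for illustration, and the compressed figure simply applies the factor of 11 quoted above to the uncompressed size.

```python
# Figures from the example: one minute of CD-quality audio over a 44 kbit/s link.
SAMPLE_RATE_HZ = 44_100      # samples per second, per channel
CHANNELS = 2                 # stereo
BYTES_PER_SAMPLE = 2         # 16 bits
DURATION_S = 60              # one minute
MODEM_BITS_PER_S = 44_000    # effective modem speed


def pcm_size_bytes(sample_rate_hz, channels, bytes_per_sample, duration_s):
    """Size of uncompressed PCM audio in bytes."""
    return sample_rate_hz * channels * bytes_per_sample * duration_s


def download_time_s(size_bytes, link_bits_per_s):
    """Seconds needed to transfer size_bytes over a link of the given speed."""
    return size_bytes * 8 / link_bits_per_s


size = pcm_size_bytes(SAMPLE_RATE_HZ, CHANNELS, BYTES_PER_SAMPLE, DURATION_S)
seconds = download_time_s(size, MODEM_BITS_PER_S)

print(size)                  # 10584000
print(round(seconds))        # 1924
print(round(seconds / 60))   # 32

# Assumption: the "factor of 11" reduction applied directly to the size.
compressed_seconds = download_time_s(size / 11, MODEM_BITS_PER_S)
print(round(compressed_seconds))  # 175
```

With the factor-of-11 reduction, the same minute of music would download in roughly three minutes instead of 32.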

The two parts of audio compression

Audio compression really consists of two parts. The first part, called encoding, transforms the digital audio data that resides, say, in a WAV file, into a highly compressed form called a bitstream. To play the bitstream on your sound card, you need the second part, called decoding. Decoding takes the bitstream and re-expands it to a WAV file.

The program that performs the first part is called an audio encoder. LAME is such an encoder. The program that performs the second part is called an audio decoder.
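As a rough illustration of the two steps, the sketch below builds the command lines for encoding and decoding with the lame command-line program. The file names and helper functions are hypothetical. The sketch assumes a lame binary on the PATH and uses its -b (bitrate in kbps) and --decode options. Nothing is executed unless run_if_available is called explicitly.

```python
import shutil
import subprocess


def encode_command(wav_path, mp3_path, bitrate_kbps=128):
    """Command line that encodes a WAV file into an MP3 bitstream."""
    return ["lame", "-b", str(bitrate_kbps), wav_path, mp3_path]


def decode_command(mp3_path, wav_path):
    """Command line that re-expands an MP3 bitstream into a WAV file."""
    return ["lame", "--decode", mp3_path, wav_path]


def run_if_available(command):
    """Run the command only if the program is installed; return True if it ran."""
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)
    return True


# Hypothetical file names, shown only to print the resulting command lines.
print(" ".join(encode_command("song.wav", "song.mp3")))
print(" ".join(decode_command("song.mp3", "song_decoded.wav")))
```

Keeping command construction separate from execution makes it possible to inspect the commands on a machine where lame is not installed.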

Compression ratios, bitrate and quality

What you end up with after encoding and decoding is not the same sound file anymore. All superfluous information has been squeezed out, so to speak. It is not the same file, but it will sound the same, more or less, depending on how much compression has been performed on it.

Generally speaking, the lower the compression ratio achieved, the better the sound quality will be in the end, and vice versa. Table 1.0 gives you an overview of the quality achievable.

Because compression ratio is a somewhat unwieldy measure, experts use the term bitrate when speaking of the strength of compression. Bitrate denotes the average number of bits that one second of audio data will take up in your compressed bitstream. The usual unit is kbps (kbits/s), where 1 kbps is 1000 bits/s. To calculate the number of bytes per second of audio data, simply divide the number of bits per second by eight.
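The relation between bitrate, bytes per second and compression ratio can be sketched as follows. The function names are illustrative, and the compression ratio is computed against uncompressed CD audio (44.1 kHz, stereo, 16 bits), which is an assumption about the source material.

```python
def bytes_per_second(bitrate_kbps):
    """Bytes of bitstream per second of audio: bits per second divided by eight."""
    return bitrate_kbps * 1000 / 8


def compressed_size_bytes(bitrate_kbps, duration_s):
    """Approximate size of the compressed bitstream for a given duration."""
    return bytes_per_second(bitrate_kbps) * duration_s


def compression_ratio(bitrate_kbps, sample_rate_hz=44_100, channels=2,
                      bits_per_sample=16):
    """Ratio of the uncompressed PCM bitrate to the compressed bitrate."""
    pcm_bits_per_s = sample_rate_hz * channels * bits_per_sample  # 1411200 for CD audio
    return pcm_bits_per_s / (bitrate_kbps * 1000)


print(bytes_per_second(128))             # 16000.0
print(compressed_size_bytes(128, 60))    # 960000.0
print(round(compression_ratio(128), 3))  # 11.025
```

At 128 kbps, one minute of music takes a little under 1 MB, which corresponds to a compression ratio of about 11 relative to CD audio.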

Bitrate              Bandwidth   Quality comparable to
16 kbps              4.5 kHz     Shortwave radio
32 kbps              7.5 kHz     AM radio
96 kbps              11 kHz      FM radio
128 kbps             16 kHz      Near CD
160-180 kbps (VBR)   20 kHz      Perceptual transparency
256 kbps             22 kHz      Studio

Table 1.0 Bitrate versus sound quality