What is MP3?

MP3 stands for MPEG-1 Audio Layer III, which is part of the MPEG-1 video and audio coding standard, developed by MPEG (Moving Picture Experts Group) in the early 1990s. The file-shrinking technology itself was developed by the Fraunhofer Institute in Germany. MP3 stores good-quality audio in small files by using psychoacoustics to discard the parts of the audio that most people can't hear.

Before there were MP3s, digital audio files took hours to download. On a 56K modem, most MP3s can be downloaded in just a few minutes. MP3s are widely recognised as the most popular format for storing and listening to music on the World Wide Web.

MP3 bitrates vary from 8 kbps (that is 8 kilobits per second, not kilobytes) to 320 kbps. When the MP3 phenomenon began in 1996, most audio files were encoded at 128 kbps, which is still the most popular bitrate, although many people agree that slightly higher bitrates, like 192 kbps or 256 kbps, give audio quality comparable to CD quality.
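To make the kilobit arithmetic concrete, here is a minimal Python sketch that estimates file size and ideal download time at a constant bitrate. The four-minute track length, the function names and the assumption that a 56K modem sustains its full 56 kbps are illustrative and not taken from the text; real modem throughput was usually lower.

```python
def size_bytes(bitrate_bps: int, seconds: int) -> int:
    """Bytes needed for `seconds` of audio at a constant bitrate (8 bits per byte)."""
    return bitrate_bps * seconds // 8


def download_seconds(n_bytes: int, link_bps: int) -> float:
    """Ideal transfer time over a link, ignoring protocol overhead."""
    return n_bytes * 8 / link_bps


TRACK_SECONDS = 4 * 60        # hypothetical four-minute song
MP3_BPS = 128_000             # 128 kbps = 128,000 bits per second
CD_BPS = 44_100 * 16 * 2      # uncompressed CD audio: 44.1 kHz, 16 bit, 2 channels

mp3_bytes = size_bytes(MP3_BPS, TRACK_SECONDS)   # 3,840,000 bytes
cd_bytes = size_bytes(CD_BPS, TRACK_SECONDS)     # 42,336,000 bytes

for name, n in (("MP3", mp3_bytes), ("CD", cd_bytes)):
    minutes = download_seconds(n, 56_000) / 60
    print(f"{name}: {n / 1e6:.1f} MB, about {minutes:.0f} min at 56 kbps")
```

Under these assumptions the 128 kbps file is roughly one eleventh the size of the uncompressed CD audio.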

How did these little files cause such a huge stir?

No doubt about it, MP3s are one of the most exciting developments in the history of recorded music. Here's why:

They're online. MP3s are ideal for Internet distribution because they sound great, they download quickly and there are plenty available online.

You can make your own MP3s. With some simple software, a PC and a few clicks of a mouse, anyone can create MP3 files from their own CD collection.

Those portable MP3 players are tiny. Most portable MP3 players are about one-third the size of a Sony Discman and hardly ever skip. MP3 players can be small enough to be integrated into other electronic devices like personal digital assistants and even wristwatches!

Get the music you want, when you want it. For many music fans, MP3 has become a symbol of total freedom, opening new distribution channels, unveiling a whole other world of music delivery beyond the heavily marketed, repetitive programming of commercial radio and MTV.

MP3s are the format of choice. Unlike proprietary formats such as Liquid Audio and Windows Media Audio, MP3 is an open standard, meaning no one corporation controls it. For this reason, there are more MP3 listeners, software programs and hardware devices than for any other CD-quality audio format in the world.

How is MP3 built?

Most people with a little knowledge of MP3 files know that the sound is divided into smaller parts and compressed with a psychoacoustic model. These smaller pieces of the audio are then put into something called 'frames', which are small blocks of data with a header.

The header is 4 bytes (32 bits) long and begins with something called sync. This sync is, at least according to the MPEG standard, 12 set bits in a row. Some add-on standards made later use 11 set bits and one cleared bit. The sync is directly followed by an ID bit, indicating whether the file is an MPEG-1 or MPEG-2 file: 0 = MPEG-2 and 1 = MPEG-1.

The layer is defined by the two layer bits. They are oddly defined as:

0 0	Not defined
0 1	Layer III
1 0	Layer II
1 1	Layer I
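The sync, ID and layer checks described above can be sketched in Python as follows. This is a minimal sketch, not a full parser: it assumes the 12-bit sync form and assumes the three fields occupy the top 15 bits of the header in the order the text describes them. The function name and the sample header bytes FF FB 90 00 are illustrative.

```python
# Layer bits as listed in the table above; None marks the undefined value.
LAYERS = {0b00: None, 0b01: "Layer III", 0b10: "Layer II", 0b11: "Layer I"}


def parse_sync_id_layer(header: bytes):
    """Return (version, layer) from a 4-byte frame header, or raise ValueError."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")   # treat the header as one 32-bit number
    if word >> 20 != 0xFFF:                # top 12 bits must all be set
        raise ValueError("sync not found")
    id_bit = (word >> 19) & 1              # 0 = MPEG-2, 1 = MPEG-1
    layer_bits = (word >> 17) & 0b11       # the two layer bits
    return ("MPEG-1" if id_bit else "MPEG-2", LAYERS[layer_bits])


print(parse_sync_id_layer(bytes.fromhex("fffb9000")))
```

For the sample header this reports an MPEG-1, Layer III frame.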

With this information and the information in the bitrate field we can determine the bitrate of the audio (in kbps) according to this table:

Bitrate value	MPEG-1, Layer I	MPEG-1, Layer II	MPEG-1, Layer III	MPEG-2, Layer I	MPEG-2, Layer II	MPEG-2, Layer III
0 0 0 0
0 0 0 1	32	32	32	32	8	8
0 0 1 0	64	48	40	48	16	16
0 0 1 1	96	56	48	56	24	24
0 1 0 0	128	64	56	64	32	32
0 1 0 1	160	80	64	80	40	40
0 1 1 0	192	96	80	96	48	48
0 1 1 1	224	112	96	112	56	56
1 0 0 0	256	128	112	128	64	64
1 0 0 1	288	160	128	144	80	80
1 0 1 0	320	192	160	160	96	96
1 0 1 1	352	224	192	176	112	112
1 1 0 0	384	256	224	192	128	128
1 1 0 1	416	320	256	224	144	144
1 1 1 0	448	384	320	256	160	160
1 1 1 1
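As a rough illustration of the table lookup, the Python sketch below covers only the three MPEG-1 columns, to keep it short. The position of the bitrate field (the high four bits of the third header byte) is an assumption not stated in the text, and the names are hypothetical.

```python
# MPEG-1 columns of the table above, indexed by the 4-bit bitrate value.
# None marks the two rows the table leaves blank.
MPEG1_KBPS = {
    "Layer I":   [None, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448, None],
    "Layer II":  [None, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384, None],
    "Layer III": [None, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, None],
}


def mpeg1_bitrate_kbps(header: bytes, layer: str):
    """Look up the bitrate in kbps for an MPEG-1 frame header of the given layer."""
    bitrate_bits = header[2] >> 4          # assumed position: high nibble of byte 3
    return MPEG1_KBPS[layer][bitrate_bits]


print(mpeg1_bitrate_kbps(bytes.fromhex("fffb9000"), "Layer III"))
```

The sample header carries the bitrate value 1 0 0 1, which the Layer III column maps to 128 kbps.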

The sample rate is described in the frequency field. These values are dependent on which MPEG standard is used according to the following table:

Frequency value	MPEG-1	MPEG-2
0 0	44100 Hz	22050 Hz
0 1	48000 Hz	24000 Hz
1 0	32000 Hz	16000 Hz
1 1
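The frequency table can be expressed as a small lookup in the same way. In this Python sketch the field position (bits 3 and 2 of the third header byte) is an assumption not given in the text, and the function and variable names are illustrative.

```python
# Sample rates from the table above, indexed by the 2-bit frequency value.
# None marks the value the table leaves blank.
SAMPLE_RATES_HZ = {
    "MPEG-1": [44100, 48000, 32000, None],
    "MPEG-2": [22050, 24000, 16000, None],
}


def sample_rate_hz(header: bytes, version: str):
    """Return the sample rate in Hz for a header of the given MPEG version."""
    frequency_bits = (header[2] >> 2) & 0b11   # assumed position of the field
    return SAMPLE_RATES_HZ[version][frequency_bits]


print(sample_rate_hz(bytes.fromhex("fffb9000"), "MPEG-1"))
```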

Three bits are not needed in the decoding process at all: the copyright bit, the original home bit and the private bit. The copyright bit has the same meaning as the copyright bit on CDs and DAT tapes, i.e. it indicates that it is illegal to copy the contents if the bit is set. The original home bit indicates, if set, that the frame is located on its original media. No one seems to know what the private bit is good for!
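For completeness, reading these three informational bits might look like the Python sketch below. The bit positions used here (the lowest bit of the third byte for private, and bits 3 and 2 of the fourth byte for copyright and original) are assumptions that the text does not spell out, and the names are hypothetical.

```python
from typing import NamedTuple


class InfoBits(NamedTuple):
    private: bool
    copyright: bool
    original: bool


def info_bits(header: bytes) -> InfoBits:
    """Extract the three bits that a decoder can safely ignore."""
    return InfoBits(
        private=bool(header[2] & 0x01),     # assumed: lowest bit of byte 3
        copyright=bool(header[3] & 0x08),   # assumed: bit 3 of byte 4
        original=bool(header[3] & 0x04),    # assumed: bit 2 of byte 4
    )


print(info_bits(bytes.fromhex("fffb9004")))
```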

If the protection bit is not set, the frame header is followed by a 16-bit checksum, inserted before the audio data. If the padding bit is set, the frame is padded with an extra byte. Knowing this, the size of the complete frame in bytes can be calculated with the following formulae, where Bitrate is in bits per second and SampleRate is in Hz:

FrameSize = 144 × Bitrate / SampleRate

when the padding bit is cleared and

FrameSize = (144 × Bitrate / SampleRate) + 1

when the padding bit is set.

The FrameSize is of course an integer, so any fractional part is discarded. If, for example, Bitrate = 128000, SampleRate = 44100 and the padding bit is cleared, then FrameSize = 144 × 128000 / 44100 = 417.96, which is truncated to 417.
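The two formulae collapse into one small function. The Python sketch below uses integer division to discard the fractional part. The helpers that read the protection and padding bits assume positions not given in the text (the lowest bit of the second byte and bit 1 of the third byte), and all names are illustrative.

```python
def frame_size(bitrate_bps: int, sample_rate_hz: int, padding: bool) -> int:
    """Frame size in bytes: 144 * Bitrate / SampleRate, plus 1 if padded."""
    return 144 * bitrate_bps // sample_rate_hz + (1 if padding else 0)


def has_checksum(header: bytes) -> bool:
    """A 16-bit checksum follows the header when the protection bit is NOT set."""
    return (header[1] & 0x01) == 0         # assumed position of the protection bit


def is_padded(header: bytes) -> bool:
    """True when the padding bit is set."""
    return bool(header[2] & 0x02)          # assumed position of the padding bit


header = bytes.fromhex("fffb9000")         # illustrative header, padding bit cleared
print(frame_size(128_000, 44_100, is_padded(header)))   # 417, as in the example above
print(frame_size(128_000, 44_100, True))                # 418 with the padding byte
```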

The purpose of the mode extension field is different for different layers. The mode field indicates which sort of stereo/mono encoding has been used:

Mode value	Mode
0 0	Stereo
0 1	Joint-stereo
1 0	Dual channels
1 1	Mono
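Decoding the mode field is a two-bit lookup. This Python sketch assumes the mode field sits in the top two bits of the fourth header byte with the mode extension directly after it. Those positions and the function names are assumptions, and the mode extension is returned as a raw number because its meaning depends on the layer.

```python
# Mode names from the table above, indexed by the 2-bit mode value.
MODES = ["Stereo", "Joint-stereo", "Dual channels", "Mono"]


def channel_mode(header: bytes) -> str:
    """Return the channel mode named by the mode field."""
    return MODES[header[3] >> 6]           # assumed: top two bits of byte 4


def mode_extension_bits(header: bytes) -> int:
    """Return the raw 2-bit mode extension value (layer-dependent meaning)."""
    return (header[3] >> 4) & 0b11         # assumed: the next two bits


print(channel_mode(bytes.fromhex("fffb9040")))
```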

The last field is the emphasis field. It tells the decoder to 're-equalise' the sound after a Dolby-like noise suppression. It is rarely used and probably never will be. The following noise suppression methods are defined:

Emphasis value	Emphasis method
0 0	None
0 1	50/15 microseconds
1 0
1 1	CCITT J.17

Frame header

Please go to www.id3.org for more information on MP3s.