
Inside the MP3 Codec

Featured with permission from MP3: The Definitive Guide by Scot Hacker

Ever wanted to know exactly how MP3 works? We've provided a special sample chapter from MP3: The Definitive Guide that explains the nuts and bolts of this now-famous compression technique.

Sample Chapter 2:
How MP3 Works: Inside the Codec


So what's the trick? How does the MP3 format accomplish its radical feats of compression and decompression, while still managing to maintain an acceptable level of fidelity to the original source material? The process may seem like magic, but it isn't. The entire MP3 phenomenon is made possible by the confluence of several distinct but interrelated elements: a few simple insights into the nature of human psychoacoustics, a whole lot of number crunching, and conformance to a tightly specified format for encoding and decoding audio into compact bitstreams. In this chapter, we'll take a look at these elements in detail to understand exactly what's going on behind the scenes of MP3 encoding and decoding software, as well as some of the chicanery that takes place between your ears.

Note that this chapter goes fairly deep behind the scenes of MP3 and is somewhat technical in nature. If you're not interested in learning how MP3 works and just want to get started creating and playing MP3 audio, you can skip ahead to Chapters 3, 4, and 5.

A "Perceptual" Codec

Well-encoded MP3 files can sound pretty darn good, considering how small they are. As mentioned in Chapter 1, The Nuts and Bolts of MP3, your typical MP3 file is around one-tenth the size of the corresponding uncompressed audio source. How is this accomplished? That's a somewhat complex topic, so we've devoted this entire chapter to explaining the process.
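To put a number on that one-tenth figure, here is a rough back-of-the-envelope calculation. CD audio is 44,100 samples per second, 16 bits per sample, in two channels. The 128 kbps MP3 bitrate and the four-minute track length are assumptions picked for illustration, not values taken from this chapter.

```python
# Back-of-the-envelope size comparison. The 128 kbps MP3 bitrate and the
# four-minute track length are assumptions chosen for illustration.

def pcm_bits_per_second(sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Data rate of uncompressed PCM audio (the defaults describe CD audio)."""
    return sample_rate_hz * bits_per_sample * channels


def size_in_bytes(bits_per_second, seconds):
    """Bytes needed to store `seconds` of audio at a constant bitrate."""
    return bits_per_second * seconds // 8


track_seconds = 4 * 60            # assumed track length
cd_rate = pcm_bits_per_second()   # uncompressed CD audio
mp3_rate = 128_000                # assumed MP3 bitrate in bits per second

cd_size = size_in_bytes(cd_rate, track_seconds)
mp3_size = size_in_bytes(mp3_rate, track_seconds)
ratio = cd_size / mp3_size

print(f"CD audio: {cd_rate} bits/s, {cd_size / 1e6:.1f} MB")
print(f"MP3:      {mp3_rate} bits/s, {mp3_size / 1e6:.1f} MB")
print(f"The MP3 is about 1/{ratio:.0f} the size")
```

Under these assumptions the ratio works out to roughly 11 to 1, which lines up with the "around one-tenth" rule of thumb.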

MPEG Audio Compression in a Nutshell

Uncompressed audio, such as that found on CDs, stores more data than your brain can actually process. For example, if two notes are very similar and very close together, your brain may perceive only one of them. If two sounds are very different but one is much louder than the other, your brain may never perceive the quieter signal. And of course your ears are more sensitive to some frequencies than others. The study of these auditory phenomena is called psychoacoustics, and quite a lot is known about the process; so much so that it can be quite accurately described in tables and charts, and in mathematical models representing human hearing patterns.
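The idea that a loud sound can hide a quieter one can be sketched in a few lines of code. This is a deliberately crude toy, not the psychoacoustic model any real encoder uses: the function name `audible_components`, the 100 Hz neighborhood, and the 20 dB margin are all made up for illustration, and real masking depends on frequency, level, and timing in ways this ignores.

```python
# Toy illustration of masking. The thresholds below are hypothetical and
# are NOT taken from any real psychoacoustic model.

def audible_components(components, max_distance_hz=100.0, margin_db=20.0):
    """Return the (frequency_hz, level_db) pairs that are not masked.

    A component counts as masked when some other component lies within
    `max_distance_hz` of it and is at least `margin_db` louder.
    """
    kept = []
    for i, (freq, level) in enumerate(components):
        masked = any(
            j != i
            and abs(other_freq - freq) <= max_distance_hz
            and other_level - level >= margin_db
            for j, (other_freq, other_level) in enumerate(components)
        )
        if not masked:
            kept.append((freq, level))
    return kept


tones = [
    (1000.0, 70.0),  # loud tone
    (1050.0, 40.0),  # quiet tone right next to the loud one
    (5000.0, 40.0),  # equally quiet tone, but far away in frequency
]
result = audible_components(tones)
print(result)
```

Here the quiet 1050 Hz tone is dropped because it sits next to a much louder neighbor, while the equally quiet 5000 Hz tone survives because nothing nearby drowns it out.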

MP3 encoding tools (see Chapter 5, Ripping and Encoding: Creating MP3 Files, for examples and usage details) analyze the incoming source signal, break it down into mathematical patterns, and compare these patterns to psychoacoustic models stored in the encoder itself. The encoder can then discard most of the data that doesn't match the stored models, keeping that which does. The person doing the encoding can specify how many bits should be allotted to storing each second of music, which in effect sets a "tolerance" level: the lower the data storage allotment, the more data will be discarded, and the worse the resulting music will sound. The process is actually quite a bit more complex than that, and we'll go into more detail later on. This kind of compression is called lossy, because data is lost in the process. However, a second compression run is also made, which shrinks the remaining data even more via more traditional means (similar to the familiar "zip" compression process).
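The two-stage idea (a lossy pass that throws data away, followed by a lossless pass that packs what remains) can be illustrated with a small sketch. Both stages here are stand-ins: coarse rounding of raw samples takes the place of the encoder's perceptual step, and Python's `zlib` takes the place of the second compression run. The test tone, the step size of 512, and the helper names are assumptions for illustration only.

```python
import math
import struct
import zlib

# Stand-in demo of "lossy first, lossless second". This is NOT how an MP3
# encoder works internally; the rounding step and zlib are substitutes.

def quantize(samples, step):
    """Lossy stage: round each sample to the nearest multiple of `step`."""
    return [round(s / step) * step for s in samples]


def pack16(samples):
    """Serialize samples as little-endian signed 16-bit integers."""
    return struct.pack(f"<{len(samples)}h", *samples)


# One second of a 440 Hz test tone sampled at 8 kHz (assumed test signal).
rate = 8000
samples = [int(10000 * math.sin(2 * math.pi * 440 * n / rate)) for n in range(rate)]

step = 512
lossy = quantize(samples, step)          # information is discarded here
packed_lossy = pack16(lossy)
compressed = zlib.compress(packed_lossy) # nothing further is lost here
restored = zlib.decompress(compressed)

max_error = max(abs(a - b) for a, b in zip(samples, lossy))
print(f"raw bytes: {len(packed_lossy)}, after zlib: {len(compressed)}")
print(f"largest sample error from the lossy stage: {max_error}")
print(f"lossless stage round-trips exactly: {restored == packed_lossy}")
```

The point to notice is where the damage happens: the rounding step changes the samples for good, while the `zlib` stage can always be undone byte for byte.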

MP3 files are composed of a series of very short frames, one after another, much like a filmstrip. Each frame of data is preceded by a header that contains extra information about the data to come. In some encodings, these frames may interact with one another. For example, if one frame has leftover storage space and the next frame doesn't have enough, they may team up for optimal results.
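To make the frame-plus-header idea concrete, here is a minimal sketch that decodes a four-byte frame header. It assumes the standard MPEG audio header bit layout and, to stay short, handles only MPEG-1 Layer III; the function name `parse_frame_header` and the sample header bytes are illustrative, not taken from this chapter.

```python
# Minimal sketch: decode a 4-byte MPEG-1 Layer III frame header.
# Other MPEG versions and layers use different tables and are rejected.

BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def parse_frame_header(header: bytes) -> dict:
    """Return bitrate, sample rate, padding, and frame length in bytes."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")

    if word >> 21 != 0x7FF:                  # 11 sync bits, all set
        raise ValueError("no frame sync found")
    version = (word >> 19) & 0b11            # 0b11 means MPEG-1
    layer = (word >> 17) & 0b11              # 0b01 means Layer III
    if version != 0b11 or layer != 0b01:
        raise ValueError("this sketch only handles MPEG-1 Layer III")

    bitrate_kbps = BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate_hz = SAMPLE_RATES_HZ[(word >> 10) & 0b11]
    if bitrate_kbps is None or sample_rate_hz is None:
        raise ValueError("unsupported bitrate or sample rate index")
    padding = (word >> 9) & 1

    # Frame length in bytes for MPEG-1 Layer III, header included.
    frame_bytes = 144 * bitrate_kbps * 1000 // sample_rate_hz + padding
    return {
        "bitrate_kbps": bitrate_kbps,
        "sample_rate_hz": sample_rate_hz,
        "padding": padding,
        "frame_bytes": frame_bytes,
    }


info = parse_frame_header(bytes.fromhex("FFFB9064"))
print(info)
```

A player can use the computed frame length to hop from one header to the next, which is what makes the "filmstrip" structure easy to step through.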

At the beginning or end of an MP3 file, extra information about the file itself, such as the name of the artist, the track title, the name of the album from which the track came, the recording year, genre, and personal comments may be stored. This is called "ID3" data, and will become increasingly useful as your collection grows. We'll look at the structure of MP3 files and their ID3 tags in this chapter, and the process of creating and using ID3 tags in Chapter 4, Playlists, Tags, and Skins: MP3 Options. Let's zoom in for a closer look at the entire process.



Always remember to set your encoder to store ID3 data during the encode process, if possible. Doing so will save you a lot of work down the road.
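For a feel of what ID3 data looks like on disk, here is a minimal sketch that reads the simplest variant, an ID3v1 tag: a fixed 128-byte block at the very end of the file that starts with the characters "TAG". The function name `parse_id3v1` and the sample field values are made up for illustration, and tags stored at the beginning of a file (ID3v2) use a different layout that this sketch does not cover.

```python
# Minimal sketch: read an ID3v1 tag from the last 128 bytes of a file.
# ID3v2 tags (stored at the start of a file) are not handled here.

def parse_id3v1(data: bytes):
    """Return the tag fields as a dict, or None if no ID3v1 tag is present."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],            # numeric genre index
    }


# Build a fake file: some stand-in audio bytes followed by a hand-made tag.
sample_tag = (
    b"TAG"
    + b"Example Song".ljust(30, b"\x00")
    + b"Example Artist".ljust(30, b"\x00")
    + b"Example Album".ljust(30, b"\x00")
    + b"1999"
    + b"made-up comment".ljust(30, b"\x00")
    + bytes([17])
)
fake_file = b"\x00" * 1000 + sample_tag
fields = parse_id3v1(fake_file)
print(fields)
```

To try it on a real file, read the file's bytes with `open(path, "rb").read()` and pass them to `parse_id3v1`; a result of `None` just means the file has no ID3v1 tag at the end.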




If you would like to learn more about MP3, consider purchasing MP3: The Definitive Guide by Scot Hacker.