Sample Chapter 2:
How MP3 Works: Inside the Codec
So what's the trick? How does the MP3 format accomplish its radical feats of
compression and decompression, while still managing to maintain an acceptable
level of fidelity to the original source material? The process may seem like
magic, but it isn't. The entire MP3 phenomenon is made possible by the
confluence of several distinct but interrelated elements: a few simple
insights into the nature of human psychoacoustics, a whole lot of number
crunching, and conformance to a tightly specified format for encoding and
decoding audio into compact bitstreams. In this chapter, we'll take a look at
these elements in detail in order to understand exactly what's going on behind
the scenes of MP3 encoding and decoding software, as well as some of the
chicanery that takes place between your ears.
Note that this chapter goes fairly deep behind the scenes of MP3 and is
somewhat technical in nature. If you're not interested in learning how MP3
works and just want to get started creating and playing MP3 audio, you can
skip ahead to Chapters 3, 4, and 5.
A "Perceptual" Codec
Well-encoded MP3 files can sound pretty darn good, considering how small they
are. As mentioned in Chapter 1, The Nuts and Bolts of MP3, your typical
MP3 file is around one-tenth the size of the corresponding uncompressed audio
source. How is this accomplished? That's a somewhat complex topic, so we've
devoted this entire chapter to explaining the process.
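To put the one-tenth figure in perspective, here is a rough back-of-the-envelope calculation. The CD audio parameters (44,100 samples per second, 16 bits per sample, two channels) are standard; the 128 kbps MP3 bitrate and the four-minute song length are assumptions chosen purely for illustration.

```python
# Rough size comparison between uncompressed CD audio and an MP3 encoding.
# The MP3 bitrate below is an assumed, commonly used setting, not a value
# taken from this chapter.

CD_SAMPLE_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2
ASSUMED_MP3_BITRATE_BPS = 128_000


def cd_bitrate_bps() -> int:
    """Bits per second consumed by uncompressed CD audio."""
    return CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * CD_CHANNELS


def size_bytes(bitrate_bps: int, seconds: float) -> float:
    """Storage needed for `seconds` of audio at a constant bitrate."""
    return bitrate_bps * seconds / 8


def compression_ratio(mp3_bitrate_bps: int = ASSUMED_MP3_BITRATE_BPS) -> float:
    """How many times smaller the MP3 stream is than the CD stream."""
    return cd_bitrate_bps() / mp3_bitrate_bps


song_seconds = 4 * 60  # hypothetical four-minute track
print(f"CD bitrate:  {cd_bitrate_bps()} bits/s")
print(f"CD size:     {size_bytes(cd_bitrate_bps(), song_seconds) / 1e6:.1f} MB")
print(f"MP3 size:    {size_bytes(ASSUMED_MP3_BITRATE_BPS, song_seconds) / 1e6:.1f} MB")
print(f"Ratio:       {compression_ratio():.1f} to 1")
```

Under these assumptions the ratio comes out at roughly 11 to 1, which is in line with the "around one-tenth" figure quoted above.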
MPEG Audio Compression in a Nutshell
Uncompressed audio, such as that found on CDs, stores more data than your
brain can actually process. For example, if two notes are very similar and
very close together, your brain may perceive only one of them. If two sounds
are very different but one is much louder than the other, your brain may never
perceive the quieter signal. And of course your ears are more sensitive to
some frequencies than others. The study of these auditory phenomena is called
psychoacoustics, and quite a lot is known about the process; so much so
that it can be quite accurately described in tables and charts, and in
mathematical models representing human hearing patterns.
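As one small example of such a mathematical model, the sketch below evaluates a widely quoted approximation of the threshold of hearing in quiet (usually attributed to Terhardt). The formula is not taken from this chapter, and real encoders rely on considerably more elaborate models; it is shown only to illustrate how "your ears are more sensitive to some frequencies than others" can be written down as a function.

```python
import math


def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate sound pressure level (dB SPL) below which a pure tone
    at `freq_hz` is inaudible in a silent room.

    This is a commonly cited curve fit, used here as an illustrative
    assumption rather than as the model any particular encoder ships with.
    """
    khz = freq_hz / 1000.0
    return (
        3.64 * khz ** -0.8
        - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
        + 1e-3 * khz ** 4
    )


for freq in (100, 1_000, 3_300, 15_000):
    print(f"{freq:>6} Hz: {threshold_in_quiet_db(freq):6.1f} dB SPL")
```

The curve dips to its lowest point in the region of a few kilohertz, where hearing is most sensitive, and rises steeply toward very low and very high frequencies. A quiet sound that falls below the curve at its frequency is a candidate for being discarded.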
MP3 encoding tools (see Chapter 5, Ripping and Encoding: Creating MP3 Files,
for examples and usage details) analyze the incoming source signal, break it
down into mathematical patterns, and compare these patterns to psychoacoustic
models stored in the encoder itself. The encoder can then discard most of the
data that doesn't match the stored models, keeping that which does. The person
doing the encoding can specify how many bits should be allotted to storing
each second of music, which in effect sets a "tolerance" level: the lower the
data storage allotment, the more data will be discarded, and the worse the
resulting music will sound. The process is actually quite a bit more complex
than that, and we'll go into more detail later on. This kind of compression is
called lossy, because data is lost in the process. However, a second
compression run is also made, which shrinks the remaining data even more via
more traditional means (similar to the familiar "zip" compression process).
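The two-stage idea (a lossy step that throws information away, followed by a conventional lossless step) can be imitated with a deliberately crude toy. This is not the MP3 algorithm: the "lossy" step here simply rounds samples to a coarser grid instead of consulting a psychoacoustic model, and Python's `zlib` module stands in for the entropy coding a real encoder performs. The tone frequency, amplitude, and step size are arbitrary assumptions.

```python
import math
import struct
import zlib

SAMPLE_RATE_HZ = 44_100


def make_tone(freq_hz: float, seconds: float) -> list:
    """Generate a sine tone as a list of 16-bit integer samples."""
    count = int(SAMPLE_RATE_HZ * seconds)
    return [
        round(20_000 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ))
        for i in range(count)
    ]


def quantize(samples: list, step: int) -> list:
    """Toy lossy step: round every sample to the nearest multiple of `step`.
    The fine detail removed here cannot be recovered afterwards."""
    return [step * round(s / step) for s in samples]


def pack(samples: list) -> bytes:
    """Serialize samples as little-endian signed 16-bit values."""
    return struct.pack(f"<{len(samples)}h", *samples)


def lossless_size(samples: list) -> int:
    """Toy lossless step: size in bytes after zlib compression."""
    return len(zlib.compress(pack(samples), 9))


fine = make_tone(440.0, 0.5)
coarse = quantize(fine, 256)  # smaller bit budget means a coarser grid

print("raw bytes:            ", len(pack(fine)))
print("lossless only:        ", lossless_size(fine))
print("lossy, then lossless: ", lossless_size(coarse))
```

The second stage is fully reversible (decompressing gives back exactly the bytes that went in), so all of the quality loss comes from the first stage. Coarser quantization leaves the lossless stage with more regular data, which is why it shrinks further.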
MP3 files are composed of a series of very short frames, one after another,
much like a filmstrip. Each frame of data is preceded by a header that
contains extra information about the data to come. In some encodings, these
frames may interact with one another. For example, if one frame has leftover
storage space and the next frame doesn't have enough, they may team up for
optimal results.
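To make the idea of a frame header more concrete, here is a minimal sketch that decodes a few fields from the four-byte header of an MPEG-1 Layer III frame and derives the frame's size and duration. The function and dictionary key names are hypothetical, the sketch rejects anything other than MPEG-1 Layer III, and it does not model the sharing of storage space between frames described above.

```python
# Lookup tables for MPEG-1 Layer III only. Bitrate index 0 means "free
# format" and index 15 is invalid; both are represented as None here.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44_100, 48_000, 32_000, None]
SAMPLES_PER_FRAME = 1152


def parse_frame_header(header: bytes) -> dict:
    """Decode bitrate, sample rate, and padding from a 4-byte frame header."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")

    if word >> 21 != 0x7FF:  # the top 11 bits must all be set (frame sync)
        raise ValueError("frame sync not found")
    if (word >> 19) & 0b11 != 0b11 or (word >> 17) & 0b11 != 0b01:
        raise ValueError("this sketch only handles MPEG-1 Layer III")

    bitrate_kbps = BITRATES_KBPS[(word >> 12) & 0b1111]
    sample_rate_hz = SAMPLE_RATES_HZ[(word >> 10) & 0b11]
    padding = (word >> 9) & 1
    if bitrate_kbps is None or sample_rate_hz is None:
        raise ValueError("unsupported bitrate or sample rate index")

    return {
        "bitrate_kbps": bitrate_kbps,
        "sample_rate_hz": sample_rate_hz,
        "padding": padding,
        # Frame length in bytes, header included.
        "frame_bytes": 144 * bitrate_kbps * 1000 // sample_rate_hz + padding,
        "frame_seconds": SAMPLES_PER_FRAME / sample_rate_hz,
    }


info = parse_frame_header(bytes([0xFF, 0xFB, 0x90, 0x00]))
print(info)
```

For the example header, the frame works out to 417 bytes covering about 26 milliseconds of audio (128 kbps at 44.1 kHz), which gives a sense of just how short "very short" is.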
At the beginning or end of an MP3 file, extra information about the file
itself, such as the name of the artist, the track title, the name of the album
from which the track came, the recording year, genre, and personal comments
may be stored. This is called "ID3" data, and will become increasingly useful
as your collection grows. We'll look at the structure of MP3 files and their
ID3 tags in this chapter, and the process of creating and using ID3 tags in
Chapter 4, Playlists, Tags, and Skins: MP3 Options. Let's zoom in for a closer
look at the entire process.
NOTE
Always remember to set your encoder to store ID3 data during the encode
process, if possible; doing so will save you a lot of work down the road.
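As a rough illustration of what ID3 data looks like on disk, the sketch below reads the simplest variant, ID3v1, which occupies the final 128 bytes of a file and begins with the marker `TAG`. Tags stored at the beginning of a file use a different, more flexible layout that is not handled here. The helper names and the sample values are hypothetical, and the Latin-1 text encoding is an assumption.

```python
from typing import Optional


def parse_id3v1(data: bytes) -> Optional[dict]:
    """Return the ID3v1 fields found at the end of `data`, or None if the
    data carries no ID3v1 tag."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(start: int, length: int) -> str:
        # Fields are fixed width and padded with NUL bytes or spaces.
        raw = tag[start:start + length].split(b"\x00", 1)[0]
        return raw.decode("latin-1").strip()

    return {
        "title": text(3, 30),
        "artist": text(33, 30),
        "album": text(63, 30),
        "year": text(93, 4),
        "comment": text(97, 30),
        "genre": tag[127],  # numeric genre code
    }


def make_id3v1(title: str, artist: str, album: str, year: str,
               comment: str, genre: int) -> bytes:
    """Build a 128-byte ID3v1 tag (used here only to create test data)."""
    def field(value: str, length: int) -> bytes:
        return value.encode("latin-1")[:length].ljust(length, b"\x00")

    return (b"TAG" + field(title, 30) + field(artist, 30) + field(album, 30)
            + field(year, 4) + field(comment, 30) + bytes([genre]))


# Stand-in for a real file: some placeholder audio bytes followed by a tag.
fake_file = b"\x00" * 1000 + make_id3v1(
    "Example Title", "Example Artist", "Example Album", "1999", "ripped", 17)
print(parse_id3v1(fake_file))
```

To try this on a real file, read its contents in binary mode (for example with `open(path, "rb").read()`) and pass the bytes to `parse_id3v1`. A return value of `None` simply means the file has no ID3v1 tag at the end.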
If you would like to learn more about MP3, consider purchasing MP3: The
Definitive Guide by Scot Hacker.