Waveforms and Psychoacoustics
Everything is vibration. The universe is made of waves, and waves come in many
different lengths (a wavelength is defined as the distance between the peak of
one wave and the peak of the next). Waves vibrating at different frequencies
manifest
themselves differently, all the way from the astronomically slow pulsations of
the universe itself to the inconceivably fast vibration of matter (and
beyond). Somewhere in between these extremes are wavelengths that are
perceptible to human beings as light and sound. Just beyond the realms of
light and sound are infra- and ultrasonic vibration, the infrared and
ultraviolet light spectra, and zillions of other frequencies imperceptible to
humans (such as radio and microwave). Our sense organs are tuned only to very
narrow bandwidths of vibration in the overall picture. In
fact, even our own musical instruments create many vibrational frequencies
that are imperceptible to our ears.
Frequencies are typically described in units called Hertz (Hz), which
translates simply as "cycles per second." In general, humans cannot hear
frequencies below 20Hz (20 cycles per second), nor above 20kHz (20,000 cycles
per second), as shown in Figure 2-1.[1] While
hearing capacities vary from one individual to the next, it's generally true
that humans perceive midrange frequencies more strongly than high and low
frequencies,[2] and that sensitivity to
higher frequencies diminishes with age and prolonged exposure to loud volumes.
In fact, by the time we're adults, most of us can't hear much of anything
above 16kHz (although women tend to preserve the ability to hear higher
frequencies later into life than do men). The most sensitive range of hearing
for most people lies between 2kHz and 4kHz, a range probably evolutionarily
related to the normal range of the human voice, which runs roughly from 500Hz
to 2kHz.
Figure 2-1: While
vibratory frequencies extend both above and below, human hearing is pretty
much limited to the range between 20Hz and 20kHz
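To make "cycles per second" concrete, here is a minimal Python sketch that converts a frequency into its period and its wavelength, and checks it against the 20Hz to 20kHz limits given above. The function names are hypothetical, and the speed of sound (343 meters per second, roughly its value in room-temperature air) is an assumption that does not come from the text.

```python
# Assumption (not from the text): sound travels at roughly 343 m/s
# in dry air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_PER_S = 343.0

# Nominal limits of human hearing, as given in the text.
AUDIBLE_LOW_HZ = 20.0
AUDIBLE_HIGH_HZ = 20_000.0


def period_seconds(frequency_hz):
    """Duration of one full cycle: the reciprocal of cycles per second."""
    return 1.0 / frequency_hz


def wavelength_meters(frequency_hz, speed=SPEED_OF_SOUND_M_PER_S):
    """Peak-to-peak distance of a wave travelling at the given speed."""
    return speed / frequency_hz


def is_audible(frequency_hz):
    """True if the frequency falls inside the nominal 20Hz-20kHz range."""
    return AUDIBLE_LOW_HZ <= frequency_hz <= AUDIBLE_HIGH_HZ


if __name__ == "__main__":
    for f in (10.0, 20.0, 440.0, 3_000.0, 20_000.0, 40_000.0):
        print(
            f"{f:>8.0f} Hz  period={period_seconds(f):.6f} s  "
            f"wavelength={wavelength_meters(f):.4f} m  "
            f"audible={is_audible(f)}"
        )
```

Under that assumed speed of sound, the lowest audible tone (20Hz) is a wave about 17 meters long, while the highest (20kHz) is under 2 centimeters.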
These are simple and well-established empirical
observations on the human hearing mechanism. However, there's a second piece
to this puzzle, which involves the mind itself. Some have postulated[3]
that the sane mind functions as a sort of "reducing valve,"
systematically bringing important information to the fore and suppressing or
ignoring superfluous or irrelevant data.[4]
In fact, it's been estimated that we really only process a billionth of
the data available to our five senses at any given time. Clearly, one of the
most important jobs of the mind is to act as a sieve, sifting the most
important information out of the incoming signal and leaving the conscious
self to focus on the stuff that matters.
The basic principle of any perceptual codec is that there's little point in
storing information that can't be perceived by humans anyway. As obvious as
this may sound, you may be surprised to learn
that a good recording stores a tremendous amount of audio data that you never
even hear, because recording equipment (microphones, guitar pickups, and so
on) is sensitive to a broader range of sounds and audio resolutions than is
the human ear. After getting an overview of how perceptual codecs work in
general, we'll take a closer look at exactly how the MP3 codec does its thing.
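As a toy illustration of the "don't store what can't be heard" principle, the sketch below builds a signal containing one audible tone (1kHz) and one inaudible tone (30kHz), transforms it into frequency components, and throws away everything outside the 20Hz to 20kHz range. This is only a sketch under stated assumptions: the sample rate, tone frequencies, and function names are all hypothetical, and real perceptual codecs are far more sophisticated than a simple band limit.

```python
import cmath
import math

# Assumed values for illustration only (none come from the text).
SAMPLE_RATE_HZ = 96_000   # high enough to capture a 30kHz tone
NUM_SAMPLES = 480         # 5 ms of audio; frequency bins are 200Hz apart
AUDIBLE_RANGE_HZ = (20.0, 20_000.0)


def make_test_signal():
    """A 1kHz tone (audible) plus a quieter 30kHz tone (inaudible)."""
    return [
        math.sin(2 * math.pi * 1_000 * t / SAMPLE_RATE_HZ)
        + 0.5 * math.sin(2 * math.pi * 30_000 * t / SAMPLE_RATE_HZ)
        for t in range(NUM_SAMPLES)
    ]


def dft(samples):
    """Naive discrete Fourier transform: slow, but dependency-free."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples))
        for k in range(n)
    ]


def bin_frequency_hz(k, n, sample_rate):
    """Frequency represented by bin k (upper bins mirror the lower ones)."""
    return min(k, n - k) * sample_rate / n


def drop_inaudible(spectrum, sample_rate):
    """Zero every component whose frequency lies outside the audible range."""
    n = len(spectrum)
    low, high = AUDIBLE_RANGE_HZ
    return [
        c if low <= bin_frequency_hz(k, n, sample_rate) <= high else 0j
        for k, c in enumerate(spectrum)
    ]


def amplitude_at(spectrum, frequency_hz, sample_rate):
    """Amplitude of the sine component at the given frequency."""
    n = len(spectrum)
    k = round(frequency_hz * n / sample_rate)
    return abs(spectrum[k]) / (n / 2)


if __name__ == "__main__":
    spectrum = dft(make_test_signal())
    filtered = drop_inaudible(spectrum, SAMPLE_RATE_HZ)
    for f in (1_000.0, 30_000.0):
        print(f"{f:>7.0f} Hz  before={amplitude_at(spectrum, f, SAMPLE_RATE_HZ):.3f}"
              f"  after={amplitude_at(filtered, f, SAMPLE_RATE_HZ):.3f}")
    kept = sum(1 for c in filtered if c != 0)
    print(f"non-zero components kept: {kept} of {len(filtered)}")
```

The 1kHz tone survives untouched while the 30kHz tone disappears, so there is less to store and a listener should notice no difference.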
NOTE
The word "codec" is a contraction of the words "compressor" and
"decompressor" (it is also commonly read as "coder" and "decoder"), and
refers to any of a class of processes that allow
for the systematic compression and decompression of data. While various
codecs are fundamental to many file formats and transmission methods (for
instance image and video compression formats have their own codecs, some of
which are perceptual as well), it's the MP3 codec that concerns us here.