Hear What You’re Not Hearing When Music Is Compressed
The convenience of the MP3 format has made many things easier -- sharing songs, taking the music with you without lugging around boxes of CDs or tapes. But in that drive for mobility and simplicity, it's not just the luggage that's being cut out. Ryan Maguire, a Ph.D. student in Composition and Computer Technologies at the University of Virginia Center for Computer Music, is on a mission of sorts.
That mission is to expose and explain just what is being taken from the listening experience when audio is compressed in the MP3 format. In a video posted on his website, The Ghost in the MP3, Maguire has taken the Suzanne Vega song "Tom's Diner" and created an audio file of all the ambient sounds that were chopped off during its digital compression.
He calls the track "moDernisT" and as he explains, it was "created by salvaging the sounds and images lost to compression via the MP3 and MP4 codecs. The audio is comprised of lost MP3 compression material from the song 'Tom's Diner' famously used as one of the main controls in the listening tests to develop the MP3 encoding algorithm. Here we find the form of the song intact, but the details are just remnants of the original. The video is the MP4 ghost of a corresponding video created in collaboration with Takahiro Suzuki. Thus, both audio and video are the 'ghosts' of their respective compression codecs."
The MP3 standard was designed in the early 1990s by the Moving Picture Experts Group. It shrinks sound files by discarding audio data judged to be inaudible, which reduces their fidelity, and although bandwidth and storage capacity have grown enormously since then, the MP3 file remains the standard.
"MP3 has become a nearly ubiquitous digital audio file format," Maguire goes on to say. "First published in 1993, this codec implements a lossy compression algorithm based on a perceptual model of human hearing. Listening tests, primarily designed by and for western-european men, and using the music they liked, were used to refine the encoder. These tests determined which sounds were perceptually important and which could be erased or altered, ostensibly without being noticed."
So, in other words, the design of the format inherently decides what should be washed out of the mix. A program, therefore, decides what sounds it deems disposable. Kind of creepy when you think of it that way, right?
"What are these lost sounds? Are they sounds which human ears can not hear in their original context due to universal perceptual limitations or are they simply encoding detritus," he asks. "It is commonly accepted that MP3s create audible artifacts such as pre-echo, but what does the music which this codec deletes sound like? In the work presented here, techniques are considered and developed to recover these lost sounds, the ghosts in the MP3, and reformulate these sounds as art."
One listen to "moDernisT" and it's clear just what he's talking about. No, there are no huge chunks missing, no crucial parts or instrumentation -- what is missing is, perhaps, hard to define. It sounds like a wash of ambient effects, static and almost otherworldly voices. On its own, it could be heard and thought of as an avant-garde piece.
Another way of interpreting it is that it is the undefinable, almost ghostly, soul of the song -- wiped out and left for dead.
So how did Maguire find the missing parts? "Using the Bregman, pyo, and pydub libraries, along with the LAME MP3 encoder, I begin with an uncompressed WAV file and save it as an MP3 file, 128kbps in this example, which does quite well. I chose 128kbps for these examples because that was the 'high-quality' bit rate used in the original MP3 development listening tests. In the music I've made (moDernisT, etc.) using this process, I've used 320kbps MP3s."
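The quote stops at the encoding step, so how the lost material is then isolated is not spelled out here. One plausible reading, offered only as a rough sketch and not as Maguire's actual code, is to decode the compressed file and subtract it from the original, leaving whatever the round trip removed or altered. The example below illustrates just that subtraction. To stay self-contained, it replaces the real LAME round trip with a deliberately crude stand-in (coarse quantization), and the names `make_signal`, `toy_lossy_roundtrip`, and `residual` are hypothetical.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an arbitrary value that keeps the demo small


def make_signal(n_samples):
    """Synthetic 'recording': a loud 440 Hz tone plus a much quieter 3 kHz tone."""
    return [
        0.8 * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
        + 0.01 * math.sin(2 * math.pi * 3000 * t / SAMPLE_RATE)
        for t in range(n_samples)
    ]


def toy_lossy_roundtrip(samples, levels=32):
    """Stand-in for 'encode to MP3, then decode again'.

    This is plain coarse quantization, NOT the MP3 algorithm. It is an
    assumption made so the example runs without an encoder installed;
    all it guarantees is that some detail is thrown away.
    """
    return [round(s * levels) / levels for s in samples]


def residual(original, decoded):
    """Sample-by-sample difference: what the round trip removed or altered."""
    return [o - d for o, d in zip(original, decoded)]


def rms(samples):
    """Root-mean-square level, a rough measure of loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


original = make_signal(SAMPLE_RATE)        # one second of audio
decoded = toy_lossy_roundtrip(original)    # what survives the lossy step
ghost = residual(original, decoded)        # what did not survive

print(f"original RMS: {rms(original):.4f}")
print(f"ghost RMS:    {rms(ghost):.4f}")
```

With a real encoder in place of the stand-in, the decoded audio would probably need to be aligned in time with the original before subtracting, since MP3 encoders typically add a short delay and padding. Without that step the difference would contain far more than the lost material.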
While it's probably safe to say that the MP3 of "Tom's Diner" sounds fine in the settings where it is normally used, it is still fooling the listener and not presenting the complete aural picture. One listen to the missing piece makes you wonder just what else is being cut from any given file before it reaches our ears. There is a certain, as he calls it, "ghost"-like quality to the missing part that, in eerie tradition, would haunt the tune if given the chance. But the encoder decided it wasn't needed, so it's gone.
"The MP3 is not always the most appropriate format for a given task and a critical evaluation of the technology and its limitations is warranted," he adds. "Many listeners today listen exclusively to MP3 files, even in settings where the gains from a higher fidelity format would be clearly perceptible. This lossy compression codec has thus come to dominate unanticipated listening spaces. Despite its highly touted performance in listening tests, the MP3 compression codec does generate audible artifacts and remove perceptible sonic information, especially when implemented at low bit rates. For example, white, pink, and brown noise, when compressed to the lowest possible MP3 bit rate, sounds very different from the original random signal."
The bottom line is this: The listener is being cheated out of hearing the entire recording via MP3. While this may not be news to those who have been keeping tabs on such things, hearing the fragments that have been removed helps illustrate just what is actually happening. Of course, records have the occasional pop and skip, tapes get mangled now and again and a smudged CD will send your player into hyperdrive, but in all those formats, nothing was taken out of the equation. Obviously, a higher-grade cartridge or top-of-the-line speakers make things sound better than your average Close 'N Play, but that is up to the discretion of the listener.
With the MP3, that choice is gone.