After MP3: The Past, Present, and Future of Audio Codecs
With the MP3 gone, what audio codecs are poised to become the new standard? Let’s talk about compression, modern audio codecs, and the future of audio.
In 2017 the patents for MP3, held by The Fraunhofer Institute, expired. As 2018 unfolds, more and more manufacturers are pulling support for the codec. The MP3 format is one of the most widely known audio codecs, and it was largely responsible for the explosion of the digital audio world.
With the MP3 gone, what codecs are in the best position to become the new industry standard?
In this article, we will examine the basics of codecs, take a look at the big-name audio formats around today, and finally take a speculative look into the coming decades of audio codecs.
What Are Audio Codecs?
Codec is a portmanteau of coder and decoder. A codec can be hardware or software. Either way, the encode function takes the analog signal input and converts it to a digital format. The decode function is the same process in reverse, allowing the digital data stream to be converted back into analog sound waves for output.
There are three categories of codec: uncompressed, lossy, and lossless.
- Uncompressed: Uncompressed audio files encode the full audio input signal into a digital format capable of storing the full load of the incoming data. They offer the highest quality and archival capability, but at the cost of large file sizes that are slow to transfer and stream, prohibiting their widespread use in many cases.
- Lossy: Lossy files are encoded differently than uncompressed ones, although the essential function of analog-to-digital conversion remains the same. In that conversion, the amplitude of the incoming sound wave is sampled and rounded to the nearest available digital value, a process known as “quantizing.” The number of available values is set by the bit depth, most commonly 16-bit or 24-bit, which determines how accurately the sound is quantized. Lossy codecs then go a step further and throw away a considerable amount of the information contained in the original sound waves, typically the parts a listener is least likely to notice. Because of this, lossy audio files are vastly smaller than uncompressed ones and far quicker to transfer, which suits streaming and other bandwidth-limited scenarios.
- Lossless: Lossless encoding stands as the middle ground between uncompressed and lossy. It delivers the same audio quality as uncompressed at significantly reduced sizes. Lossless codecs achieve this by compressing the incoming audio in a non-destructive way on encode, then restoring the uncompressed information exactly on decode.
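The difference between the lossless and lossy categories can be shown with a minimal Python sketch. It is an illustration under stated assumptions, not how any real audio codec works: `zlib` stands in for a lossless codec, discarding the low 8 bits of each 16-bit sample stands in for a lossy one, and the helper names (`make_tone`, `lossless_round_trip`, `lossy_round_trip`) are hypothetical.

```python
import math
import struct
import zlib


def make_tone(period=100, cycles=441, amplitude=0.5):
    """Build a 16-bit test tone that repeats exactly every `period` samples."""
    one_cycle = [
        round(amplitude * 32767 * math.sin(2 * math.pi * i / period))
        for i in range(period)
    ]
    return one_cycle * cycles


def to_bytes(samples):
    """Pack signed 16-bit samples as little-endian raw PCM bytes."""
    return struct.pack(f"<{len(samples)}h", *samples)


def from_bytes(data):
    """Unpack little-endian raw PCM bytes back into signed 16-bit samples."""
    return list(struct.unpack(f"<{len(data) // 2}h", data))


def lossless_round_trip(samples):
    """Assumption: zlib stands in for a lossless audio codec."""
    compressed = zlib.compress(to_bytes(samples), 9)
    restored = from_bytes(zlib.decompress(compressed))
    return len(compressed), restored


def lossy_round_trip(samples):
    """Assumption: dropping the low 8 bits stands in for a lossy codec.

    Real lossy codecs decide what to discard with perceptual models;
    this crude version only shows that the discarded data cannot come back.
    """
    reduced = [s >> 8 for s in samples]    # keep the top 8 bits only
    restored = [r << 8 for r in reduced]   # low bits are gone for good
    return len(reduced), restored          # one byte per sample


samples = make_tone()
raw_size = len(to_bytes(samples))
lossless_size, lossless_out = lossless_round_trip(samples)
lossy_size, lossy_out = lossy_round_trip(samples)
worst_error = max(abs(a - b) for a, b in zip(samples, lossy_out))

print(f"raw PCM:  {raw_size} bytes")
print(f"lossless: {lossless_size} bytes, identical = {lossless_out == samples}")
print(f"lossy:    {lossy_size} bytes, identical = {lossy_out == samples}, "
      f"worst error = {worst_error}")
```

A pure repeating tone compresses far better than real music would, so the sizes printed here say nothing about real-world ratios. The point is only that the lossless path returns every sample unchanged while the lossy path does not.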
History of Audio Codecs
Since the first known audio recording in 1860 on a Phonautograph, audio recording and playback technology has been in a constant state of flux. The 20th century introduced the era of the professional sound recordists and engineers, the age of transmission of audio across radio waves, massive advancements in audio quality and technology, and continued growth in the audio industry — and trade in general.
In 1982, the audio world took its first steps toward the new millennium with the first mainstream digital audio format: the compact disc. Built on Pulse Code Modulation (PCM) technology, the CD was able to store analog sound waves as digital values by sampling them at regular intervals and “quantizing” each sample to its nearest supported digital value.
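The quantizing step can be sketched in a few lines of Python. This is a minimal illustration, assuming a signal normalized to the range -1.0 to 1.0. The function names (`quantize`, `pcm_encode`, `pcm_decode`) are hypothetical and are not taken from any CD specification.

```python
import math


def quantize(value, bit_depth):
    """Round a value in [-1.0, 1.0] to the nearest signed integer level."""
    max_level = 2 ** (bit_depth - 1) - 1
    clipped = max(-1.0, min(1.0, value))
    return round(clipped * max_level)


def pcm_encode(signal, sample_rate, num_samples, bit_depth):
    """Sample `signal(t)` at regular intervals and quantize each sample."""
    return [quantize(signal(i / sample_rate), bit_depth)
            for i in range(num_samples)]


def pcm_decode(codes, bit_depth):
    """Map integer codes back to values in [-1.0, 1.0]."""
    max_level = 2 ** (bit_depth - 1) - 1
    return [code / max_level for code in codes]


def max_error(signal, sample_rate, num_samples, bit_depth):
    """Largest gap between the original samples and the decoded ones."""
    codes = pcm_encode(signal, sample_rate, num_samples, bit_depth)
    decoded = pcm_decode(codes, bit_depth)
    return max(abs(signal(i / sample_rate) - d)
               for i, d in enumerate(decoded))


def tone(t):
    """A 1 kHz sine wave standing in for the analog input."""
    return math.sin(2 * math.pi * 1000 * t)


err_8 = max_error(tone, 44_100, 441, 8)
err_16 = max_error(tone, 44_100, 441, 16)
print(f"worst rounding error at 8-bit:  {err_8:.6f}")
print(f"worst rounding error at 16-bit: {err_16:.6f}")
```

The rounding error never exceeds half of one quantization step, so each extra bit of depth roughly halves the worst-case error.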
Pulse Code Modulation sparked a new era of innovation for digital audio formats. Within a decade, recognizable modern codecs like MP3 and WAV were gaining traction. The early 2000s saw the first wave of lossless audio codecs that brought digital quality even higher without the trade-off of large file sizes.
But these superior formats were unprepared for the MP3 player craze of the early ’00s. Apple’s iPod brought the masses to the world of digital audio, and the MP3 became the standard for audio playback worldwide.
With the death of the MP3, what codec is best poised to take its place? What possibilities does modern technology bring to the potential of future audio codecs?
The Codecs of Today
There are many audio codecs in heavy use across a myriad of industries today. Many of the entries on this list were introduced decades ago, but there are a few new codecs shining light on the potential that the future of audio codecs holds.
- AMR — Adaptive Multi-Rate: The AMR Codec family is one of the most commonly used audio formats in the world. This is largely because it is the de facto audio standard on mobile phones. AMR is optimized for speech, which means it is a low-quality, low-bandwidth, low-latency codec. AMR was not developed for music or high-quality audio recording or playback.
- FLAC — Free Lossless Audio Codec: FLAC is considered by many to be the better alternative to MP3. It compresses files to remarkably small sizes for a lossless format, and does so without any loss in audio quality. FLAC files are versatile and can be played back on a wide range of devices and software. And it’s open-source, which wins it all sorts of awards in my book, notably for the third-party implementations of the codec and its features that this openness makes possible.
- WAV — Waveform Audio File Format: WAV has been an industry workhorse for almost three decades. The secret of its staying power is the format’s simplicity and durability. WAVs generally offer some of the highest-quality uncompressed audio without need for transcoding. Its stability means that WAV files that have been damaged or corrupted will often still play back.
- ALAC — Apple Lossless Audio Codec: Released in 2004, the Apple Lossless Audio Codec, or Apple Lossless, supports eight channels of audio at up to a 32-bit depth and a maximum sample rate of 384 kHz. In 2011, Apple made ALAC open-source and royalty-free.
- AAC — Advanced Audio Coding: AAC was developed with the involvement of the Fraunhofer Institute, the same organization behind MP3 compression. With the expiration of the MP3 patents in 2017, the Fraunhofer Institute recommends AAC as its replacement codec. AAC is a lossy format whose main selling point is its significantly higher sound quality than MP3 at the same bit rates.
- DSD — Direct Stream Digital: DSD is a unique high-quality audio codec. The underlying technology behind DSD is a bit different from the standard Pulse Code Modulation found in most other codecs. DSD employs Pulse-Density Modulation encoding, which reduces the resolution of the bit stream to a single bit and raises the sample rate to roughly 2.8 million samples per second, to generate the audio signal. DSD has limited use in the audio world, serving largely as an audiophile codec to be played back on specialty hardware.
- Opus: Opus is the most modern codec to make this list. Released in 2012, it was developed to serve as a single standard for several applications where there were multiple before. Opus was created with the needs of the modern world in mind: central to its philosophy is high-quality, low-latency audio suitable for network communications and live music performances. Its latency can be reduced to as low as 5 ms, whereas many other codecs struggle to get below 100 ms.
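To make the contrast between PCM and DSD concrete, here is a rough Python sketch. The data-rate function is plain arithmetic on sample rate, bit depth, and channel count; the 2,822,400 Hz figure is the exact rate behind the "2.8 million" mentioned above (64 times the CD's 44,100 Hz). The `pdm_encode` function is a first-order delta-sigma modulator, a textbook simplification chosen for illustration. It is an assumption, not the modulator that actual DSD encoders use, and all names are hypothetical.

```python
def raw_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bits per second of an uncompressed stream."""
    return sample_rate_hz * bits_per_sample * channels


CD_PCM = raw_bit_rate(44_100, 16, 2)     # 16-bit stereo PCM: 1,411,200 bit/s
DSD64 = raw_bit_rate(2_822_400, 1, 2)    # 1-bit stereo DSD:  5,644,800 bit/s


def pdm_encode(samples):
    """Turn samples in [-1.0, 1.0] into a 1-bit pulse-density stream.

    Assumption: a first-order delta-sigma loop, used here for illustration only.
    The louder (more positive) the input, the denser the 1s in the output.
    """
    bits = []
    integrator = 0.0
    feedback = 0.0
    for x in samples:
        integrator += x - feedback            # accumulate the running error
        bit = 1 if integrator >= 0 else 0     # 1-bit quantizer
        feedback = 1.0 if bit else -1.0       # what the decoder will "hear"
        bits.append(bit)
    return bits


def ones_density(bits):
    """Fraction of 1s in the stream; maps back to (level + 1) / 2."""
    return sum(bits) / len(bits)


silence = pdm_encode([0.0] * 1000)
half_scale = pdm_encode([0.5] * 1000)

print(f"CD PCM: {CD_PCM} bit/s, DSD64: {DSD64} bit/s")
print(f"density at level 0.0: {ones_density(silence):.3f}")
print(f"density at level 0.5: {ones_density(half_scale):.3f}")
```

Each DSD sample carries a single bit, so the signal level lives in how tightly the 1s are packed rather than in the value of any one sample. That is why the sample rate has to be so much higher than PCM's.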
Each year brings new variations and flavors of new codec technology, but what should we be looking for in a new standard for mass audio distribution?
While the myriad of formats to choose from today might be overwhelming, the time is ripe to start setting expectations for the codecs to come. There is much to learn from codecs of the recent and more distant past: WAV’s reliable simplicity and universal functionality, FLAC’s fully lossless open-source model, and Opus’s optimization for voice and general audio. Technology has changed radically, so why shouldn’t our codecs?
A Dream Codec
Here’s what I’m looking for in an ideal future audio codec:
- Secure: Makes use of all modern security and cryptographic technologies.
- Open: Open-source and fully documented.
- Universal support: Can be recorded and played back without the need for new hardware or software.
- Fully lossless: Lightweight and high-quality.
- Multi-use: Optimized for voice and general audio use.
- High resolution: Support for highest possible recording resolutions.
While most of these ideal parameters exist in codecs today, no single codec has united all of these specifications in a meaningful way. Let’s hope the future is more open, more usable, more functional, and sounds great to boot.
In the coming decade, audio codecs that push the boundaries are going to become more commonplace. The underlying technology of encoding and decoding advances and evolves, merging with other sciences and disciplines as it goes.
I doubt that the audio world will ever see a single “industry-standard” codec again — I predict that audio formats will continue to develop into niches in much the same way as video codecs have. This will allow for greater usability and greater task specialization, which will make significant strides in streamlining the audio pipeline.
No matter what happens, the technological backbone of audio production and playback is past due for alignment. If current trends hold, there is a good chance the audio world will be unrecognizable in 10 years.
Cover image via gonin.