Codecs and Containers
While one half of the battle was won when browsers began supporting audio and video directly with inline players, the other half, which formats to use, had only begun. While browser manufacturers have largely coalesced around a few specific formats, there's still enough fragmentation to warrant a page. I'll be going over the basic terminology of formats and support and then try to point you in the direction of formats to use and software to generate files in those formats.
(To keep myself from having to say "audio and video" over and over, I'll be referring only to audio for most of this page. Video works identically; just apply the principles to visuals instead of audio. The terminology is exactly the same.)
In the simplest terms, a codec is a way to encode audio so that it's smaller and easier to play back. The name is short for "coder-decoder": a codec compresses audio and then provides the way to decompress it back into something playable. Codecs are necessary because raw audio and raw video are both incredibly large and take a lot of horsepower to play back. In fact, even when you take video with your phone or a camcorder, it's encoded in some kind of format; otherwise, it would quickly fill up the device's storage.
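To put a number on "incredibly large", here's a rough sketch of the arithmetic for uncompressed audio. The figures assume CD-quality stereo (44,100 samples per second, 16 bits per sample); those parameters are my choice for illustration, not something every raw file uses.

```python
# Back-of-the-envelope size of uncompressed (PCM) audio.
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel (CD quality, assumed)
BYTES_PER_SAMPLE = 2      # 16-bit samples
CHANNELS = 2              # stereo


def raw_audio_bytes(seconds: float) -> int:
    """Size in bytes of uncompressed audio of the given length."""
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)


four_minute_song = raw_audio_bytes(4 * 60)
print(f"{four_minute_song / 1_000_000:.1f} MB")  # 42.3 MB
```

That's over 40 MB for a single song before any codec gets involved.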
There are two types of codec: lossy and lossless. (If you remember the page on images, these are very similar.)
Lossless codecs simplify down data. They often use a prediction algorithm to guess at what the next section of audio will sound like and store only what it got wrong in its initial guess. Lossless codecs include FLAC, Apple Lossless (or ALAC), and WavPack.
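Here's a toy sketch of that "predict, then store the mistakes" idea. The predictor below simply guesses that each sample equals the previous one, which is an assumption I've made to keep things short; real codecs like FLAC use far better predictors and then pack the leftover errors much more tightly.

```python
def encode(samples: list[int]) -> list[int]:
    """Store each sample as the error of a 'same as the last sample' guess."""
    residuals = []
    previous = 0  # the predictor starts from silence
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample
    return residuals


def decode(residuals: list[int]) -> list[int]:
    """Rebuild the original samples by replaying the guesses and corrections."""
    samples = []
    previous = 0
    for error in residuals:
        previous += error
        samples.append(previous)
    return samples


wave = [100, 102, 105, 107, 108, 108, 106]
residuals = encode(wave)
print(residuals)                  # [100, 2, 3, 2, 1, 0, -2]
print(decode(residuals) == wave)  # True
```

After the first sample, the numbers that need storing are all tiny, and decoding gets back exactly what went in. That's the "lossless" part.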
Lossless codecs, while still highly efficient in their own right, are mostly unusable for internet streaming; they're simply too big. Thus, you'll usually see lossless audio used as a way to archive CDs and transferred analog formats to maximize quality.
Lossy codecs, on the other hand, carefully remove data to reduce the file size. These are highly tuned to the weaknesses of the human ear and attempt to remove the audio we're least likely to notice missing. Lossy codecs include MPEG AAC, Vorbis, and the venerable MP3.
Lossy codecs, by their nature, are destructive to audio. However, thanks to their ability to reduce file sizes to about 5% of the raw file size, they're highly suited for internet streaming, WebRTC, and quick downloads. The audio you embed on your site will more than likely be in a lossy format. There's nothing preventing you from posting FLAC files to a site, of course; they'll just be a much bigger download, and you'll have to make the call on whether the boost in audio quality is worthwhile to you and your visitors.
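If you'd like to check that percentage yourself, the sketch below compares a few lossy bitrates against raw CD-quality audio. The three bitrates are ones I picked as typical examples; the actual savings depend on the codec and the settings you choose.

```python
# Raw CD-quality bitrate: 44,100 samples/s x 16 bits x 2 channels.
RAW_CD_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbps


def percent_of_raw(lossy_kbps: float) -> float:
    """How big a lossy stream is, as a percentage of the raw stream."""
    return 100 * lossy_kbps / RAW_CD_KBPS


for kbps in (64, 96, 128):  # example bitrates, chosen for illustration
    print(f"{kbps} kbps -> {percent_of_raw(kbps):.1f}% of raw")
# 64 kbps -> 4.5% of raw
# 96 kbps -> 6.8% of raw
# 128 kbps -> 9.1% of raw
```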
After the audio's been encoded, it is stored in a container format. Containers are used to group multiple audio, video, and informational (like closed captioning) streams into a single file for ease of playback.
Matroska is an excellent example of a popular container format for large video files. Because of its support for metadata, multiple selectable audio, video, and caption streams, error resilience, and chapters like on DVDs and Blu-rays, Matroska is very commonly used to distribute movies over the internet in the .mkv format.
Sometimes, when we think of an audio format, we're actually thinking of its container. As an example, if you've ever bought music from iTunes, you've likely seen an .m4a file. The audio is encoded in AAC, and it's stored in an MPEG-4 (MP4) container; .m4a is the extension used for audio-only MPEG-4 files.
Common audio codec/container pairings
Does the distinction between container and codec matter? When it comes to web audio and video, unfortunately, yes. Sometimes, containers and codecs are essentially one and the same; the venerable MP3 nearly always comes in an MPEG-1 container. Other times, the same audio format can be stored in a variety of containers, and some browsers that can technically play an audio format won't play it unless it's in a specific container. Thankfully, you don't need to memorize which combination plays where. I'll try to boil things down to specific use cases.
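One place the container/codec split becomes visible is in MIME types, where the container is the type itself and the codec rides along in an optional codecs parameter. The helper below is a small sketch; source_type is a name I've made up for illustration, while the type strings it produces are standard ones.

```python
from typing import Optional


def source_type(container_mime: str, codec: Optional[str] = None) -> str:
    """Build a MIME type string, naming the codec when one is given."""
    if codec is None:
        return container_mime
    return f'{container_mime}; codecs="{codec}"'


print(source_type("audio/mpeg"))           # audio/mpeg
print(source_type("audio/ogg", "vorbis"))  # audio/ogg; codecs="vorbis"
print(source_type("audio/ogg", "opus"))    # audio/ogg; codecs="opus"
```

Notice that the last two share a container (Ogg) but hold different codecs, which is exactly why a browser might play one and refuse the other.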
For no-questions-asked support: MP3 + MPEG-1
MP3 is the absolute champion of the consumer audio world, and support is excellent everywhere: dedicated portable devices like the iPod and (of course) MP3 players, home units like stereos and smart TVs, and computers. Nearly 98% of users worldwide use browsers that support MP3 playback. Even better, the last of the patents protecting MP3 expired in 2017, and open-source software and operating systems began to include MP3 support out of the box.
As for the container, MP3s commonly come in the MPEG-1 container, with the .mp3 file extension. If you're looking for the "set it and forget it" option, use MP3. Just about any audio encoder frontend supports encoding MP3s, if you need software that can do it.
- Easy Windows encoding: LameDropXPd
- Easy Mac encoding: X Lossless Decoder
- Command line encoder (various platforms and architectures): RareWares compiles
- Recommended encoder settings: "LAME" on hydrogenaud.io
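If you go the command line route, a call to the LAME encoder might look like the sketch below. The file names are placeholders, and the quality value of 2 is only an example I picked; check the recommended settings linked above before settling on one.

```python
def lame_command(wav_path: str, mp3_path: str, vbr_quality: int = 2) -> list[str]:
    """Build a LAME command line.

    -V picks a variable bitrate quality level: 0 is the highest quality,
    9 gives the smallest files.
    """
    return ["lame", "-V", str(vbr_quality), wav_path, mp3_path]


command = lame_command("song.wav", "song.mp3")  # placeholder file names
print(" ".join(command))  # lame -V 2 song.wav song.mp3
```

Handing that list to subprocess.run(command, check=True) would do the actual encoding, provided lame is installed and on your PATH.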
Compromising between support and open-source: Vorbis + Ogg
Vorbis is a curious competitor to MP3 that bests it in both sound quality at the same bitrate and in licensing, as it's been free and open-source from its inception in the early 2000s. This makes it extremely popular to implement in both game engines (Minecraft, Grand Theft Auto: San Andreas, and World of Warcraft being three prominent examples) and in streaming applications like web radio and Spotify.
Vorbis's support is good, but not universal. No version of Safari or Internet Explorer supports playing Vorbis files in the browser. This means for sites streaming to iPhones, Vorbis is a no-go. Otherwise, Chrome, Firefox, and Edge have all supported it for a long while. You can expect around 81% global support for Vorbis streams.
- Easy Windows encoding: oggdropXPd
- Command line encoder (various platforms and architectures): RareWares compiles
- Recommended encoder settings: "Recommended Ogg Vorbis" on hydrogenaud.io
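For the command line encoder, oggenc, the equivalent sketch is below. Vorbis quality runs from -1 (smallest) to 10 (best); the value of 5 and the file names are placeholders I chose, so consult the recommended settings above for your own material.

```python
def oggenc_command(wav_path: str, ogg_path: str, quality: int = 5) -> list[str]:
    """Build an oggenc command line; -q sets quality, -o names the output file."""
    return ["oggenc", "-q", str(quality), "-o", ogg_path, wav_path]


command = oggenc_command("song.wav", "song.ogg")  # placeholder file names
print(" ".join(command))  # oggenc -q 5 -o song.ogg song.wav
```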
For fast, small files: Opus + Ogg
As of this writing, Opus is at the cutting edge of lossy audio. Opus is a combination of a speech codec (SILK, most prominently used by Skype) and a general-purpose audio codec (CELT), both heavily optimized for low-delay streaming applications like voice chat. Opus also sounds excellent at low bitrates, achieving transparency (the quality level where the difference between the encoded file and the original is imperceptible) at a much lower bitrate than Vorbis or MP3. In short: small files, and they stream fast.
Like Vorbis, Opus support is good, but not great. While Safari (on High Sierra and later only) nominally supports Opus, it does so only in Apple's own Core Audio Format container, not in the more traditional Ogg container (with a .opus extension). Chrome, Firefox, Edge, and Opera all play Opus in Ogg just fine. For incredibly small, good-sounding audio files that stream without much overhead, there's nothing out there more suitable than Opus.
- Command line encoder (various platforms and source code): opus-tools
- Recommended encoder settings: My evidence is anecdotal, but speech is transparent at around 32 kbps and music at around 96 kbps, especially from a lossless or uncompressed source. The official Opus site offers a selection of audio samples in case you'd like to listen and decide for yourself.
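As a sketch, here's how those two bitrates could be fed to opusenc, the encoder that ships with opus-tools. The "speech" and "music" labels and the file names are my own placeholders; the bitrates are the anecdotal figures from the list above.

```python
# Anecdotal transparency points from the list above, in kbps.
BITRATE_KBPS = {"speech": 32, "music": 96}


def opusenc_command(wav_path: str, opus_path: str, content: str = "music") -> list[str]:
    """Build an opusenc command line with a target bitrate in kbps."""
    return ["opusenc", "--bitrate", str(BITRATE_KBPS[content]), wav_path, opus_path]


print(" ".join(opusenc_command("podcast.wav", "podcast.opus", "speech")))
# opusenc --bitrate 32 podcast.wav podcast.opus
```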
Common video codec/container pairings
The container matters more when referring to video; often, we refer to the container more than the video codec itself (see .mov and the aforementioned .mkv). While this list isn't exhaustive, these are the two major web streaming formats supported and in use today.
Most commonly supported: H.264 + MP4
H.264 is unquestionably the industry standard, being the most common encoding on Blu-ray discs and used in streaming by everything from Netflix, Hulu, and YouTube to even terrestrial TV stations. There's no question that your users can play an H.264 video. The main issue comes down to licensing; H.264 is not a free format to implement into products, though there are free encoders out there, and H.264 can be streamed freely over the internet.
- Easy encoding (various platforms): HandBrake
- Command line (various platforms and source code): VideoLAN's x264
- Recommended encoder settings: "Adjusting quality" in HandBrake's docs
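If you'd rather script it, ffmpeg can drive x264 for you. The sketch below assumes an ffmpeg build that includes libx264; the file names are placeholders, and the CRF value of 23 is simply x264's default, not a recommendation, so tune it against the quality guidance above.

```python
def h264_command(source: str, output: str, crf: int = 23) -> list[str]:
    """Build an ffmpeg command line for H.264 video and AAC audio in MP4."""
    return [
        "ffmpeg", "-i", source,
        "-c:v", "libx264",          # H.264 video via x264
        "-crf", str(crf),           # constant quality; lower means better and bigger
        "-preset", "medium",        # speed versus compression trade-off
        "-c:a", "aac",              # AAC audio
        "-movflags", "+faststart",  # put the index up front so playback starts sooner
        output,
    ]


print(" ".join(h264_command("clip.mov", "clip.mp4")))
```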
Open-source in recent browsers: VP8 + WebM
Of course, if you're a stickler for open-source, royalty-free formats, Google's VP8 (better known by its container, WebM) is what you want. VP8's quality is generally on par with H.264, and WebM support is fairly good, if not universal. WebM can carry both VP8 and VP9 video, the latter being the next-gen, more-optimized-for-HD (though slightly less well-supported) successor to VP8.
- Easy Windows encoding: Axiom, an ffmpeg frontend
- Command line encoder (various platforms and source code): ffmpeg
- Recommended encoder settings: Encode/VP8 in ffmpeg's docs
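For reference, a basic VP8 encode through ffmpeg looks roughly like the sketch below. It assumes an ffmpeg build with libvpx and libvorbis, the file names are placeholders, and the quality and bitrate numbers are starting points only; ffmpeg's VP8 docs explain how to tune them.

```python
def vp8_command(source: str, output: str, crf: int = 10, max_bitrate: str = "1M") -> list[str]:
    """Build an ffmpeg command line for VP8 video and Vorbis audio in WebM."""
    return [
        "ffmpeg", "-i", source,
        "-c:v", "libvpx",      # VP8 video
        "-crf", str(crf),      # quality target; lower means better quality
        "-b:v", max_bitrate,   # bitrate ceiling the quality target works within
        "-c:a", "libvorbis",   # Vorbis audio
        output,
    ]


print(" ".join(vp8_command("clip.mov", "clip.webm")))
# ffmpeg -i clip.mov -c:v libvpx -crf 10 -b:v 1M -c:a libvorbis clip.webm
```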
Go with multiple formats!
Remember that there's nothing preventing you from picking two or more formats and offering them in a single audio or video element. If the browser can't play one format, it'll try the next. Thus, it's ideal to pair a codec that produces small files, for browsers that can play it, with a better-supported legacy format for those that can't.
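To make the fallback idea concrete, here's a sketch that writes out the markup for an audio element with several sources. The audio_element helper and the file names are hypothetical; the order of the list is what matters, since the browser uses the first source it can play.

```python
from html import escape


def audio_element(sources: list[tuple[str, str]]) -> str:
    """Return an <audio> element listing (url, type) sources in order of preference."""
    lines = ["<audio controls>"]
    for url, mime in sources:
        lines.append(f'  <source src="{escape(url)}" type="{escape(mime)}">')
    lines.append("  Your browser does not support the audio element.")
    lines.append("</audio>")
    return "\n".join(lines)


print(audio_element([
    ("song.opus", "audio/ogg; codecs=opus"),  # small file for browsers that play Opus
    ("song.mp3", "audio/mpeg"),               # well-supported fallback
]))
```

Here the small Opus file goes first and the MP3 comes last, so browsers that can't handle Opus quietly fall through to the format nearly everything supports.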
- Audio and video online come down to two parts: the codec used to encode the audio or video, and the container used to store it.
- Codecs are necessary for reducing the size of the file. Lossless codecs simplify the data without destroying it, while lossy codecs throw out data human senses are unlikely to notice.
- Lossless codecs are best suited for archiving audio for later playback and conversion to other formats, while lossy codecs are best suited for internet streaming and fast downloads.
- Three codecs are commonly used for audio: MP3, Vorbis, and Opus.
- MP3 is supported by all major browsers and is now patent-free. While its files are larger and lower quality than those of newer codecs, it's a good, safe option all around.
- Vorbis is an open-source competitor to MP3. About 80% of users have a browser that supports it; no version of Safari or Internet Explorer does.
- Opus is one of the newest audio codecs: fast, tiny, and highly optimized for streaming. Full support is generally on par with Vorbis; Safari only supports Opus through Apple's Core Audio Format container, and only on recent Macs.
- Video formats come down to two main choices: H.264 and VP8.
- H.264 has industry-wide support, but is encumbered by patents. Thankfully, this doesn't apply to those streaming H.264, and free encoders are available online.
- VP8 is completely open-source and its quality is on par with H.264, though it's less widely supported.
- Ideally, you should encode your audio or video in at least two formats for maximum support.