Almost 5 years ago, Google first released VP9, the royalty-free video codec that aimed to replace H.264 as the primary codec for online streaming and media consumption. While VP9 was not completely successful in that task, it has laid the foundation for Google’s next-generation codec, AOMedia Video 1 (AV1), which is looking extremely promising.
When VP9 was first released, there were substantial doubts about how it would fare against the upcoming HEVC codec, which was backed by the same groups that led to H.264’s popularity over On2’s TrueMotion VP3, Xiph’s Theora, Microsoft’s VC-1, and many others. And yet, here we are 5 years later, and VP9 has taken the world by storm. While HEVC has failed to find software support, with Edge being the only major web browser to support it (and even then, only on certain processors), VP9 is now baked into every modern web browser except for Safari, and its royalty-free nature has been a key factor in creating that situation.
In order to ship a product with HEVC support, you need to acquire licenses from at least four patent pools (MPEG LA, HEVC Advance, Technicolor, and Velos Media) as well as numerous other companies, many of which do not offer standard licensing terms and instead require you to negotiate them individually. All told, that can potentially cost hundreds of millions of dollars (and that’s after the recent drastic cuts to HEVC royalty fees). Those steep royalties were already problematic for products like Google Chrome, Opera, Netflix, Amazon Video, Cisco WebEx Connect, Skype, and others, but they completely exclude HEVC as an option for projects like Mozilla Firefox. That holds on an economic level (Firefox simply cannot afford to spend hundreds of millions of dollars on royalties and hundreds of man-hours negotiating all the necessary licensing agreements), on a practical level (Firefox needs to be royalty-free in order to ship in many FOSS projects), and on an ideological level (Mozilla believes in a free and open web, and that isn’t possible if you promote patent-encumbered standards).
Those issues prevented Firefox (and Chromium) from even including native H.264 playback on many platforms until a couple of years ago (with it still requiring a plugin on Linux), and will likely prevent Firefox from supporting HEVC until after its patents expire in the 2030s (or possibly even later). Even to this day, Firefox only supports H.264 natively thanks to Cisco offering to pay all of the licensing costs for Mozilla through OpenH264, in order to standardize H.264 for streaming across the market until the next-generation codec was ready.
And that opened the door for VP9. By being royalty-free, VP9 could be implemented on any platform or service that wanted it, and it is seeing substantial hardware acceleration support as well. YouTube uses it on any device that can support it, as the reduced bandwidth usage is a huge cost savings. Beyond YouTube, the WebM container (which holds VPx video alongside audio in either Opus or Vorbis) is replacing GIFs with substantially smaller silent videos on sites like Imgur and Gfycat, and it’s being used throughout Wikipedia. It has also been adopted by Skype (who were a driving force behind Opus’ development), and it’s even being adopted by Netflix (starting with their downloads for offline viewing, and moving to their regular streaming in the future).
However, VP9 alone was not enough. Google wants even better compression, especially for YouTube and Duo, where a tiny increase in video compression can result in huge cost savings and a major improvement in user experience. So Google put together a plan to rapidly update its VPx codec line, much as it does with Chrome and some of its other products. Google announced that it planned to release VP10 in 2016 and then release an update every 18 months to ensure a steady progression. Google had even started to release code for VP10 when it suddenly announced that VP10 was cancelled and formed the Alliance for Open Media (AOMedia).
Although HEVC and VP9 were the two most popular next-generation codecs, they weren’t the only ones. Cisco was developing Thor for use in their videoconferencing products, and Xiph was developing Daala (a codec designed to be substantially different from all previous codecs, in order to prevent any possibility of patent claims). All three codecs (Thor, Daala, and VP9/VP10) were looking quite promising, but the split efforts were stifling their development and adoption, so the three organizations came together, merged their codecs into one (AV1), and created the Alliance for Open Media to further the development and adoption of this joint codec. AV1 aims to take the best parts of each of those three codecs and merge them into a royalty-free package that anyone can implement.
While it is taking some time to merge Thor, Daala, and VP10 together, the first public beta for AV1 was released in mid-2016, the bitstream is expected to be finalized later this year, and it appears that the Alliance for Open Media is gearing up to promote AV1. Some of the involved developers are starting to give public talks on it (like this one at FOSDEM), and it appears that Google may be promoting it at Google I/O this week.
That support isn’t just coming from Google either. The Alliance for Open Media includes everyone from processor designers (AMD, ARM, Broadcom, Chips&Media, Intel, Nvidia, etc.) to browser developers (Google, Microsoft, and Mozilla) to streaming and videoconferencing services (Adobe, Amazon, BBC R&D, Cisco, Netflix, YouTube, etc.). Those companies are expected to bring their substantial strength to bear in rolling out AV1 support, with the first streaming services expected to be ready within just 6 months after the bitstream format is finalized, and the first hardware decoders expected to be ready within 12 months. That alone will bring substantial hardware support for AV1 fairly quickly. However, if everything lines up, we may even see partial hardware acceleration backported to some already existing hardware, like what happened with VP9, which would be a huge boost for compatibility.
Video streaming is a massive chunk of total internet traffic, and even a couple of percent improvement in compression can have massive effects both on the network as a whole and on the user experience for that specific application. AV1 and Opus will make it possible to have decent-quality video on lower-throughput connections (opening up video streaming for more situations and more markets), and will enable even better quality than before on high-throughput connections. Both are also designed with use over cellular networks in mind, bringing massive improvements in how well they scale as connection speeds change. On top of that, combining them in the WebM container will enable higher resolutions, higher frame rates, an expanded colour space, HDR support, and lower latency. HDR support in particular will be vital for services like Netflix, YouTube, and Amazon Video to take full advantage of the new displays on devices like the Samsung Galaxy S8 and the LG G6, with the latter now being able to take advantage of Netflix’s recently added HDR support on mobile.
Of course, the groups promoting HEVC won’t sit idly by while this is happening. They have already begun making threats about starting patent litigation against AV1 once it is released, and the Alliance for Open Media is going to great lengths to make sure that does not happen. It is performing an extensive legal code review of AV1 to make sure it does not infringe on any patents held by MPEG LA, HEVC Advance, Technicolor, Velos Media, and others. That form of code review was highly successful for VP8 and VP9, both of which survived all legal challenges. MPEG LA’s actions against VP8 and VP9 were seen as potentially not having any legal grounding and instead being purely anti-competitive. The DoJ was investigating MPEG LA’s actions until MPEG LA agreed to drop the lawsuit and give Google permission to sub-license MPEG LA’s patent pool to any users of VP8 or VP9. While we will likely see similar attempts at stopping AV1, Google’s substantially expanded patent pool and the substantially increased number of companies supporting the codec (thanks to the Alliance for Open Media) should both go a long way towards ensuring that those attempts are dealt with in short order.
It truly is exciting to see the improvements that AV1 is bringing to video encoding, especially since it is royalty-free. The massive support it is receiving (even before release) will mean great things for the future of video streaming and local recording as well. AV1’s improvements will bring better live casting of events, better video chatting (via WebRTC), smaller files for local storage, previously unheard-of quality for video streaming (such as high-quality 4K HDR while on a cellular network), and potentially other uses that we haven’t yet thought of, especially when paired with the improved speeds of 5G mobile networks and 802.11ax Wi-Fi. Best of all, AV1 is only the beginning. Google had plans for rapid releases for VPx in order to see constant improvements (with devices using the HTML5 video tag being served the highest-quality version that they support), and we may not have to wait very long before we see talk of an incremental update to AV2.
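The idea of serving each device the highest-quality version it supports can be sketched as a simple preference-ordered lookup. This is only a rough illustration in Python, not how any particular browser or streaming service implements it: the function name `pick_best_source`, the preference order, and the file names are all hypothetical. In a real page the browser does the equivalent work itself, walking the `<source>` children of a `<video>` element in order and using the first one it can play.

```python
from typing import Dict, Optional, Set

# Hypothetical preference order, newest (most efficient) codec first.
CODEC_PREFERENCE = ["av1", "vp9", "h264"]


def pick_best_source(available: Dict[str, str], supported: Set[str]) -> Optional[str]:
    """Return the file for the most-preferred codec the device can decode.

    `available` maps a codec name to the (hypothetical) file encoded with it.
    `supported` is the set of codec names the device reports it can play.
    Returns None when the two have nothing in common.
    """
    for codec in CODEC_PREFERENCE:
        if codec in available and codec in supported:
            return available[codec]
    return None


# Illustrative file names only.
sources = {
    "av1": "clip.av1.webm",
    "vp9": "clip.vp9.webm",
    "h264": "clip.h264.mp4",
}
```

Under these assumptions, a device that decodes only VP9 and H.264 is handed the VP9 file, and once it gains AV1 support it moves up to the AV1 file without any change on the publisher's side.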
What has your experience been like with current-generation codecs?