IAMF, short for "Immersive Audio Model and Formats", is a new open standard for immersive audio, often referred to as 3D audio. It was developed by Samsung and Google under the umbrella of the Alliance for Open Media, which also developed the AV1 video codec. Unlike AV1, however, IAMF is not a new codec; it is a codec-agnostic way to describe 3D audio content and its intended rendering.
Yet another format?
With many 3D audio formats such as MPEG-H, Dolby Atmos and DTS:X already around, the obvious question is why yet another format is needed. The big difference from the existing formats is that IAMF is open and royalty-free: there is no licensing cost associated with implementing the format or producing content in it. It is also not tied to a single vendor, nor limited to vendor-specific codecs for the audio data, so you can use open codecs such as Opus and FLAC, or uncompressed LPCM, in IAMF. The open nature of the format will hopefully also foster adoption across several vendors, eventually paving the way for easy distribution of 3D audio content to end users and making it easier for individual content creators to produce and distribute their creations.
IAMF was designed specifically for the 3D audio use case and allows greater flexibility than existing open formats. It can contain various kinds of 3D audio, be it multi-channel or Higher Order Ambisonics (HOA), and can even describe how these should be mixed for final playback. In practice, a single IAMF file can contain, for example, several 5.1.2 audio elements with spoken content in different languages plus a third-order ambisonic audio element with all other environmental sounds. These are rendered according to the desired final speaker layout and then mixed together as specified in the file. This stores the data efficiently and removes the need to create pre-rendered mixes for every possible language, simplifying production. It also means an IAMF file is not limited to a specific output layout: it can provide mixes for various speaker layouts, as well as stereo and binaural mixes for headphones.
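The idea of several audio elements combined by a selectable mix can be sketched in a few lines of Python. This is only an illustration of the concept, not the IAMF bitstream or any real library API: the names `AudioElement`, `MixPresentation` and `mix` are hypothetical, the gains are made up, and each element is assumed to have already been rendered to a single output channel to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class AudioElement:
    """One piece of audio content (hypothetical model, not the IAMF syntax)."""
    name: str
    kind: str             # illustrative label: "channels" or "ambisonics"
    samples: list[float]  # assumed to be already rendered to one output channel


@dataclass
class MixPresentation:
    """Which elements take part in a mix, and with which linear gain."""
    label: str
    element_gains: dict[str, float]


def mix(elements: list[AudioElement], presentation: MixPresentation) -> list[float]:
    """Sum the selected elements sample by sample, applying each element's gain."""
    chosen = [
        (element, presentation.element_gains[element.name])
        for element in elements
        if element.name in presentation.element_gains
    ]
    length = max((len(element.samples) for element, _ in chosen), default=0)
    out = [0.0] * length
    for element, gain in chosen:
        for i, sample in enumerate(element.samples):
            out[i] += gain * sample
    return out


# One shared ambience bed and two dialogue tracks stored side by side.
elements = [
    AudioElement("ambience", "ambisonics", [0.1, 0.2, 0.3]),
    AudioElement("dialog_en", "channels", [0.5, 0.0, -0.5]),
    AudioElement("dialog_de", "channels", [0.25, 0.25, 0.25]),
]

# Two mixes reuse the same ambience; only the dialogue element differs.
english = MixPresentation("English", {"ambience": 0.5, "dialog_en": 1.0})
german = MixPresentation("Deutsch", {"ambience": 0.5, "dialog_de": 1.0})

english_out = mix(elements, english)
german_out = mix(elements, german)
```

The point of the sketch is that the ambience is stored once and shared by both language mixes, which is why no pre-rendered mix per language is needed.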
Industry Adoption
While this all sounds good, the question remains whether IAMF will actually be supported by consumer devices and software. As mentioned before, the open and royalty-free nature of the format will hopefully help foster adoption across the industry. Google and Samsung are two big names already committed to the format and accelerating support across their platforms and products. The recently announced Eclipsa Audio is based on IAMF and will be supported by an upcoming Android release, Chrome and various devices, such as Samsung's new TVs and soundbars.
In the open-source landscape, FFmpeg gained support for muxing and demuxing IAMF in version 7.0, facilitating the creation of IAMF files. The open-source VLC media player will also support IAMF playback in the 4.0 major release, using the libspatialaudio rendering library. The next major FFmpeg release will also feature a libspatialaudio-backed filter that can process HOA and render it either to speaker layouts or binaurally for headphones, using a user-specified HRTF file.
For content creators who want to dive into IAMF themselves, there are tools and examples available at the IAMF Tools GitHub repository. Content creators might also already be familiar with ADM (Audio Definition Model), for which an extensive production suite exists, the EAR Production Suite (EPS). The resulting ADM content can then be converted to IAMF using the aforementioned IAMF tools. Of course, audio does not exist in a vacuum, so for final distribution IAMF will usually be muxed into ISO BMFF (MP4). This encapsulation is fully specified and is the primary way IAMF is intended to be distributed. It is also fully supported by FFmpeg already.
For projects interested in implementing IAMF support, the specification is freely available and there is also the open-source reference renderer, libiamf. Also of note is the libspatialaudio project, which can be used for various 3D audio rendering and mixing tasks. It allows for the real-time rendering of “sound objects” (i.e. mono files) via amplitude panning, decodes HOA to speaker layouts or binaural, and can map one speaker layout to another. The aim is to give developers access to immersive audio rendering in a simple way, without requiring in-depth knowledge of the techniques themselves. Unlike the libiamf reference renderer, the binaural rendering in libspatialaudio supports loading SOFA HRTF files, so a personalised or preferred HRTF can be used. This can further enhance the quality of the rendering for the listener.
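To give a feel for what amplitude panning means, here is a minimal sketch of the simplest case: placing a mono source between two speakers with a constant-power (sine/cosine) pan law. It is not libspatialaudio's API and makes no claim about its algorithm; a real renderer pans across full 3D speaker layouts. The function names `constant_power_gains` and `pan_mono` are made up for this example.

```python
import math


def constant_power_gains(pan: float) -> tuple[float, float]:
    """Return (left, right) gains for pan in [-1.0, 1.0].

    -1.0 is hard left, 0.0 is centre, +1.0 is hard right. The squared
    gains always sum to 1, so perceived loudness stays roughly constant.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be within [-1.0, 1.0]")
    angle = (pan + 1.0) * math.pi / 4.0  # map [-1, 1] onto [0, pi/2]
    return math.cos(angle), math.sin(angle)


def pan_mono(samples: list[float], pan: float) -> list[tuple[float, float]]:
    """Turn mono samples into (left, right) pairs using the pan law above."""
    left_gain, right_gain = constant_power_gains(pan)
    return [(left_gain * s, right_gain * s) for s in samples]


# A source placed halfway between centre and the right speaker.
stereo = pan_mono([1.0, 0.5, -0.5], pan=0.5)
```

Moving a source over time then amounts to recomputing the gains per block of samples; the same principle extends to more than two speakers by choosing gains for the speakers nearest to the source direction.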
And for those who are simply curious and eager to hear IAMF-delivered immersive audio today without special hardware, there is the IAMF Binaural web renderer demo. It runs libiamf compiled to WebAssembly in the browser and renders the IAMF files using EBU’s BEAR and a predefined HRTF for headphone listening. All you need for the demo is a pair of headphones, and you’ll be able to experience immersive audio yourself. (It will not work well with in-ear headphones.)
Conclusion
IAMF is a promising new format in the 3D audio landscape that might help make immersive audio experiences available to a wider audience by facilitating both creation and delivery. The addition of IAMF support to YouTube, Android, popular players like VLC and tools like FFmpeg is an important step in making it widely usable, and the open nature of the format will hopefully accelerate adoption further. Of course, only time will tell whether it delivers on its promise of immersive audio for the masses.
Jean-Baptiste Kempf & Marvin Scholz
Jean-Baptiste Kempf is the creator of the VideoLAN non-profit and a key figure behind VLC media player. Heavily involved in the open-source ecosystem for the past 20 years, he is the maintainer of dozens of open-source projects, has founded multiple startups in the multimedia and gaming space, has advised VCs and numerous startups, and has led large engineering teams at scale.
Marvin Scholz is a software developer from Germany who is interested in multimedia and contributes to various open-source projects, among them the popular VLC media player and the multimedia Swiss Army knife FFmpeg.