Spatial or 3D Audio in Games

Principles and terms

The folds and ridges in the cartilage of our outer ears have an evolutionary basis. Sound waves reflect inside the ear differently depending on the direction from which they arrive. The perceived change in volume results from the constructive and destructive interference that occurs when the sound reaches the ear: different components of the sound arrive at different times, and the physical structure of the ear dampens or amplifies them in characteristic ways. The phase of the sound is determined by the path it takes to the ear canal. The phase differences and volume changes are different for each direction and also depend on frequency. The size and shape of the head also affect the differences that are perceived.
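The arrival-time difference between the two ears can be estimated with a few lines of code. The sketch below uses Woodworth's spherical-head approximation, which is a textbook simplification and not a model of any real ear. The head radius and speed of sound are assumed average values, and the function name is made up for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees Celsius
HEAD_RADIUS = 0.0875    # m, assumed average head radius; real heads differ


def interaural_time_difference(azimuth_deg, head_radius=HEAD_RADIUS):
    """Estimate the arrival-time difference between the ears in seconds.

    Uses Woodworth's spherical-head approximation:
        ITD = (r / c) * (theta + sin(theta))
    azimuth_deg is the source direction in the horizontal plane,
    0 = straight ahead, 90 = directly to one side (valid for 0..90).
    """
    theta = math.radians(azimuth_deg)
    return (head_radius / SPEED_OF_SOUND) * (theta + math.sin(theta))


for angle in (0, 30, 60, 90):
    itd_ms = interaural_time_difference(angle) * 1000.0
    print(f"azimuth {angle:2d} degrees: ITD of {itd_ms:.3f} ms")
```

With these assumed values, a source directly to one side reaches the far ear roughly 0.66 milliseconds later than the near ear, which is the kind of small cue the brain uses to judge direction.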

Stereophonic audio acts as a flat plane of sound, where distance and depth have to be simulated with volume changes, echoes, etc. to create the illusion of a three-dimensional sound field. In 3D audio, monaural (mono) sound files are placed in the field of hearing so that each source can be located precisely at a point on a sphere around the listener. Hence, the sound can appear to come from below, above, or the sides of the player. A mono source acts like a point-like sound source that, unlike stereo, does not have a wide audible field, and it can thus be placed and traced accurately among other sounds in a 3D space.


Picture 1. Ambisonics

The simplest way to describe spatial audio, also called 3D or 360 audio, is that it is a way of experiencing sound in which the auditory result changes with the movements of the head or the POV (point of view) of the spectator. It uses ambisonics, binaural audio, or modeling algorithms to reach the desired effect. Ambisonics is a multichannel surround-sound format that allows a full sphere of sound to be heard, so that sound sources can also be located below and above the listener, in addition to the traditional horizontal plane heard in two-channel stereo. In its basic form, it can be thought of as a three-dimensional extension of M/S (mid/side) stereo, with additional difference channels that define height and depth in the sound.
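To make the channel idea concrete, here is a minimal sketch of how a mono sample could be encoded into first-order ambisonics. It assumes the traditional B-format convention with four channels (W, X, Y, Z), where X points forward, Y to the left, and Z up. The function name is hypothetical, and other conventions use different channel orders and scaling.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into traditional first-order B-format.

    Assumed convention: azimuth is measured counterclockwise from straight
    ahead (positive = left), elevation upwards from the horizontal plane.
    Returns the tuple (W, X, Y, Z).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front/back difference
    y = sample * math.sin(az) * math.cos(el)  # left/right difference
    z = sample * math.sin(el)                 # up/down difference
    return w, x, y, z


# A source straight ahead ends up in W and X; a source overhead in W and Z.
print(encode_first_order(1.0, 0.0, 0.0))
print(encode_first_order(1.0, 0.0, 90.0))
```

The W channel carries the sound itself, while X, Y, and Z carry the directional differences. That is why the same four channels can later be decoded to headphones or to different speaker layouts.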


Picture 2. Binaural "dummy head"

The idea behind binaural recording is to use two microphones to create a three-dimensional sensation for the listener, who feels as if they were in the same room with the musicians. It does not translate well to a pair of speakers and requires headphones to reach the listener's ears as intended. To record binaural audio, one needs a "dummy head", a mannequin head with a microphone fitted in each ear. Examples of such binaural microphones are the Neumann KU-81 and KU-100. This technology can be used for the cinematic and precomposed soundtracks of games, but otherwise it does not offer as wide a range of applications as 3D software, at least in gaming. It is used in IMAX movies to offer the spectators a 3D experience on the audible level.


Picture 3. 3D audio effect (modeling algorithm)

When 360 video is matched with 360 audio, much of the storytelling behind the immersive experience can be done on the auditory level. If the audio has nothing of interest, or if the auditory cues do not move with the changing POV, the video alone tends to bore the viewer quickly. For those who do not want to use any of the aforementioned techniques to create the spatialized audio track, there is also a plethora of 360-degree microphones on the market in many price ranges.

PlayStation 5 and 3D audio

One of the most intriguing examples of three-dimensional audio currently comes from console gaming: the next-generation PlayStation 5, which Sony released in late 2020, has 3D Audio built in. Sony claims that the player can even detect individual raindrops landing in the different environments that have been created. 3D Audio is a highly immersive format because the player hears the sounds as if they were taking place all around them. It thus puts the player in the spotlight and adds a greater level of intensity to the gameplay.

In 3D audio, the sound engine uses audio algorithms to generate realistic and convincingly natural soundscapes in which the sounds and their sources can be identified very accurately. Sony's earlier PSVR headset was limited to 50 sound sources at acceptable quality, whereas the PS5 raises the bar to hundreds of sources at exceptional quality. Sound designers get many more creative tools to work with now that scenes can also be built around a highly interactive audio format.

The PS5 technology is object-based, which means that each sound source is handled as a separate object, so the sources must be recorded in mono for the individual sounds to remain clearly audible. At the moment, the three-dimensional audio of the PS5 is best enjoyed on headphones. Sony offers a dedicated and optimized Pulse wireless headset, but other headsets work as well when plugged into the headphone jack of the controller. Sony is concentrating first on delivering the sound to headphones, but speaker system applications are planned for the future.
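The object-based idea can be illustrated with a small data structure: a mono signal plus a position, from which a renderer derives direction and distance. This is only a sketch under stated assumptions, not Sony's implementation. The class, the function names, the axis convention (x forward, y left, z up, listener at the origin), and the inverse-distance gain rule are all chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    """Hypothetical audio object: mono samples plus a position in metres."""
    name: str
    samples: list
    position: tuple  # (x, y, z) with the listener at the origin


def direction_and_distance(obj):
    """Return (azimuth_deg, elevation_deg, distance_m) as seen by the listener."""
    x, y, z = obj.position
    distance = math.sqrt(x * x + y * y + z * z)
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.asin(z / distance)) if distance > 0 else 0.0
    return azimuth, elevation, distance


def distance_gain(distance, reference=1.0):
    """Inverse-distance attenuation, clamped so near sources are not boosted."""
    return reference / max(distance, reference)


raindrop = AudioObject("raindrop", [0.2, 0.4, 0.1], (0.0, 2.0, 0.0))
azimuth, elevation, distance = direction_and_distance(raindrop)
print(azimuth, elevation, distance, distance_gain(distance))
```

Because each object carries its own position, the renderer can treat hundreds of sources independently, which is not possible once the sounds have been mixed down to a stereo file.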


Picture 4. PlayStation 5

The Tempest Engine is a hardware unit in the PS5 that processes mono audio input signals and their coordinates into a 3D audio output signal. It uses an HRTF (head-related transfer function) to model the way an individual's ears receive sound. The chip and its built-in modeling algorithms reproduce how the sound changes with direction and frequency, and they take the aforementioned factors of hearing into consideration as well. The engine also takes the directivity of the sound into account and adjusts to the player's head movements, so that the audio moves when the player turns their head in a certain direction.
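In the time domain, applying an HRTF amounts to convolving the mono signal with a pair of head-related impulse responses (HRIRs), one per ear. The sketch below shows only that general principle and is not the Tempest Engine's algorithm. The impulse responses are made-up toy numbers, whereas real HRIRs are measured and contain hundreds of samples.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Return (left, right) headphone signals for one mono source."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses for a source on the listener's left (made-up values):
# the left ear hears the sound at once and louder, the right ear hears it
# two samples later and quieter.
HRIR_LEFT = [1.0, 0.3]
HRIR_RIGHT = [0.0, 0.0, 0.5, 0.2]

left, right = render_binaural([1.0, 0.5], HRIR_LEFT, HRIR_RIGHT)
print(left)
print(right)
```

A real renderer would pick or interpolate a measured HRIR pair for each source direction and would use fast convolution, but the two-filter structure stays the same.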

The engine makes this possible without an expensive surround sound system. There are plans to offer over 100 HRTF profiles that make it easier to optimize the sound settings. Everybody has an individual HRTF, but these profiles at least allow the player to get quite close to their own. According to the PS5's lead architect, Mark Cerny, the Tempest Engine can add dimensionality even to normal stereo audio.

PS5 games with 3D audio include, among others, Marvel's Spider-Man: Miles Morales, Marvel's Spider-Man Remastered, Astro's Playroom, Gran Turismo 7, Returnal, Destruction AllStars, Demon's Souls, Ratchet & Clank: Rift Apart, Sackboy: A Big Adventure, Horizon Forbidden West, and Resident Evil Village.

PC and other consoles

Nahimic (short for N Array Headphone Integrated MICrophone) started with military and industrial projects in 2004, gradually moved its scope to game audio and consumer electronics, and has since become a market leader in the PC-based gaming industry. The software addresses many everyday problems, such as the intelligibility of Skype or Zoom calls, the dialogue in movies streamed from Netflix, or the sound quality of music played from Spotify. It does not require expensive gear either, since it can be optimized for any kind of equipment, whether the user has headphones or just a stereo pair of speakers. It has won multiple awards for its design, which creates an immersive sound field for 360° video integration.


Picture 5. Nahimic 3

Another solution for creating a spatialized sound field is THX Spatial Audio, which is built into many gaming laptops, headsets, and smartphones. For the latter, there is even a dedicated app that costs around 20 dollars in the THX webshop. Many computers also support Dolby Atmos natively. In Dolby Atmos, height channels are added to the speaker array, so that the ear perceives the sounds from them as three-dimensional objects. Released in 2012, this surround sound technology has since become an industry standard in the movie business, and many high-end movie theaters have speaker systems dedicated to producing the sensation of 3D audio. The format is also supported by many smartphones.

Xbox Series X enables gaming in Dolby Atmos (as does Xbox One), DTS:X, or Windows Sonic. The Series X has a headphone implementation of Atmos, so an audiophile-level audio system is not needed to enjoy the immersive audio experience. DTS:X uses a similar technique and can be described as combining the Atmos approach with that of another immersive format called Auro-3D. Windows Sonic is free, headphone-optimized spatial sound software that is compatible with Windows 10 and later. It works on any headphones, which makes it a cheap solution for adding three-dimensional audio to games.

Nintendo Switch does not support spatial audio, but multichannel surround systems such as 5.1 are supported. This might change with Nintendo's upcoming console, though.

Apple Spatial Audio

As an answer to the growing interest in immersive audio, Apple has created its own concept. At the moment, Apple Spatial Audio can be enjoyed only through AirPods Pro or AirPods Max. It supports an array of applications, even though it lacks some of the most popular streaming services, such as Netflix, Amazon Prime, and YouTube. The idea is that the accelerometers and gyroscopes in the AirPods are used to track the user's head movements precisely. At the same time, the technology tracks the position of the phone or tablet used for watching and places the sound in relation to the screen of the device.

Mixing for 360 Audio

The foundation of spatialized audio and its mixing is that the listener should be able to pinpoint the direction and location of the sound source while listening. In 360 audio, the player's head movements must additionally be taken into account, since they change the audible cues.
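One common way to follow head movements is to rotate the whole sound field in the opposite direction before it is rendered to headphones. The sketch below does this for a first-order ambisonic signal with the channels W, X, Y, and Z, assuming X points forward, Y to the left, and Z up, and that a positive yaw means the head turns to the left. The function name and conventions are illustrative and not taken from any specific tool.

```python
import math


def rotate_yaw(w, x, y, z, head_yaw_deg):
    """Compensate a first-order ambisonic sample for a head turn.

    When the head turns left by head_yaw_deg, every source appears to move
    right by the same angle, so the X/Y components are rotated accordingly.
    W (omnidirectional) and Z (height) are unaffected by a pure yaw turn.
    """
    a = math.radians(head_yaw_deg)
    x_rot = x * math.cos(a) + y * math.sin(a)
    y_rot = -x * math.sin(a) + y * math.cos(a)
    return w, x_rot, y_rot, z


# A source straight ahead (X = 1, Y = 0). After the head turns 90 degrees
# to the left, the source should be heard on the right (negative Y).
print(rotate_yaw(0.707, 1.0, 0.0, 0.0, 90.0))
```

Because the rotation is applied to the encoded field rather than to each source, its cost does not grow with the number of sounds in the mix.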

DAWs (Digital Audio Workstations) like Pro Tools already support ambisonic and VR/360 audio in their Ultimate version. There is also a free plugin for audio spatialization called Ambisonic Toolkit (ATK) for Reaper, which is a low-cost sequencer workstation for Windows and Mac users alike. Reaper even comes with a 60-day free evaluation period. Ableton Live also has its own free tool for VR, AR, and spatial 3D audio called Envelop for Live, or E4L. It already works in Ableton Live 10, the second-newest release of the DAW.


Picture 6. Facebook 360 Spatial Workstation + Pro Tools

Facebook 360 is another interesting addition to the toolkit of immersive audio designers. The software allows one to create spatialized audio for 360 videos and cinematic VR alike. It is free of charge and its plugins are supported by several popular DAWs, but in Pro Tools it can only be used with a suitable license. The company Gaudio Lab also offers many free solutions for different platforms.

In Pro Tools, for example, the mixing can be done with FB360, which has several plugins that automate the output to binaural audio. The user imports the sound files and spatializes the sources according to the action that takes place in the video. The plugin allows the movements to be tracked in real time and automates the output, which makes it extremely handy. FB360 offers a high level of parameter automation, although complex audio processing and spatialization projects need further manual adjustment to create a natural-sounding environment.

Summary

Three-dimensional audio is increasingly finding its way into the gaming experience. Gamers want to experience games on a more immersive level, and the leap from the flat plane of stereo sound into an environment where individual audio cues can be traced to their point of origin is staggering. Because it also allows a deeper sense of interaction, where the player is an active explorer of sound sources, one can say that 3D audio is here to stay.


Picture 7. Apple Spatial Audio

Consoles as well as gaming laptops have adopted this audio format very quickly, to the gamers' delight, and the solutions will continue to advance as the computing power of processor chips reaches new heights in the coming years. The multitude of different plugins and platforms is striking, and a universal format will therefore be needed in the future to add synergy and flow to the processing.

From the sound designer's point of view, spatial audio offers tools to enhance the narrative on the audible level, making sound an independent and rich layer of storytelling. Mixing for 360 audio is made relatively easy by many DAW plugins, which allow a high level of automation and are easy to get started with.



References

Allen, Jennifer. (2020). What is Windows Sonic for Headphones?. Retrieved from
https://www.lifewire.com/what-is-windows-sonic-for-headphones-4776289.

Ambisonics. (2021). Retrieved April 21, 2021, from
https://en.wikipedia.org/wiki/Ambisonics.

Avid Pro Tools. (2021). Retrieved April 22, 2021, from
https://www.avid.com/pro-tools.

Binaural Recording. (2021). Retrieved April 21, 2021, from
https://en.wikipedia.org/wiki/Binaural_recording.

Facebook360 Spatial Workstation. (2021). Retrieved April 22, 2021, from
https://facebook360.fb.com/spatial-workstation/.

Gaudio Labs. (2021). Retrieved April 22, 2021, from
https://gaudiolab.com/.

Hernandez, Robert. (2016). Spatial audio: How to record for VR. Retrieved from
https://medium.com/@webjournalist/spatial-audio-how-to-hear-in-vr-10914a41f4ca.

Kaye, Kieran. (2017). WTF Is VR and Spatial Audio? Mixing for 360 Audio. Retrieved from
https://flypaper.soundfly.com/produce/wtf-vr-and-spatial-audio-mixing-360-audio/.

Kirn, Peter. (2018). Free Tools for Live Unlock 3D Spatial Audio, VR, AR. Retrieved from
https://www.ableton.com/en/blog/free-tools-live-unlock-3d-spatial-audio-vr-ar/.

Parsons, Tom. (2020). Apple spatial audio: what is it? How do you get it?. Retrieved from
https://www.whathifi.com/advice/what-is-apple-spatial-audio.

Reaper by Cockos. (2021). Retrieved April 22, 2021, from
https://www.reaper.fm/.

Pictures:

Picture 1. Ambisonics. Retrieved from https://new.steinberg.net/nuendo/virtual-reality/.
Picture 2. Binaural "dummy head". Retrieved from https://en-de.neumann.com/ku-100.
Picture 3. 3D audio effect. Retrieved from https://www.wikiwand.com/en/3D_audio_effect.
Picture 4. PlayStation 5. Retrieved from https://www.playstation.com/fi-fi/ps5/.
Picture 5. Nahimic 3. Retrieved from https://www.nahimic.com/gamers/.
Picture 6. Facebook 360 Spatial Workstation + Pro Tools. Retrieved from https://facebook360.fb.com/2017/10/18/spatial-workstation-pro-tools-hd-12-8-2/.
Picture 7. Apple Spatial Audio. Retrieved from https://www.patentlyapple.com/patently-apple/2020/09/apple-patent-reveals-their-work-on-a-spatial-3d-audio-engine-that-will-take-vr-gaming-to-the-next-level.html.
