Making immersive audio for VR

Develop catches up with the VisiSonics team behind 3D spatial audio processing tech RealSpace 3D

Could you tell us a little about VisiSonics? You have an academic background, is that right?
Gregg Wilkes, Interim CEO and EVP, VisiSonics: Yes, we do. It all started about eight to ten years ago, when Ramani [Duraiswami, VisiSonics founder] and a couple of other scientists began working on a couple of projects. One was virtual audio, including conference calling, to help audio catch up with video, as audio was still some way behind back then.

As they were going down that path they took on a number of other projects, including enabling newly blinded soldiers to cross the street trusting their ears only. This forced the team to really understand how humans process sound, so that we could learn to recreate a full 360-degree immersive experience and then create algorithms to reproduce it over stereo headphones. The goal there was to simulate a street corner in a lab through headphones, and that’s where VisiSonics’ work really began. We were learning how to create audio cues not just to the left and right, but above, in front and behind.

People had been trying that for maybe 20 years, but nobody had truly got it down before.

That brought us to understanding head-related transfer functions, or HRTFs [see ‘What is an HRTF?’], and we actually started looking at several ways to capture individual HRTFs. We developed and patented a methodology and vehicle in the lab that enabled us to capture HRTFs from individuals in a matter of seconds, compared to the several hours it had taken previously.

Understanding how human beings process sound has allowed us to understand how to recreate it, and that’s let us build a vast library of HRTFs. We’ve also worked on capturing not only the initial sound, but the thousands of [reflections] that are bouncing off the environment. As that happens, you also have to be able to recreate that environment, including room sizes, wall and object textures, and, importantly, you also have to be able to track head movement. All of that is part of how your brain processes sound.

That’s where our audio expertise, algorithms and HRTF knowledge come from, and how we moved on to creating our Audio Video Camera, which is a video and sound capture device. That spherical array includes 64 microphones and five HD cameras, which means we can capture a complete 360-degree rendering of an environment, including the video and audio, and recreate that on a flatscreen, or on HMDs like the Oculus Rift, so wherever you look it is as if your head were inside that spherical camera.


As the company developed, how did gaming and virtual reality gaming appear on your radar as an opportunity?
Wilkes: As we were developing the camera, and as we were looking for applications for our algorithms, we looked at several things, and the first was gaming, where pseudo 3D audio technology had been in place for a number of years.

Then, as VR came into its own, through the introduction of Oculus and their Kickstarter campaign, we quickly saw more opportunity.

Now you find yourself working with Oculus, with your tech licensed in Crescent Bay. How did that come to be?
Wilkes: Around a year ago, when we were looking at the DK1, we also started developing our first plug-in, and we did that for Unity. We saw how important the portable experience is to gaming, and Unity kind of owns that space. That drove our direction then.

We were also noticing a lot of people have been through a 3D audio experience as it previously existed, and so there wasn’t a lot of credibility around it in the industry. That drove us to the folks on the virtual reality side, where audio is as imperative as visuals to the immersive experience, because if we can give them what they need, then traditional games people would also accept the technology for their games.

So we reached out to everybody we could in the business, and walked the floors of CES and E3 this year. Eventually people began to pay attention, and at GDC we got the opportunity to demo our technology. That catapulted us into some joint testing, and at some point in time a deal was put in place with Oculus. They wanted to create the best, easiest, most immersive experience and development set in their SDK, and part of that was to be VisiSonics’ RealSpace 3D audio, as part of that 3D spatialised experience.

How does a developer access and work with your RealSpace 3D tech? Will that only be via the Oculus SDK, or will we see more on the plug-in side of things?
Wilkes: It is both. Initially we built our plug-in for the Unity game engine, and we’ve now made that available as a beta to anybody that is currently using Unity Pro. Any developer can go to our website and download it for free and test it out. And if they have a game they are working on in Unity Pro today, they can implement or change the ‘stereo sound’ relatively quickly. We have folks change their sounds there in ten minutes.

The second iteration was then to provide a plug-in for the Wwise audio engine, since Wwise is fairly ubiquitous on the triple-A side. It’s a complement and contrast to Unity on the indie side, and helps us get exposure to a much broader audience.

After that, and this is almost completed, is the plug-in for FMOD Studio. We’re also working on Unreal.



And they’ll also be able to access your tech through the Oculus SDK in time?
Wilkes: Yes. The intent of the agreement was to take our technology as it stood and provide that to the Oculus team, so they can iterate and integrate it into their SDK, to make it available to the entire development community in the Oculus VR world.

And what kind of impact do you expect RealSpace 3D to make in VR games?
Ramani Duraiswami, founder and president, VisiSonics: In gaming right now sound is a sort of inferior cousin, if you will. It can be sort of added on, and it’s not allowed to take any CPU time. Part of the reason is that giving sound more CPU didn’t necessarily produce significant benefits in the past. However, now, with our engine, it is going to be possible to place sounds behind you, around you and above you, as long as the user is using headphones.


Our technology is completely headphone-based, and with VR I think there will be more and more people using headphones, and with the increase in mobile gaming there will also be more people there using headphones.
We want to give them a reason to use sound in their games, and through their headphones. We want to work with developers to help them understand this new palette they can paint with.

What pricing and licensing plans do you have for RealSpace 3D?
Wilkes: It’s been interesting to see the way pricing has evolved at the gaming level over the past couple of years, with what the folks at Unreal have done, and through what Unity has done. I don’t think we’re committed to any set way.

We’ve actually done several licensing deals, and each of them is a little bit different based on where that developer is going. I think that will continue to evolve over time as we move forward.

--

What is an HRTF?

A healthy human ear can’t just hear a pin drop; it can usually pinpoint exactly where it fell in the space around the listener, even in a room scattered with different objects and textures.

The human ear uses various audio cues to do this, and it is a remarkably complex business. The cues – such as the difference in a single sound to the left and right ears – offer the hearing system ways to estimate position in a 3D space.
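
To give a feel for how small these left/right differences are, here is a minimal sketch of one such cue, the interaural time difference, using the textbook Woodworth spherical-head approximation. It is an illustration only, not VisiSonics’ method; the head radius and speed of sound are assumed typical values, and the function name is hypothetical.

```python
import math

HEAD_RADIUS_M = 0.0875       # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0   # assumed speed of sound in air at room temperature


def interaural_time_difference(azimuth_deg):
    """Woodworth approximation of the arrival-time gap between the two ears.

    azimuth_deg is the source angle from straight ahead (0) to directly
    beside one ear (90). Returns the delay in seconds.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("this approximation covers 0 to 90 degrees only")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


if __name__ == "__main__":
    for angle in (0, 30, 60, 90):
        delay_ms = interaural_time_difference(angle) * 1000.0
        print(f"{angle:>2} degrees: {delay_ms:.3f} ms")
```

Under these assumptions, even a sound directly to one side reaches the far ear well under a millisecond late, which is why the brain needs further cues beyond this one.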

But sometimes the difference between the ears isn’t enough. Sounds can also be affected by the shape and position of a listener’s head, shoulders and ear canal. These changes are head-related transfer functions, and help position a sound.

In a 3D game world, these effects can be synthesised and applied to other sounds as algorithms to offer the player realistic 3D noises that can be positioned in a scene.
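
A common textbook way to apply such an effect is to convolve a mono sound with a pair of head-related impulse responses, one per ear. The sketch below shows that general idea in plain Python; it is not RealSpace 3D’s implementation, the function names are hypothetical, and the two impulse responses are invented toy values rather than measured HRTF data.

```python
def convolve(signal, kernel):
    """Direct-form convolution: returns len(signal) + len(kernel) - 1 samples."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(kernel):
            out[i + j] += sample * tap
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono source through a left/right impulse-response pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses (invented for illustration, NOT measured data).
# They mimic a source on the listener's left: the left ear hears it
# immediately at full level, the right ear two samples later and quieter.
HRIR_LEFT = [1.0, 0.0, 0.0, 0.0]
HRIR_RIGHT = [0.0, 0.0, 0.6, 0.0]

if __name__ == "__main__":
    click = [1.0, 0.5]  # a short mono test sound
    left, right = render_binaural(click, HRIR_LEFT, HRIR_RIGHT)
    print("left ear: ", left)
    print("right ear:", right)
```

A real engine would pick or interpolate measured responses for each source direction and update them as the head moves; this sketch fixes a single direction to keep the idea visible.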

“HRTF has almost been the excuse for people to say ‘my audio isn’t great because HRTF is a mythical thing; if only they could get it right, my game audio would be great’,” jokes VisiSonics’ founder Ramani Duraiswami. Now he and his colleagues are confident they can change that perception.
