John Broomhall asks PlayStation audio guru Jason Page for the facts about PS Vita’s audio capabilities

Heard About: PS Vita

In his role as Senior Audio Manager in SCEE’s R&D department, Jason Page lives at technology’s bleeding edge.

Having headed up the team behind PS3’s fundamental audio tech and tools, you’d think he might have been able to kick back a little.

But long before us, he knew about an exciting development in portable gaming technology that could be a game-changer for handheld audio.

Please can you describe your current role at Sony and explain your involvement in the audio side of PS VITA?
My current role is as the manager of the audio department in SCEE’s R&D division.

We handle both developer support (mainly via the DevNet support sites, but on-site if requested), as well as developing a number of audio tools and libraries across the PlayStation platforms.

Some of these projects have had a higher profile than others – MultiStream on PS3, for example, which was our first project released within an official PlayStation SDK.

For PS Vita, my team was tasked with the responsibility of programming the main "game audio" synthesizer.

How long has it been in development and when was audio first considered?

We were initially involved in discussions with a number of other SCE departments three years ago.

This included SCE R&D staff, as well as World Wide Studios (internal game development and R&D) staff from Europe, US and Japan.

As a whole, we formed an "audio working group" who were responsible for designing the PS Vita game audio capabilities.

Programming started around three months after that time. So with that in mind, I think that audio was considered from an early stage. At this time, the final hardware specifications were largely still unknown to us.

However, from what we did know, we had an idea that it should be capable of "next-gen" style audio – which developers had already been used to harnessing on the PS3 for example.

How did Sony go about deciding what VITA’s capabilities for audio should be? What criteria were driving it?
The audio working group designed a synthesizer based on their collective experiences from PS3 audio development.

There was a wide range of requirements from the World Wide Studio teams. So we needed to ensure that the design would be capable of meeting the audio needs of many existing PS3 games, such as the Uncharted series, SingStar, MotorStorm and Gran Turismo.

We also considered the needs of third-party developers. From working on the developer support side, we had a lot of information regarding how developers handled audio within their games.

This included things like the minimum number of expected audio channels, which DSP effects are a priority (and now considered the "norm" for current gen games), codec requirements, average audio asset memory footprints and the like.

We’ve also held a number of audio-based surveys on our DevNet developer support sites, which have given us a good insight into developer audio requirements.

Hopefully developers will start to realise that we do listen to their suggestions and criticism from such surveys – they can help shape the future too.

Once we had collectively decided upon audio features and API, my team went ahead with development – initially on PC, until we had access to early PS Vita hardware. We decided to call the synth "NGS".

At this stage, we also contacted a number of middleware partners to gauge their initial feedback to NGS’s features and API.

We wanted to ensure that NGS met their requirements too, both to help them make audio on PS Vita sound great from day one, and to make sure that we’ve given them the flexibility that they knew their customers required, especially regarding aspects like DSPs and signal routing flexibility.

This, in turn, gave us a good idea that what we were developing would meet the needs of "many" developers.

Of course, there were always going to be the edge cases that required something completely different. But hopefully there are far fewer of those this time around, compared to PS3 MultiStream.

So what’s under the hood that you can talk about? What is it, in fact, capable of?
There are two sides to the audio processing on PS Vita: the ARM processing and the codec engine processing.

The ARM side processing is also where the main game is processed, so it has been very important to keep CPU use to a minimum here.

The codec engine is where the serious audio processing happens – mixing, resampling, DSP processing, and such like. I can’t go into too much detail regarding the hardware, but suffice to say, it’s very powerful.

Developers are not allowed direct access to the codec engine, as with the PSP’s ‘Media Engine’. As such, we had to ensure that the design of Next-Gen Synth took this into consideration.

We needed to allow the synth to be configurable – synth DSP module routing and bus routing, for example – and we made sure that we’d include the high-priority DSP effects as standard, as it wouldn’t be possible for developers to write their own on the codec engine.

This is another major reason why we needed feedback from middleware developers at an early stage.
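As a rough illustration of what configurable bus routing means in general, the sketch below mixes two blocks of voice output on one bus, applies an effect, and routes the result into a master bus. It is not the NGS API, which is not described in this interview; the `Bus` class, the `gain` helper and the bus names are all hypothetical, and Python is used purely for readability.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Block = List[float]
Effect = Callable[[Block], Block]


@dataclass
class Bus:
    """Hypothetical mixing bus: sums its inputs, then applies effects in order."""
    name: str
    effects: List[Effect] = field(default_factory=list)
    inputs: List["Bus"] = field(default_factory=list)
    pending: List[Block] = field(default_factory=list)

    def send(self, block: Block) -> None:
        """Queue one block of voice output to be mixed on this bus."""
        self.pending.append(block)

    def render(self, frames: int) -> Block:
        mix = [0.0] * frames
        # Pull audio from any buses routed into this one, plus queued voices.
        sources = [bus.render(frames) for bus in self.inputs] + self.pending
        for block in sources:
            for i, sample in enumerate(block[:frames]):
                mix[i] += sample
        self.pending = []
        # Run the bus's effect chain over the mixed block.
        for effect in self.effects:
            mix = effect(mix)
        return mix


def gain(amount: float) -> Effect:
    """Return an effect that scales every sample by `amount`."""
    return lambda block: [sample * amount for sample in block]


# Routing: voices -> "sfx" bus (halved in level) -> "master" bus.
sfx = Bus("sfx", effects=[gain(0.5)])
master = Bus("master", inputs=[sfx])
sfx.send([1.0, 1.0])
sfx.send([0.5, 0.0])
output = master.render(2)
```

Changing the routing here means changing only the `inputs` and `effects` lists, which is the kind of flexibility a fixed set of built-in effects needs in order to be useful.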

It’s probably worth noting that developers can just write their own synth on the ARM processors if they wish.

However, NGS’s use of the codec engine allows for far more processing to be available for game (graphics and AI) use.

How about memory and storage for audio?
There’s no actual audio memory (similar to both PS3 and PSP), and we’ve got a number of codecs whose compression ratios allow for memory footprints comparable with those of current-gen games.

SCEI R&D implemented a new codec into PS Vita that was developed in the Sony group. This allows for compression ratios that are comparable with those of MP3.

The PS Vita audio codec is an improvement over the PSP’s codec format, in that it now handles a number of key "game audio" specific features, such as allowing the user to specify sample-accurate seeking and looping, rather than restricting such points to "packet boundaries".

It can also handle a wider range of sample rates and bit rates – giving the user more control over compression rate versus sound quality.
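To make the packet-boundary point concrete, here is a minimal sketch of the arithmetic behind sample-accurate seeking in a packet-based codec: find the packet that contains the target sample, then skip the leading samples of that decoded packet. The packet size of 1024 samples and the 48kHz rate are assumptions for illustration only, as are the function names; nothing here reflects the actual PS Vita codec.

```python
from typing import Tuple


def locate_sample(sample_index: int, samples_per_packet: int) -> Tuple[int, int]:
    """Return (packet index, samples to skip inside that decoded packet)."""
    if sample_index < 0 or samples_per_packet <= 0:
        raise ValueError("sample_index must be >= 0 and packet size > 0")
    return divmod(sample_index, samples_per_packet)


def boundary_error_ms(sample_index: int, samples_per_packet: int,
                      sample_rate: int) -> float:
    """Timing error if the point were snapped back to the packet boundary."""
    _, offset = locate_sample(sample_index, samples_per_packet)
    return 1000.0 * offset / sample_rate


# Hypothetical loop point at sample 2528, 1024-sample packets, 48kHz audio.
packet, skip = locate_sample(2528, 1024)
error = boundary_error_ms(2528, 1024, 48_000)
```

With these assumed numbers the loop point sits 480 samples into the third packet, so snapping it to the packet boundary would move it by 10 milliseconds, enough to be audible on a tightly looped sound.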

From speaking to many developers, we concluded that a budget of somewhere around 20MB for current gen audio assets is quite normal (RAM being reserved for in-game sound effects, streaming audio buffers and such).

We needed to ensure that, where possible, developers could dedicate the same RAM budget on PS Vita and achieve similar results.
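A quick back-of-envelope sketch shows why compression ratio matters for a budget of this size. The 20MB figure comes from the interview; the 128kbps bit rate and the 48kHz 16-bit stereo PCM format are assumptions chosen only as familiar reference points, and the function names are hypothetical.

```python
def pcm_bit_rate(sample_rate: int, bits_per_sample: int, channels: int) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate * bits_per_sample * channels


def seconds_of_audio(budget_mb: float, bit_rate: int) -> float:
    """Seconds of audio that fit in `budget_mb` megabytes (1MB = 1024 * 1024 bytes)."""
    budget_bits = budget_mb * 1024 * 1024 * 8
    return budget_bits / bit_rate


# Assumed formats: uncompressed 48kHz 16-bit stereo versus a 128kbps stream.
uncompressed = seconds_of_audio(20, pcm_bit_rate(48_000, 16, 2))
compressed = seconds_of_audio(20, 128_000)
```

Under these assumptions the same 20MB holds under two minutes of uncompressed audio but roughly 22 minutes at 128kbps, a twelvefold difference.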

What tools will audio developers use to work with it?
Currently there’s a new version of the Sulpha audio debugger/analyser tool available for PS Vita.

This was developed by Richard Griffiths and Dan Radford, who work in the WWS Advanced Technology Group, and are based in our Soho offices.

We worked closely with them on the design and requirements, based on our knowledge of how developers used the previous PS2 and PS3 versions.

With developers using Sulpha, we found that we could reduce support times from maybe days or weeks to literally minutes or hours.

It allows developers to capture all audio activity from within NGS at run-time, and for this information to be displayed and analysed to help detect any audio issues.

Trying to figure out a support request of "it sounds distorted" can take a lot of exchanges to find out what the exact problem could be – especially if the problem only happens once every 48 hours (I’m sure you’ve all been there!). So Sulpha was an important part of the jigsaw puzzle to have in place at an early stage.

Will it deliver PS3 quality audio? In terms of fidelity and scope?
It really depends on which PS3 games you want to compare the audio capabilities of PS Vita and NGS with. But the goal was to allow our own WWS PS3 titles to work on PS Vita with minimum changes.

In terms of power, games process hundreds of audio channels, as well as processing high quality reverbs, etc.

We consider that a "game voice" would have resampling, volume changing, filter and some kind of codec decoding all being active.

So when I say that NGS can process hundreds of audio channels, I mean real "game voice" audio channels.
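The per-voice work described above can be sketched as a small processing chain. This is a generic illustration rather than how NGS implements a voice: the decode step is assumed to have already produced PCM samples, the resampler uses simple linear interpolation, the filter is a basic one-pole low-pass, and all function names are hypothetical.

```python
from typing import List


def resample_linear(samples: List[float], ratio: float) -> List[float]:
    """Resample by linear interpolation; ratio = source rate / output rate."""
    out = []
    pos = 0.0
    while pos <= len(samples) - 1:
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += ratio
    return out


def low_pass(samples: List[float], alpha: float) -> List[float]:
    """One-pole low-pass filter; alpha = 1.0 passes the signal unchanged."""
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def render_voice(pcm: List[float], ratio: float, alpha: float,
                 volume: float) -> List[float]:
    """Run one voice: (already decoded) PCM -> resample -> filter -> volume."""
    resampled = resample_linear(pcm, ratio)
    filtered = low_pass(resampled, alpha)
    return [sample * volume for sample in filtered]


# One voice at original pitch, filter fully open, at half volume.
voice = render_voice([0.0, 1.0, 2.0], ratio=1.0, alpha=1.0, volume=0.5)
```

Every active voice pays for each of these stages on every block of audio, which is why counting "real" voices is a more meaningful measure than counting raw mixer channels.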

The main audio difference is obviously that it’s a portable unit with stereo output, rather than full-fat 7.1. But in terms of fidelity, I’d say it’s up with that of PS3, and in terms of scope, I think that the synth design is actually more flexible, allowing for wider opportunities for creativity than that of PS3 in the longer term.

Of course, PS3 is a home console. It plugs into the mains and has a fan to keep it from overheating! PS Vita is a portable, battery powered unit and as such, you have to be sensible when making comparisons between the two.

I see it like this: if you used the whole PS3 to do something amazing with audio (whatever that would be), then yes, PS3 delivers more.

But if you consider what resources are normally available for game audio on a home console (memory, CPU use, DSPs), then without a doubt, PS Vita can deliver to that level.

Is it fair to say it’s redefining handheld audio? If so, why and what do you think the long term knock-on effect will be on other devices?
With regards to the long term knock-on effect, portable audio can no longer be seen as the "simpler option" or the "poorer relation" compared to that of the home consoles.

In future, the budgets for portable game audio may indeed rise to meet those of home consoles, although re-using assets across the two will also be a viable option, and as such could make portable console development easier overall.

Does PS Vita redefine portable audio? I think so, yes. If I look at current portable devices, their audio capabilities are around 15 years behind those of home consoles.

So, comparing the audio capabilities of PS Vita to those of other portables on the market (be they mobile phones or other game-oriented devices), PS Vita audio is a staggering achievement.

It’s easy to forget this during development, when your target is to try to ensure that it delivers the power that developers expect on PS3.

But I honestly believe that PS Vita will deliver an audio experience that has never been heard before on a portable device.

At this point, I’d really like to say a big "thank you" to Olly Hume and Paul Scargill, who were the main driving force behind the programming of the synth.

They had both previously worked on PS3 MultiStream and Sulpha, so had a good understanding of the issues that might arise if the design and implementation didn’t meet developer requirements. I feel this really helped us to deliver a great audio library.

Overall, such developments should allow for better game audio, leading to a better experience. Whatever "better" means is up to the developer, but PS Vita certainly gives them the power to do something rather special.
