10: Fusing The Frameworks For Practical Implementation
Interactive audio, and kinetic sound in particular, as argued by Collins, consists of those sounds that set up a loop of interactive feedback through physical propagation, whether caused by the player’s character, vehicle or icon, or by an object in the game world. For instance, the sound of a tank approaching to the right of the player, but outside his or her peripheral vision, prompts the player to turn towards the source of the sound and perhaps fire a weapon; the tank in turn reacts by opening fire, and the player, in reaction to the sound of the firing cannon, dives for cover, and so on. This responsive acousmatic device is an important tool, particularly in open world gaming, and arguably opens up a chasm between the possible outcomes of game sound mixing and the static linear model experienced in film. In games such as many of those featured in this report, the surround sound continuously informs the player of the changing combat environment and as such would form the ‘blue – green’ part of the mix: though these elements might not consist of actual dialogue, they are linguistic effects, even if all they are saying is “big tank to the right”.
In ‘Fallout 3’ we see the same principles applied. We often hear creatures and people before we see them and can often identify the enemy long before we encounter them visually. Take a scene from ‘Fallout 3’ as an example. The playable character and companions have entered an abandoned part of a hotel. On entering the location we can hear the whispering ghoul sounds that permeate ghoul-infested areas, indicating that the location is occupied by ghouls and that we are in danger. This minimal whispering soundtrack is partly just an ‘orange’ atmosphere embedded in the location; however, it also performs a linguistic, informative, kinetic function by causing a reaction in the player. Of course, once a ghoul attacks, the musical effects can sit perfectly well next to the linguistic effects of the ghouls, as the ambient track’s pre-emptive warning is now redundant. Therefore, at the moment this very short fight commences, the audio mix is outputting sound across the spectrum: dialogue from NPCs and the ghoul, linguistic effects such as the ghouls’ footsteps and weapons being armed, as well as the atmosphere track and a minimal piano line occupying the ‘embodied’ aspects. [clip 9]
Did the ‘Fallout 3’ programmers have Murch in mind when they programmed the audio? It is hard to say; however, this demonstrates how the audio engine can output more than two and a half sounds whilst still maintaining intelligibility.
Also, when accompanied by friendly characters we can receive verbal information or warnings of attacks beyond our immediate peripheral vision that we might not have detected ourselves, with these characters acting in effect as Collins’s kinetic devices. If we look back to the clip from ‘Far Cry 2’ [clip 31], the yells of the enemies hunting the player down could be classed in this way: although they initially react to the discovery of the player’s presence through the firing of the gun, they are thereafter driven by the artificial intelligence coding of the gameplay and are not controlled by the player, though they do continue to react to the player’s actions in a kinetic manner.
Here the diegetic mix is a combination of ‘blue – green’ sounds of people moving and vehicles and ‘violet’ dialogue provided by NPCs. This means that, mix-wise, if keeping to the Murch formula, you can have four main sounds and two quiet ones at the same time as the music.
In this example from ‘Mass Effect’ [clip 36] we experience a variety of different acoustics appropriate to the various environments our characters wander through.
These reflective properties reinforce the graphical representations effectively, in a manner that we hopefully find believable. During the loading sequence in the lift we are treated to the sound of an in-lift news broadcast. The broadcast includes news on past, present and future quests and is dynamic, dependent on our progress or decisions in the game, offering structural clues as to our next decision. At the end of the sequence the music subtly changes, fading in over the previously dominant diegetic environmental sounds and taking on a sinister tone. This may be a randomised, non-reactive sound cue, there to remind the player of the underlying themes of impending apocalypse rather than a reaction to the playable character crossing the bridge, or it may be mixed this way to pre-empt plot developments on the far side of the bridge.
Interactive diegetic sounds occur in the character’s space, and the player’s character can interact with them directly. The player instigates the audio cue, but does not necessarily affect the sound of the event once the cue is triggered. This interactive diegesis encompasses a large proportion of game sound: whether they are the sounds of vehicles as we drive them, the lift that we trigger to take us to a higher level or even our footsteps as we run, these ‘blue – green’ kinetic sounds inform us of our interactions with the game environment. “Interactive non-diegetic sounds” are easily illustrated by playing Bethesda’s ‘Fallout 3’, or Ubisoft’s ‘Far Cry 2’ and original ‘Assassin’s Creed’, in the latter two of which music only activates during a fight or when danger is present. [clip 37]
The ability to implement concepts can be greatly affected by the implementation tools used, as evidenced by Rob Blake at BioWare talking about working on ‘Mass Effect 2’: “As far as implementation, it was a whole new ball game using Wwise. For example in ME2, I was responsible for all the vehicles in game and in cutscenes and all that and we were able to do stuff like making sound more intimidating if you were more paragon. The idea being if you were more renegade then you wouldn’t be as scared or the sound wouldn’t be as scary to you. Things like that took only a few minutes to set up using Wwise where before it just wasn’t possible at all.” This quote illustrates how tools can change or enable conceptual approaches. In this case it is of interest how programmers have adapted sounds that would normally remain static so that they change as the player’s character changes behaviour in the context of a role playing game, where the player has some degree of control over their in-game personality and relationship to the surrounding world. Here the characterization of in-game elements adapts dynamically to the character and actions of the player. For reference, the previous system Rob Blake was using for the original ‘Mass Effect’ was Creative Labs’ ISACT audio engine.
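As a rough illustration of the kind of parameter-driven behaviour Blake describes, the sketch below maps a player-state value onto a property of a sound. It is not Wwise’s API and does not claim to reproduce BioWare’s setup: the function name, the morality scale from -1.0 (renegade) to +1.0 (paragon), the choice of gain as the property being driven and the 6 dB ceiling are all hypothetical.

```python
def intimidation_gain_db(morality: float, max_boost_db: float = 6.0) -> float:
    """Map a morality score to a gain offset in dB (all values hypothetical).

    -1.0 is fully renegade (no boost); +1.0 is fully paragon (full boost).
    Values outside that range are clamped before the linear mapping.
    """
    clamped = max(-1.0, min(1.0, morality))
    return (clamped + 1.0) / 2.0 * max_boost_db


# A neutral character hears the vehicle with half of the maximum boost.
print(intimidation_gain_db(0.0))  # 3.0
```

In practice the driven property could just as easily be a filter setting or the choice between alternative assets; gain is used here only because it is the simplest to show.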
10.2 Drawing Together The Strands
By drawing these ideas together we can see correlations that point towards a framework complementary to the creative process of mixing interactive audio, one that takes account of the way the brain processes audio, at least according to Walter Murch. Put into practice, this could act as a framework for organizing audio and for setting up sub mixes and asset prioritization algorithms that correlate with middleware workflows. As Firelight Technologies have yet to release their mixing console for FMOD, at the moment this may only be possible using Wwise, of the two engines covered here.
Taking the IEZA framework as a starting point, it is fairly easy to map the colour-coded Murch spectrum onto its quarters. The ‘Zone’ quarter, which combines the diegesis and setting areas of the sound world, incorporates the ‘yellow’ and ‘orange’ areas of the ‘Encoded – Embodied’ spectrum. If we are using this idea to set up an interactive mix, this would stipulate that the ‘Zone’ quarter can contribute four main sounds to the mix: two ‘yellow’ and two ‘orange’. Likewise the ‘Effect’ quarter, which combines the diegesis with player activity, incorporates both ‘blue – green’ and ‘violet’ sounds which are, as discussed, the sounds from which we derive linguistic meaning: the language of the game. Again, in mix terms this means that four main sounds can be contributed to the mix from the ‘Effect’ quarter. The ‘Affect’ quarter, which deals with the non-diegetic sound that contributes to the contextual atmosphere and emotional setting of the game, is largely reserved for the musical soundtrack. The non-diegetic activity quarter, ‘Interface’, is different: because it deals with the ontological interface sounds present in gaming, this area of the IEZA framework can contribute to the mix through many of the coloured zones, including ambiences. In ‘Fallout 3’ the control panels have their own ambiences, for instance. Interface audio can be supplied through dialogue, through feedback from stylized in-game sounds like the crackles heard when moving through menus (again illustrated well by the ‘PipBoy’ in ‘Fallout 3’), or by musical segments that accompany the completion of a task. Therefore, in mixing terms, if we set up a prioritization system for each colour that allowed only two assets to be active at any one time, interface assets would need to be set up with maximum priority so that they could be inserted into the appropriate sub mix when called.
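The two-assets-per-colour rule described above could be sketched as follows. This is a minimal illustration under stated assumptions rather than a middleware implementation: the class and field names, the numeric priority scale (larger means more important), the eviction of the weakest active asset and the value used for “maximum” interface priority are all hypothetical, as are the asset names.

```python
from dataclasses import dataclass, field

MAX_PER_COLOUR = 2         # two assets per colour, as proposed in the text
INTERFACE_PRIORITY = 1000  # hypothetical "maximum" priority for interface assets


@dataclass
class Asset:
    name: str
    colour: str            # "violet", "blue-green", "yellow", "orange" or "red"
    priority: int          # larger means more important (assumed convention)
    interface: bool = False

    def effective_priority(self) -> int:
        # Interface assets always compete at the maximum priority.
        return INTERFACE_PRIORITY if self.interface else self.priority


@dataclass
class ColourMixer:
    active: dict = field(default_factory=dict)  # colour -> list of active assets

    def request(self, asset: Asset) -> bool:
        """Try to start an asset; return True if it is admitted to the mix."""
        slots = self.active.setdefault(asset.colour, [])
        if len(slots) < MAX_PER_COLOUR:
            slots.append(asset)
            return True
        weakest = min(slots, key=lambda a: a.effective_priority())
        if asset.effective_priority() > weakest.effective_priority():
            slots.remove(weakest)  # evict the least important active asset
            slots.append(asset)
            return True
        return False               # colour is full of equally or more important assets

    def playing(self, colour: str) -> list:
        """Names of the assets currently active in a colour, sorted for display."""
        return sorted(a.name for a in self.active.get(colour, []))


mixer = ColourMixer()
mixer.request(Asset("ghoul_whispers", "orange", priority=2))
mixer.request(Asset("hotel_room_tone", "orange", priority=1))
mixer.request(Asset("pipboy_hum", "orange", priority=0, interface=True))
print(mixer.playing("orange"))  # ['ghoul_whispers', 'pipboy_hum']
```

Evicting the weakest asset is only one possible policy; a real mix might instead duck or fade the displaced asset rather than stop it outright.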
Also, as we can see from the extrapolated diagram, this new framework divides into two vertical columns that correlate with Collins’s Interactive and Adaptive concept. [fig 21] In the left column we have diegetic and non-diegetic ambiences, music and emotional effects. In the right column we have sounds that are generally triggered by the player through the user interface, such as doors opening, triggers pulled on guns, footsteps, engines on vehicles etc. The dialogue contained in the game is also placed in this column. In fact, dialogue can be either ‘interactive’, in the case of conversations in an RPG, or ‘adaptive’, if defined by a set narrative consequence. Where the player triggers the speech it would be interactive, though the response would of course be adaptive. As our main goal here is a framework for sound mixing, this should not really affect the practicality of application.
If we look at this new framework we can articulate an algorithm for integrating mix solutions that obey Murch’s ‘Encoded – Embodied’ spectrum whilst also incorporating the functions of game audio as framed by both Collins and IEZA. With that in mind it should be possible to use this framework to help construct effective interactive mix solutions, perhaps by dividing bus hierarchies and sub mixes according to colour-coded divisions of audio assets. This could be particularly effective if these buses were governed by algorithms that determined, through prioritization arguments, which two assets from a particular colour could be played at a time. Applying these ideas might even inform the design of future implementation tools.
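One way such a colour-coded bus hierarchy might be laid out is sketched below: a master bus feeds one bus per Murch colour, and each colour bus feeds a sub mix for every IEZA quarter that contributes to it. The structure and names are assumptions made for illustration and are not taken from any middleware. In particular, assigning the ‘Affect’ quarter to ‘red’ is this sketch’s reading of the statement that the quarter is largely reserved for music, and allowing ‘Interface’ onto every colour simplifies “many of the coloured zones”.

```python
from dataclasses import dataclass
from typing import Optional

# IEZA quarter -> Murch colours it feeds (an interpretation of the mapping above).
QUARTER_COLOURS = {
    "zone": ("yellow", "orange"),
    "effect": ("blue-green", "violet"),
    "affect": ("red",),
    "interface": ("violet", "blue-green", "yellow", "orange", "red"),
}


@dataclass
class Bus:
    name: str
    gain_db: float = 0.0
    parent: Optional["Bus"] = None

    def path(self) -> str:
        """Full route from the master bus down to this bus."""
        return self.name if self.parent is None else f"{self.parent.path()}/{self.name}"

    def effective_gain_db(self) -> float:
        """Gains in dB add up along the chain from this bus to the master."""
        if self.parent is None:
            return self.gain_db
        return self.gain_db + self.parent.effective_gain_db()


def build_hierarchy():
    """Create master -> colour -> quarter buses from QUARTER_COLOURS."""
    master = Bus("master")
    colour_buses = {}
    leaf_buses = {}
    for quarter, colours in QUARTER_COLOURS.items():
        for colour in colours:
            if colour not in colour_buses:
                colour_buses[colour] = Bus(colour, parent=master)
            leaf_buses[(quarter, colour)] = Bus(quarter, parent=colour_buses[colour])
    return master, colour_buses, leaf_buses


master, colour_buses, leaf_buses = build_hierarchy()
colour_buses["orange"].gain_db = -6.0   # pull all 'orange' material down together
bus = leaf_buses[("zone", "orange")]
print(bus.path())                       # master/orange/zone
print(bus.effective_gain_db())          # -6.0
```

Placing colour above quarter in the tree means a single fader governs everything in one band of the spectrum, which is the level at which a two-asset limit would apply; the opposite nesting would be equally valid if mixing by IEZA quarter were the priority.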