2: The Language Of Sound In Interactive Media


The relationship between moving image and sound content is complex and multi-faceted. Their parallel paths segue and pull away from each other in a dance of communication, constructing the dimensions of narrative and immersive experience that imbue the wide variety of films, interactive games and audio/visual media we consume. The tools of discourse exhibited in the audio content fulfil a complex web of functions and creative embellishments that can require thousands of creative decisions, each of which manipulates our perception of the visual.

2.1 Dividing The Sound World

The sound world of interactive computer games can be divided into three areas of overlapping functionality. The first two of these areas, environmental immersive sound and narrative sound, are already well established in film. Essentially, film sound revolves around two things: the reality of the world in which the story is taking place, which of course is not a real world but a composed one, and the sound of the story itself, which can be very separate metaphorically from the perceived reality. When addressed, these areas deliver both the narrative direction, or storytelling, and the immersive environments required of the cinematic experience. The sonic world of the computer game then requires a third area that addresses the structural and interactive issues of the format. To deliver a near-cinematic experience, it could be argued, the sound world must convey this set of structural and framework obligations effectively. If we break these objectives down according to the roles of game audio, we can then critique content according to how well the audio mixing and processing techniques, and the tools available, achieve these goals.

2.2 Environmental Immersion and Narrative Propulsion
Common to film sound, drawing the player into a believable, tangible world is often of great importance. For the sound designer the question might be to what degree the player needs to feel that they are interacting with a real world. Angling a sound world towards an immersive environmental experience suggests that importance is placed on creating a feel that is at least emotionally realistic. How well this balances against narrative requirements depends on the level of fantasy or ‘poetry’ that you might wish to convey. An example of this might be a war game such as the ‘Operation Flashpoint’ combat series from Codemasters. This genre of game lends itself well to an effectively immersive environment, the rationale illustrated in the tagline printed on the packaging, “As close to war as you’ll ever want to get”. [10] In this particular series of games the concept is largely founded on the idea of the experience being as realistic as possible; hence there is no non-diegetic music, only sounds that the player would hear if they were physically in that environment and situation. This approach is perhaps not required for a platform game such as ‘Sonic the Hedgehog’, where realism is far from called for. However, a fantasy or horror game such as ‘Dragon Age: Origins’ or ‘Dead Space’ might require more of a mixture to achieve the required blend of environmental immersion and narrative experience. In the latter, for example, there are sections in which the character enters zero-gravity zones where there is no air to conduct sound. As a result the only sounds you hear from the character’s world are those conducted through the space suit, such as footsteps and combat-related contacts. [11]
Looking at the medium of film we can divide the sound world into two areas that, it could be said, address the conflict between immersion and the requirements of narrative reinforcement. The terms diegetic and non-diegetic sound differentiate between the diegesis, or fictional world in which the narrative is taking place, and that which originates outside of it. [8] Diegetic sound, being the sound of the world in which the story is taking place, refers to the sound waves that exist or would exist in the environment as experienced by the characters on screen. This includes sounds that occur outside of the visual field, such as passing traffic during an interior scene or music playing through an apartment wall. Non-diegetic sounds, conversely, are those emanating from outside of the world occupied by these characters and are detached from the narrative inherent in the scene or location. Usual examples of non-diegetic sound would be music that is not produced by an environmental source, narrative voice-overs, and some subjective ‘emotional’ sounds. [12] There is not, for example, an orchestra playing just out of shot as Darth Vader stalks the bridge of a Star Destroyer. [clip 1]

The music is there for our ears and not for those of the crew. Conversely, music played from a radio in ‘Grand Theft Auto IV’ or ‘Fallout 3’ [clip 2] is a valid part of the fictional world and as such is diegetic.

Why is this distinction important to interactive mixing? Knowing whether a sound is meant to be part of the diegesis alters how we need to treat it in implementation. If, for example, we are listening to two characters engaged in a conversation in a church, the voices are diegetic, so we need to make sure they sound as though they are in that environment even if they were not recorded there. If, on the other hand, the scene is unchanged but in place of the conversation we hear a narrative voice-over from a character not physically present, then the voice is non-diegetic and need not be subject to the reflections and properties of the depicted environment. Of course, diegetic sound alone can fully address the needs of narrative exposition, as exemplified by films such as ‘No Country For Old Men’ and ‘Morvern Callar’ [clip 3]. The latter’s extensive musical soundtrack is supplied entirely by diegetic sources, predominantly the main character’s personal stereo.
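The church example can be sketched in code. The following is a minimal illustration only, not taken from any particular game engine or middleware: the `SoundEvent` type, the bus names and the send level are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass


@dataclass
class SoundEvent:
    name: str
    diegetic: bool  # True if the sound belongs to the fictional world


def route(event: SoundEvent, room_reverb_send: float) -> dict:
    """Return hypothetical mix settings for one sound event."""
    if event.diegetic:
        # World sounds pick up the reflections of the depicted space.
        return {"bus": "world", "reverb_send": room_reverb_send}
    # Narration and score sit outside the space, so they stay dry.
    return {"bus": "overlay", "reverb_send": 0.0}


church_send = 0.6  # assumed send level for a reverberant church interior
dialogue = route(SoundEvent("conversation", diegetic=True), church_send)
voiceover = route(SoundEvent("narrator", diegetic=False), church_send)
```

Here the conversation is sent to the church reverb while the voice-over bypasses it, even though both are heard over the same picture.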

In the design of game sound it is, of course, important that the player can discern what is meant to be part of the world in which they are interacting and what is external to it, in order to navigate the game environment effectively.

2.3 Player Engagement With Diegetic Sources Of Music
The minimalist non-diegetic soundtrack of ‘Fallout 3’ is augmented by the ability to tune into the radio stations that litter the wasteland, some of which only function after the player has completed certain side quests. The player thereby actively contributes to the audio content of the game by activating these stations, which in some cases involves picking sides between the different factions. Diegetic music is kept integral to the themes of the ‘Fallout’ universe, which revolve around a parallel history to our own in which a nuclear war occurs in the late twenty-first century, though in this parallel history culture has not moved on from the nineteen forties, and the result is a future reminiscent of the imagined futures often depicted in the science fiction comics of that time. This is reflected in the music available on the radio. For example, one station plays 1940s American big band and popular music by artists such as Ella Fitzgerald and Billie Holiday. Mark Lampert, the lead sound designer, was very conscious of establishing a sense of place with these diegetic musical options: “All the songs in the game were chosen very early on in the project…working with a music licensing company that was able to take our descriptions of what we were after and then come back with a batch of music. The sound of the music, of course, was a big factor in which songs were chosen, but perhaps even more important was the lyrical content of those songs. There’s a heavy dose of black humor running through all of them.” [13] This attention to detail is perhaps intended to encourage gamers to stick with the default audio settings rather than substitute their own music, an option commonly available on Xbox games. By becoming an integral part of the game-playing experience rather than just filling in sensory holes, diegetic music can become an important immersive tool.
Rockstar’s recent 1940s-set detective game ‘L.A. Noire’ followed a similar approach for the diegetic music played on radios, particularly in vehicles, using as it happens many of the same artists as ‘Fallout 3’. In this case, however, instead of referencing the future vision projected from a particular time period, the music reinforces the actual place and time. The songs featured on the car radio are not random or narratively neutral but chosen so that their lyrical content reinforces the themes of crime, violence and broken relationships. [14]
‘Deus Ex’ follows a similar trajectory, blending non-diegetic soundtracks with street musicians and other diegetic sources, as well as triggering extra-diegetic sounds such as breaking glass at various points in city locations. These are randomized so that the player doesn’t regularly re-trigger the same audio asset. Like ‘Fallout 3’, the non-diegetic music eschews strong melodies in favor of more abstract ambiences built around “an emotion or environment. If there is any melody, the melody comes last after the foundation is completed”, according to Deus Ex audio director Steve Szczepkowski. [15]
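One common way to randomize triggered one-shots is to exclude the most recently played asset from the next draw. The sketch below assumes that approach; it is not a description of how ‘Deus Ex’ itself is implemented, and the class and asset names are hypothetical.

```python
import random


class NoRepeatPicker:
    """Pick a random asset, never the same one twice in a row."""

    def __init__(self, assets, rng=None):
        if len(assets) < 2:
            raise ValueError("need at least two assets to avoid repeats")
        self.assets = list(assets)
        self.rng = rng or random.Random()
        self.last = None

    def pick(self):
        # Exclude whatever played last, then choose among the rest.
        candidates = [a for a in self.assets if a != self.last]
        self.last = self.rng.choice(candidates)
        return self.last


# Hypothetical asset names for the breaking-glass one-shots.
picker = NoRepeatPicker(["glass_01", "glass_02", "glass_03"], random.Random(7))
sequence = [picker.pick() for _ in range(20)]
```

With only a handful of variations this is enough to stop the ear from latching on to an obvious repeat; a larger pool could instead be shuffled and played through in full.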

2.4 Dynamic Shifts Between Diegetic and Non-Diegetic Audio
In both cinema and interactive games, dynamic shifts between diegetic-dominated and non-diegetic-dominated sound worlds can be an effective tool. We are generally used to a blend of the two realities, so when confronted by just one in isolation the effect can be profound, as illustrated by the opening of the Steven Spielberg movie ‘Saving Private Ryan’. Before the release of this film in 1998, most war films and their action sequences had been dominated by orchestral soundtracks and roughly followed the Hollywood-devised two-point-five mixing rule, a topic to which we will return later.

When experienced in 5.1 surround sound, the effect of bullets and shrapnel seemingly flying around your head, combined with the absence of a typical score, highlights a brutality that the more poetic though similarly themed early battle scene from Jean-Jacques Annaud’s ‘Enemy at the Gates’ presents in a more romanticized light by using an appropriately emotional non-diegetic musical score.

This scene from ‘Saving Private Ryan’ also illustrates the use of subjective ambience. [12] Subjective ambience is a technique whereby the natural ambience of a scene is manipulated or replaced with sounds that better illustrate the narrative thrust, or the emotional or sensory experience of a particular character. In the above scene Tom Hanks’s character is temporarily deafened by a nearby explosion. At this point we shift from an extremely immersive ambient sound world to the personal sonic experience of an individual soldier, presenting us with a subjective audio experience from his perspective. This effect is emulated in the ‘Medal of Honor’ games franchise. Here, by dropping a grenade too close to themselves, the player suffers a similar effect to that suffered by the soldier on the beach. Expressing physical damage to the player through sonic repercussions is an effective immersive tool that can eliminate the need for that data to be shown on the heads-up display (HUD).

In this respect we can learn a lot about immersive, narrative dynamics by studying cinematic examples. As we can see from the ‘Saving Private Ryan’ beach scene, an immersive, diegetic-only sound world aims at an emotional impact removed from the normal cinema experience of most war movies up to that point. Dynamics of sound volume are also showcased here. Many action games opt for an onslaught of noise at all times, following much the same approach as current popular music conventions of using consistently heavy compression to give a greater impression of loudness. [16] Though this approach may impress in the short term, over a sustained period the player will adjust to that level of sound and get no further impact from it. “A wide dynamic range is often talked about in audio terms as very desirable, meaning the amount of difference between the quietest and the loudest sounds. Something with no dynamic range cannot be experienced for very long before the viewer, or listener, becomes fatigued and reaches for the off-switch. A game without a range of varied game-play moments and experiences for the player will more often than not result in a game soundtrack that has little or no dynamic range. Batman: Arkham Asylum and Dead Space are wonderful examples of games that have been carefully designed to have specific moments of calm, moments of silence, a mixture of stealth and combat as well as moments of intense action and resulting high volume sound, music and dialogue.” [17]
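The idea of dynamic range as the difference between the quietest and loudest passages can be made concrete with a rough measurement. The sketch below is one simple way of doing so, assuming windowed RMS levels in decibels; the function names, the window length and the test signals are illustrative choices rather than an industry-standard loudness measure.

```python
import math


def rms_db(samples):
    """RMS level of a block of samples in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))


def dynamic_range_db(samples, window):
    """Difference between the loudest and quietest window, in dB."""
    levels = [rms_db(samples[i:i + window])
              for i in range(0, len(samples) - window + 1, window)]
    return max(levels) - min(levels)


# Two synthetic "soundtracks": one with a calm passage, one loud throughout.
quiet = [0.01 * math.sin(0.1 * n) for n in range(1000)]
loud = [0.8 * math.sin(0.1 * n) for n in range(1000)]
varied = quiet + loud
constant = loud + loud

varied_range = dynamic_range_db(varied, 1000)
constant_range = dynamic_range_db(constant, 1000)
```

The soundtrack with a calm passage measures a range of roughly 38 dB, while the constantly loud one measures none at all, which is the situation the quotation warns against.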
The sound direction in the opening battle of ‘Saving Private Ryan’ is an impressive example of how volume dynamics can renew impact while maintaining an immersive experience. This is achieved by, amongst other narrative techniques, dropping the camera below the surface of the water and by subjectively dropping inside the aural experience of a lead character as he loses his hearing. In both cases we are treated to a respite from the traumatic sound world of the D-Day landings, though not from its visual horrors, just long enough that when these sequences end the onslaught has a renewed impact. The shift to the soldier’s point of view achieves several functional, narrative objectives. Firstly, it establishes Tom Hanks’s character as a key protagonist; we have not dropped into any other soldier’s personal experience in this way. Secondly, it vividly demonstrates the damage caused by large explosions to human hearing by audibly demonstrating the effect on the frequency-sensitive hair cells in the cochlea. As the hair cells sensitive to high frequencies are the most easily damaged, we are treated to the sounds of battle as though through a low-pass filter, with an accompanying whistle as the damaged cells propagate a constant signal, not unlike a broken audio cable. [18] Thirdly, though the obvious intent was to make the sound world as believable as possible, there was evidently still a need for dynamic release from the sheer volume of modern warfare. If high levels are constantly maintained the audience will, over time, adjust and get no further impact from additional explosions, machine-gun fire and so on, arguably reducing the emotional experience.
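The deafening effect, battle sounds heard as though through a low-pass filter with an accompanying whistle, can be approximated very simply. The following is a rough sketch, assuming a one-pole low-pass filter and a fixed sine tone for the whistle; the cutoff, ring frequency and ring level are guesses chosen for illustration, not values taken from the film or from any game.

```python
import math


def deafen(samples, sample_rate, cutoff_hz=400.0, ring_hz=4000.0, ring_level=0.05):
    """Muffle a signal with a one-pole low-pass and add a constant whistle."""
    # Smoothing coefficient of the one-pole low-pass for the given cutoff.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out = []
    state = 0.0
    for n, x in enumerate(samples):
        state += alpha * (x - state)  # low-pass: follow the input slowly
        ring = ring_level * math.sin(2.0 * math.pi * ring_hz * n / sample_rate)
        out.append(state + ring)      # muffled world plus the ringing tone
    return out


sr = 48000
low_tone = [math.sin(2.0 * math.pi * 100.0 * n / sr) for n in range(sr // 2)]
high_tone = [math.sin(2.0 * math.pi * 8000.0 * n / sr) for n in range(sr // 2)]
low_out = deafen(low_tone, sr, ring_level=0.0)
high_out = deafen(high_tone, sr, ring_level=0.0)
```

With the whistle switched off, the 100 Hz tone passes almost untouched while the 8 kHz tone is heavily attenuated. In a game the cutoff and ring level would typically be eased back to normal over a few seconds as the character recovers.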
Developing a dynamic between diegetic immersion and narrative exposition offers the sound designer an opportunity to develop a wider range of contrasting experiences. For example, in the opening scene from John Hillcoat’s ‘The Proposition’ we have a scene that exemplifies how contrasting sound design approaches can be used to achieve both a vivid environmental experience and a strong narrative projection. Note how the gunfight and the resulting verbal confrontation occupy just the cramped confines of the shack: bullets pinging on metal, the screams of the injured or dying, the overwhelming sound of flies. As Ray Winstone’s ‘Captain Stanley’ gazes out of the window and addresses the ‘proposition’ of the title, the diegetic world slips away to be replaced by Warren Ellis’s looped violin textures and a time-slipping montage of plot exposition. Note how Ellis’s droned loops mimic the characteristics of the flies, or how in later scenes unusual plucks and scrapes of the violin are used to mimic the machinery common to the late eighteen-hundreds outback. This example of joined-up thinking, addressing the world of the characters through the application of timbre and textural awareness, allows for a fluid transition between the often brutal, hyper-realistic natural sound worlds and the romantic storytelling prerogative. There is also a good example of a kind of subjective ambience morphing out of diegetic sounds during this sequence, when the main protagonist walks through the long-burned-out farmstead and pauses before the closed door. Here the sound of the gravel underneath his boots takes on the sound of wooden timbers burning. Arguably this is a subjective emotional extension of the character’s imagination, creating a parallel narrative in the sound world that contrasts with the burnt-out shell we see. Note how this sound builds, with the help of an unconventional percussive violin loop that evokes the breaking of wooden beams, before reaching resolution with the opening of the door onto a young child’s destroyed room.
This build of dissonant visual and audio narratives and consequent resolution is not unlike the semiotic device employed in Michael Corleone’s first assassination scene in Francis Ford Coppola’s ‘The Godfather’.

In a similar way to Ellis’s work on ‘The Proposition’ or Jonny Greenwood’s score for Paul Thomas Anderson’s ‘There Will Be Blood’, the non-diegetic music in ‘Fallout 3’ blends with the diegesis in an intelligent and join-free manner. As much of the soundtrack is both arrhythmic and largely atonal, it allows diegetic sounds, in for instance the ghoul-infested tunnels, to blend with the non-diegetic to create a seamless texture. This also serves as a structural aid, in that as you learn the textures for different environments you can prepare yourself appropriately in terms of weapons, health (stimpacks) and armor choice for the coming encounter. Hostile ghoul-infested areas, for example, are characterized by hissing, breathing sounds that are not attributable to individual ghouls but exist in a non-diegetic capacity in order to both unnerve and inform the participant. Karen Collins defines these sounds as “adaptive non-diegetic sounds” [19] as they are outside of the character’s diegesis whilst still reacting to the player entering that specific room.

What is also quite interesting in ‘Fallout 3’ is the way that the diegetic ambience morphs as evening and night approach. As this is an open-world game, time progresses at a natural, though accelerated, rate. The overall volume of the world seems to drop as night approaches, and the sounds of certain insects such as crickets start to appear that were not featured during the day. This constitutes a good example of “adaptive diegetic audio” [19] as defined by Collins, as the sound world is adapting to the time of day within the diegesis of the player’s character. Overall the exterior locations explored by the player are empty, desolate spaces and the sound world reflects this. Given the lack of foliage and general absence of absorbent materials in the wasteland, and the presence of large areas of exposed rock, you would expect in reality to hear sound incidents from some distance away. This is in fact true of the sound propagation in the game, and you can often hear the sound of conflicts some significant distance from the source. What is interesting here is that these sounds are not embedded in the ambience but relate to real interactions happening some distance from the player between rival groups or individual non-player characters, and it is the prerogative of the player whether to investigate, intervene or ignore. In comparison to a game such as ‘Operation Flashpoint: Dragon Rising’, where the sounds of distant battle are embedded in the ambience most of the time that the player is not engaged with the enemy directly, this gives the player the impression that the world is not revolving around them, as the diegetic sounds they are hearing are linked to tangible events in the game world.
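Both behaviours described here, an ambience that follows the in-game clock and distant events that remain audible, can be sketched in a few lines. This is only an illustration under stated assumptions: the gain curve, the 0.6 night-time floor and the inverse-distance law are choices made for the example and say nothing about how ‘Fallout 3’ actually computes its mix.

```python
import math


def ambience_mix(hour):
    """Return hypothetical gains (0..1) for the world bed and night crickets."""
    # Daylight factor: 1.0 at noon, 0.0 at midnight, varying smoothly.
    daylight = 0.5 * (1.0 + math.cos(2.0 * math.pi * (hour - 12.0) / 24.0))
    return {
        "world_gain": 0.6 + 0.4 * daylight,  # overall level drops at night
        "crickets_gain": 1.0 - daylight,     # crickets fade in after dusk
    }


def distance_gain(distance_m, reference_m=1.0):
    """Inverse-distance attenuation for a real event elsewhere in the world."""
    return reference_m / max(distance_m, reference_m)


noon = ambience_mix(12.0)
midnight = ambience_mix(0.0)
far_gunfight = distance_gain(100.0)
```

Because the distant gunfight is an actual emitter attenuated by distance rather than a layer baked into the ambience loop, the player can walk towards it and find the event that is producing it.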

Dynamic changes are an immensely important device for maintaining both energy and suspense. In the case of audio they usually relate to differences in perceived volume between different points in time. In the featured scene from ‘Spider-Man 2’ we see the dramatic effect of using a momentary instance of silence to pre-empt a series of huge sound events.

The use of dynamics is demonstrated here as an important tool in giving greater impact to the sounds that follow, as in this scene, where the car crashing through the window is preceded by a momentary silence. Silence is used to similar effect in the tenement fire rescue sequence from ‘Watchmen’, where for several frames the sounds of the burning building and the musical score are dropped to allow dynamic space for the following explosion to have maximum impact. [17] [clip 12]
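In an interactive context the same device could be approximated by briefly ducking the ambience and music just ahead of a scripted impact. The sketch below assumes the game knows the impact time slightly in advance; the function name and the gap and fade lengths are illustrative guesses, not values measured from either film.

```python
def pre_impact_gain(t, impact_time, gap=0.15, fade=0.05):
    """Gain for the ambience and music bed at time t, in seconds."""
    gap_start = impact_time - gap
    if t < gap_start - fade or t >= impact_time:
        return 1.0  # normal level before the gap and from the impact onward
    if t < gap_start:
        return (gap_start - t) / fade  # quick fade down into the gap
    return 0.0  # the silent beat just before the impact


# Sample the envelope around a hypothetical impact at two seconds.
before = pre_impact_gain(1.0, 2.0)
fading = pre_impact_gain(1.825, 2.0)
silent = pre_impact_gain(1.9, 2.0)
at_impact = pre_impact_gain(2.0, 2.0)
```

The bed drops to silence for a fraction of a second and returns exactly as the impact sound fires, leaving dynamic space for it.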

The ‘Spider-Man 2’ sequence also serves to demonstrate the effectiveness of frequency dynamics, which add percussive energy in much the same way as a kick and snare drum do in a drum beat. Each cut in the edit is accompanied by an alternate frequency characteristic, attributed roughly to either the bass-heavy rumble of the car or the shimmer of the flying glass. The flying glass, in a glaring example of purely emotional realism, sounds like a pitched-up yet time-stretched knife being sharpened as it flies past Peter Parker’s head, perhaps a semiotic nod to its deadly attributes.
Of course these dynamic changes are not just audio but visual too, and unfortunately the game designer loses some control over an equivalent narrative by surrendering control to the player. However, many of the same devices can be built in by, for example, controlling the number of enemy NPCs (non-player characters), or by simulating hearing loss when the player’s character is subjected to a nearby explosion or severe injury.

