Designing the Auditor: A Prototype of Locative Radio with Original Sound Content

This text was first published in Journal of Radio & Audio Media. The topic is Media design, the genre Research article and the year of publication 2017.

To cite: Nyre, L., Hoem, J., Tessem, B. & Ringheim, J. (2017) Designing the Auditor: A Prototype of Locative Radio with Original Sound Content, Journal of Radio & Audio Media, 24:1, 90-110, DOI: 10.1080/19376529.2017.1297152

Location-based sound media are similar to traditional radio in that they rely on communication in sound alone. The smartphone has GPS and Bluetooth, and it is easy to attach sound to locations like a rail station, a neighborhood, or a street. The research team designed a media prototype, Auditor, and produced a soundscape called “The Railroad Dialogues.” The medium and content were tested in a field trial with 42 young, urban headphone listeners. The article reports on the informants’ level of immersion in the sound content and interface, and considers their potential to supplement radio in the future.

Location-Based Services and Radio

There is a big market for location-based services in contemporary media cultures, and especially in relation to urban users of smartphones. Google Maps is one of the most successful services, and in 2016 the locative game Pokemon Go enjoyed great interest. Smartphones have been connected to GPS satellites for over a decade, and the user’s exact location on the planet’s surface can be pinpointed. Google Maps, the weather service yr.no, and other tools present information that is characterized by the user’s location right now. Photos, sounds, and other material can flow to and from every single user’s positions, and be tagged with their geographical coordinates. What are called “micro-positioning services” are the most precise type of location-based services, and can function at the level of decimeters and even centimeters. However, this level of precision is not market-ready at the time of writing.

While there is a strong tradition for experimentation in radio and podcasting, there is little focus on location-based services in this branch of the business. But there are interesting examples, like the Guardian’s Streetstories (Panetta, 2012), a free audio app for iPhone and Android about the King’s Cross area of London, with short stories that are triggered as the listener walks around the designated area. There is also a service called Capsule.FM (http://capsule.fm//) for smartphones that presents geo-located music and synthetic speech so that listeners can hear news, weather, and other content based on their current location.

It seems fair to say, however, that the mainstream radio market does not experiment with location-based services, and neither does most of the emerging podcasting industry. Why should they? The traditional temporally driven narratives and soundscapes do not require geographical nearness. The established storytelling techniques require the creative person to organize a temporal progression of events, from a beginning, via the middle and then to the end. There are expectations about dramatic nerve and closure, and these qualities are mainly associated with the temporal dimension.

Another reason that radio does not explore location-based services very intensely is that the public sphere simply does not scale very well to the micro-positioned level of meters and decimeters. We will argue this point by juxtaposing the three concepts “macro,” “meso,” and “micro” positioning, and link them to the radio medium.

Macro Positioning

The public sphere is characterized by its macro size. Macro positioning will always be strongly connected to strategic communication patterns (Hoem, 2009, p. 67), dominated by transmission, where information is produced by a centralized information-service, which also controls how the information is distributed. This clearly fits with legacy radio. Radio is a mass medium based on a one-way address where very few journalists, technicians, and editors produce content for the rest of the population. Gripsrud (2010, p. 7) argues that “broadcasting produced the cultural conditions for a civic culture, i.e. semiotic and emotional conditions for citizens’ active, informed participation in democratic processes.” Jauert and Lowe (2005) consider the “Enlightenment Mission” of traditional broadcasting to be of great public value in the future.

Meso Positioning

There is also a level of size where far fewer people are addressed through the communication process. For example, the smartphone with its apps allows for addresses to groups that are well-defined and specifiable compared to the anonymity of radio’s macro address. Social media like Foursquare or Facebook allow users to address each other as groups based on geolocation. Given co-occurrence they can for example make visual contact at a town square after communication among themselves. Meso positioning facilitates communication wherein the users become more active, through adaptive communication patterns (Hoem, 2009) that allow communication to be negotiated between producers and consumers. Some theorists argue that these roles merge into new roles called “prosumers” (Toffler, 1980).

Micro Positioning

Micro positioning is the most detailed level of communication by location. Ideally positioning should work on the level of decimeters, which requires several transmitters (e.g., Bluetooth LE) to be installed physically in a given space. With such equipment every movement the user makes can influence the playback of sound and, at this level of precision, it is almost impossible for two people to have the same experience. Thus, micro positioning will tend to favor tactical communication patterns (Hoem, 2009) where individual users act on their own, only partly in relation to other actors. There is in all likelihood a highly individual experience of the story, determined by the use of headphones and the close relationship between the user’s movement and the sonic augmentation presented to each individual user.

Building on this foundation, we can specify the research interest in the Auditor medium and “The Railroad Dialogues.” How do you make engaging content for a location-based sound medium? What are the implications for sound design when the sound recordings are physically attached to places? This article first reviews research on design method for sound media, and theories about listening. We then explain how the prototype medium and its content were constructed, as well as how they were tested in a field trial of 42 respondents. The analysis goes into the users’ experience of the content, focusing on immersion and movement, plot and genre recognition, and emotional responses to the voices of people appearing in the storyworld. In the final part we ask about the potential of the Auditor as a medium, and deal with the sonic AR interface, possible genre orientations, and locations where it should be installed. In the conclusion we argue that location-based sound media can supplement traditional radio and podcasting among urban listeners in the future.

Media Design Method

There are many research projects that experiment with new configurations of technology, and create what can be called “designs,” “prototypes,” “installations,” or simply “artifacts.” The basic method consists of designing a prototype that works as a vehicle for creative content and tests with “live” audiences of various types (Nyre, 2014).

For example, HyperNews presented interactive news stories to a test audience (Aam, 2013). Hoem (2009) explores how communication patterns can contribute to the design of communication systems used in education. In the specific field of locative media design, Fagerjord (2015) presents a mobile app for listening to church music in the very churches it was composed for. Liestøl (2009) creates a situated simulation, a 3D locative display of archaeological sites like the Forum Romanum. Løvlie (2011) created Textopia, a service where literary quotes are attached to places in Oslo that are explicitly described in the quote. LocaNews tried out location-sensitive local journalism, and the prototype was tested with informants in a town in western Norway (Øie, 2013). Amplifon is a proof-of-concept prototype that explores location-aware radio news services tailor-made for a light rail public transport system (Nyre, 2015a). Oppegaard (2015) explores locative mobile apps for park services. Tessem, Bjørnestad, Chen, & Nyre (2015) created PediaCloud, an app where locative information is presented as a serendipitous word cloud.

While there are similarities between these artifacts, there is no established method to conduct this type of research. We propose to call them “academic prototypes.” Such a prototype (or invention) is a new technological solution to a predictable or previously unknown challenge, problem, or need. It takes the form of a prototype at a certain level of technological readiness, and with room for further iterations and development (Nyre, Ribeiro, & Tessem, 2017). A prototype does not become academic merely by having been built by people who work in academia. It has to be created with an explorative mindframe and the application of a rigorous methodology such as the principles of design science (Hevner & Chatterjee, 2010). The main objective for an academic prototype is knowledge expansion, not pecuniary profit. However, one should not underestimate the potential of such academic prototypes with regard to media innovation. In contrast to most development projects in media businesses today, academic prototypes are not immediately bound by economic constraints; they can run over a long period of time, and the researchers can conduct careful testing with members of the public. Here lies a real potential for innovation.

Immersion in the Soundscape

Immersion is an important variable in the experience of sound content. Strong immersion in sound is easier when using headphones than when, for example, listening in the kitchen using an FM radio receiver. Noise-cancelling headphones can be considered as a particularly powerful tool for immersion in sound. Along with hearing aids, isolation headphones, and similar devices, they are meant to filter the sounds of the surroundings, and quiet and harmonize the soundscape for some human purpose. People use them for example in noisy places like airplanes and city streets.

Hagood (2011) discusses how listeners create a soundscape for suppressing the presences of others with the help of noise cancelling headphones. Other researchers too have documented the individualistic character of headphone listening. Simun (2009, p. 922) shows how people in London use the MP3-player to shape their environment and build protective shields around them. Burns and Sawyer (2010, p. 97) show that people use the portable music player as a defense mechanism against encounters with other people. Bull (2007, p. 5) reminds us that the experience of the solitary individual can be desirable and considered pleasurable or positive. “The desire for solitude in the automobile is mirrored in the desire for solitude in the street and the home as many retreat into the most private spaces of their already privatized home.” Groening (2010, p. 1331) argues that “the contrary impulses of moving through the world while retreating from it” are produced by the economic and social structures of corporate media. It seems reasonable to presume that noise-cancelling headphones strengthen and deepen the development of “protective” listening practices, and the way this is accomplished should be described with greater sensitivity.

In theoretical terms, listening (with headphones on mobile phones) is a cultural competence as much as a perceptual ability. Like the act of reading a book (Iser, 1991), the act of listening to media sounds is largely an interpretational act with contextual information being as important as the sounds themselves. It involves the understanding of codes and indexical sounds, music, speech, and environmental sounds. Listening and production are to a large extent a matter of adequate responses to genres of sound content, but with the added complexity of navigation and outdoors interaction coordination. Stockfelt (2004, p. 89) presents an analysis of “adequate listening” and argues that daily listening is more conditioned by the listening situation itself than with the music (or other sound content). “The symphony that in the concert hall or on earphones can give an autonomous intra-musical experience, tuning one’s mood to the highest tension and shutting out the rest of the world, may in the cafe give the same listeners a mildly pleasant, relaxed separation from the noise of the street.” It follows from this insight that producers of location-based stories must not only have knowledge of conventions and genre characteristics, but also be sensitive to the geographical, place-based, situational characteristics of the individual listener’s location.

Temporal versus Locative Plotlines

In this article we presume a situation where culturally adept listeners carry a smartphone and listen to a sound narrative in noise-cancelling headphones, and we ask: What happens when the plotline is structured in space, not time? How does it feel to listen to content that is partly created by the listener’s own movement in the specified area? In radio and podcasting, most of the content is structured in time, but it can also be structured in space on a suitably human scale.

It is important to specify what is meant by location-based content. Figure 1 shows how a locative plotline differs from a temporal one. The squares represent what Chatman (1978) calls “kernels” and the circles represent “satellites.” These two elements have different statuses for the progress of the story. Kernels are events that have to be present in order to make the story consistent and move it forward. They cannot be omitted without the story losing momentum. Satellites are events that can be very important for context, color, and realism in a story, but that can nevertheless be left out without the reader losing track of the plot.

As the top illustration shows, in a temporal plot the sequence of events will always be in linear time. This gives the storyteller full control over the development of the plot, and the elements can be lined up, for example, in perfect order. But the bottom illustration shows that a location-based plot will be different. The sequence of events is determined by the movement of the listener in space, and while the kernels can be situated close to each other, there is a lack of clarity about what happens next. The story is presented as a pattern of kernels spread out in space, much like in a hypertext. But where a hypertext becomes plotted by the users’ activation of links, the user of a locative narrative makes a plotline as he is moving around in the physical environment – in our case, with the augmented soundtrack as the main narrative device. The layers of the story can, but do not need to, interfere with each other. At the micro level, the Auditor software controls the sequencing of the playback: the user triggers sound by movement, but the lines of six different dialogues are audited by the system to ensure that no dialogue is played twice before every dialogue line of the narrative is completed. In this way there is a plotline that goes toward closure, although there is unlikely to ever be full closure.
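The sequencing rule can be illustrated with a minimal sketch. It is not the Auditor implementation, which is an iPhone app whose code is not reproduced here; the class name, the dialogue identifiers, and the reading of the rule (no line is repeated until every line of every dialogue has been heard once) are assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DialogueSequencer:
    """Hypothetical bookkeeping for which dialogue lines have been played."""

    # dialogue id -> ordered list of line ids (all names are illustrative)
    lines: dict[str, list[str]]
    played: set[tuple[str, str]] = field(default_factory=set)

    def _total_lines(self) -> int:
        return sum(len(v) for v in self.lines.values())

    def trigger(self, dialogue: str) -> str | None:
        """Return the next unplayed line of `dialogue`.

        Returns None when this dialogue is exhausted while lines of other
        dialogues are still unheard (assumed reading of the rule).
        """
        if len(self.played) == self._total_lines():
            # Every line has been heard once: start a new round.
            self.played.clear()
        for line in self.lines[dialogue]:
            if (dialogue, line) not in self.played:
                self.played.add((dialogue, line))
                return line
        return None
```

Under this reading a listener who stays at one point of dialogue eventually hears silence from that speaker and must move on to another point, which is one way a spatial plot could be nudged toward completion without a fixed order.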

The Auditor Medium and its Content

The project group built an original information system named “Auditor” to explore the narrative potential of sound guided by micro-positioning technology. Auditor is an app for iPhones that presents sounds based on information from a grid system where Bluetooth beacons are laid out and can be used to trigger a variety of sound events. The sound events are triggered according to the listeners’ position and movement through the enclosed space (this could be a courtyard, a public square, a shop, a stadium, etc.). The current version of Auditor uses Bluetooth transmitters, which transmit a low energy radio signal every second, which can be used to position mobile devices that support iBeacons – Apple’s implementation of the Bluetooth LE technology.
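As a rough illustration of how beacon signals could be turned into a position estimate, the sketch below picks the beacon with the strongest recently averaged signal. It is written in Python rather than for the iPhone, and the class name, the beacon identifiers, and the three-reading smoothing window are assumptions, not details taken from the Auditor software.

```python
from __future__ import annotations

from collections import defaultdict, deque
from statistics import mean


class BeaconTracker:
    """Hypothetical nearest-beacon estimate from signal strength (RSSI) readings."""

    def __init__(self, window: int = 3) -> None:
        # Keep only the last `window` readings per beacon to damp jitter.
        self._history: dict[str, deque[float]] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def update(self, rssi_dbm: dict[str, float]) -> str | None:
        """Add one round of readings (beacon id -> RSSI in dBm).

        Returns the beacon with the highest mean RSSI, taken here as the
        nearest one, or None if no beacon has been seen yet. Beacons that
        are missing from a round keep their older readings in this sketch.
        """
        for beacon_id, rssi in rssi_dbm.items():
            self._history[beacon_id].append(rssi)
        if not self._history:
            return None
        return max(self._history, key=lambda b: mean(self._history[b]))
```

With one signal per second from each transmitter, a window of three readings would correspond to roughly three seconds of smoothing, a trade-off between stability and responsiveness that a real deployment would need to tune.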

The content in the Auditor app is based on sound alone. There is no visible interface other than the physical environment where the users orient themselves “in” the content. The listening experience is interactive in the sense that the listeners’ walking movements influence the development of the soundscape, and the way the different sounds are presented. There is a high level of synchronicity between walking movements and sound playback, and very little disturbance by external sounds if listeners use the recommended noise-cancelling headphones.

The app can be defined as a form of Sonic Augmented Reality. It is intended to give the listener a sense of being somewhere else, while at the same time having this new space as a layer augmented into their physical surroundings such as sounds from a railroad station heard in a courtyard where disused railroad tracks remain in the pavement. To some extent there is a link between the sonic space and the reality of the listener.

The Auditor app was constructed in order to conduct field tests of how locative communication in sound works at the micro level. We wanted it to be able to “transport” the listener to another place by acoustic means, and with a high level of immersion. During the test, listeners used a smartphone and wore noise-cancelling headphones.

Distinct from the information system just described, the project also created a locative sound installation called “Railroad Dialogues.” The content is constructed so that listeners can move around within it, and must therefore be flexible enough to allow a range of movement types (crisscrossing, meandering, standing still, covering ground systematically). The listeners “walk a narrative” with an episodic structure. There is no closure, and there is no temporal movement from start to end as Aristotle postulated.

Now that we have both elements, the information system and the narrative channeled through it, we can explain how the distribution of sound elements takes place. The story space is based on three types of sound events that we call “environment” (at the macro level), “zone” (at the meso level), and “points of dialogue” (at the micro level). van Leeuwen (1999, p. 15) reminds us that sound-dubbing technicians in radio and film have for decades divided the soundtrack into three zones – close, middle and far distance. With stereo, augmented reality, and noise-cancelling, it is possible to explore this three-layer tradition in an ever more nuanced way. van Leeuwen presents his own theory of sound layering, resting on the distinction between figure, ground, and field. “Three distinct groups of sound can be heard. The Field is the sound of cicadas, hence a continuous ‘broadband’ sound, a kind of drone. The Ground is formed by a variety of birdcalls, hence by more discrete, individual sounds which nevertheless continue without noticeable gaps throughout the track. The Figure is the cry of a single howler monkey, more intermittent, and only entering after a while” (van Leeuwen, 1999, p. 18). We adapt his three layers for the “Railroad Dialogues” in the following way.

The first type of sounds can be called “environment” (field/at the macro level). Everybody who listens hears these sounds. They are background sounds that play all the time, with no variation. In the “Railroad Dialogues” there is a background sound from a railway station. This is a professionally produced stereo track, which is looped, and continues to play throughout the experience. Figure 2 shows a rough outline of the scene with six iBeacon transmitters around a square. These transmitters are used to position mobile devices and control sounds at three levels.

The second type of sounds can be called “zones” (ground/at the meso level). There are birds chirping, a musician at one end of the station, a carpenter working with his tools, and noise from a crowd. These are soundscapes that have a place to be; they belong to a certain area of a railroad station. They give character to specific zones in the overall soundscape. These sounds change as the user moves around. They are panned when the user turns, corresponding to the mobile device’s accelerometer. These sounds are crossfaded slowly when a user moves between zones. As Figure 3 shows, they delineate areas where individuals or groups of listeners can hear them, based on being in the area.

The third type of sound is what we call “points of dialogue” (figure/at the micro level). The six dialogues deal with topics that one would expect to hear at a railroad station, e.g., people talking about a train trip to Paris, a bus ride to Lillehammer, just arriving in Bergen, driving a mobile home, a strange story about flying with balloons, and having had a bad train journey. The voices all speak on the phone, so the listener only hears one half of the conversation in a limited area, as shown in Figure 4. When the listener moves around the different parts are played in accordance with changes of position. The result is a mix of different calls together with changes in the background sounds. Several different voices might be played at the same time, and the sounds of the people speaking are also panned when the user turns. Every listener hears their unique combination of speech acts based on the way they move through the space, and this distinguishes the points of dialogue from the environment sounds heard in the same way by everybody, and the zone sounds to which you may return and hear in the same way as last time around.
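The slow crossfading between zones and the panning of sounds when the user turns can be sketched with two small gain functions. This is a generic illustration using equal-power curves, a common audio mixing convention; the function names, the use of a compass-style heading in degrees, and the sine mapping from angle to pan position are assumptions and are not taken from the Auditor software.

```python
import math


def crossfade_gains(progress: float) -> tuple[float, float]:
    """Equal-power crossfade between two zone sounds.

    `progress` runs from 0.0 (fully in the old zone) to 1.0 (fully in the
    new zone). Returns (gain_old_zone, gain_new_zone).
    """
    p = min(max(progress, 0.0), 1.0)
    return math.cos(p * math.pi / 2), math.sin(p * math.pi / 2)


def pan_gains(heading_deg: float, source_bearing_deg: float) -> tuple[float, float]:
    """Constant-power stereo pan from listener heading and source bearing.

    Both angles are in degrees on the same (assumed) compass scale.
    Returns (left_gain, right_gain).
    """
    # Relative angle in [-180, 180); positive means the source is to the right.
    rel = (source_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
    # Sine maps straight ahead or behind to the centre (0), 90 degrees to
    # the right to +1, and 90 degrees to the left to -1.
    pan = math.sin(math.radians(rel))
    theta = (pan + 1.0) * math.pi / 4  # 0 .. pi/2
    return math.cos(theta), math.sin(theta)
```

A sound directly ahead is given equal gain in both ears, and as the listener turns left the same sound shifts toward the right ear, which is the kind of feedback on orientation that the stereo version is meant to provide.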

Figure 2. The Environment Sound From a Station is Played at the Macro Level, Without Dependence on the Grid Laid Out by the Bluetooth Transmitters.

Figure 3. The Four Zones as Represented in the Grid, Along With the Environment Sounds.

Figure 4. The Six Points as Represented in the Grid, Along with the Environment and the Four Zones.

The Field Trial Method

The Auditor and its content were tested by young adult Norwegians in 2015. We selected this demographic because they have advanced media skills and interests, and they are early adopters of a range of new apps and interfaces for the smartphone. They listen to podcasts, watch YouTube videos, enjoy 3D visuals, and have tried Virtual Reality games and Augmented Reality services. They are comfortable using headphones in most situations through the day. In our sample equal gender quotas apply, with a representative selection of students and young working adults.

Forty-two informants were given a “treatment” of the “Railroad Dialogues” and split into two equal groups that tested two different versions: simple versus complex. The simple version A is in mono and is limited to playback triggered by the users’ position, with no panning of the sounds and only one voice at a time. The complex version B is in stereo, effect sounds and voices are panned according to the user’s change in orientation, and the version plays two or more sounds simultaneously. The quantitative analysis of the survey in particular deals with findings about the difference between version A and B on the sense of immersion.

The trial was set up as follows: the informants knew next to nothing about the technology or the “Railroad Dialogues” before they began, nor did they know that there were two versions of it. They were all asked to “rationalize” and make sense of the experience, and they knew they were going to be asked about it afterwards. Some immediately understood the technology, while others had only a vague impression of how it worked.

We evaluated the listening experience, and registered three main phenomena: the 42 informants’ immediate response after testing the prototype as given in a survey questionnaire, their reflexive description during conversational interviews with 10 interviewees, and finally, logs of their movements collected by the mobile software.

The informants were young adults between 20 and 30 years of age. Many were students, but some were working people. Fifty-two percent were women, 67% had a bachelor’s degree, and 21% had a master’s. The users were all quite accustomed to headphones and listening with smartphones: 98% used a smartphone for listening to music and sound, 94% used a PC or Mac, and 29% listened using headphones for up to 2 hours each day, 24% for up to 3 hours, and 31% for more than 3 hours.

Figure 5. Respondents who use a Particular Sound Media Weekly or More Often. The Y Axis Shows the Actual Number of Respondents out of 42 who Reported a Particular Sound Medium.

Figure 5 shows the number of informants who claimed to use a particular sound medium weekly or more often. Almost all used Spotify or similar services quite a lot, but none listened to sound art weekly or more often.

Experience of the Sound Content

This section presents the experience of the sound content, that is, the episodic soundscape the “Railroad Dialogues,” as reported by the respondents. The analysis has three parts: 1) responses about immersion and movement, 2) responses about the content, 3) responses about the speakers in the story.

Walking the Soundscape: Responses About Immersion and Movement

“You walk like in a computer game, you are not really into what is happening, but you observe it. You are a visitor, you eavesdrop,” says interviewee 9 (p. 3). Another of the testers says: “I felt that when I walked and the sounds and voices started, that I was no longer in a cold backyard in bad Bergen weather. I felt very much that I was in a train station, and it was just as if I were walking around at a train station in reality. I could control what I heard by walking to different places” (interviewee 3). Yet another interviewee characterizes the experience like this: “Really weird. It was great fun to walk in a place and feel that ‘you are not where you are.’ But I had difficulty immersing myself in this other place. I would have had to keep my eyes closed” (interviewee 5).

Figure 6 shows to what extent the respondents recognized the intended location, which in the “Railroad Dialogues” is a railroad station. They were allowed to suggest several alternatives, but most seemed to understand that it was a station, although almost half believed it also could be out in the streets.

Figure 6. The Degree of Recognition of the Location. “Station” is the Correct Answer. The Y Axis Shows the Number of Respondents out of 42 Who Reported a Particular Location.

We asked them a series of retention questions, relating to what specifics they could recall from the content when filling in the form approximately 5 minutes after their experience. The respondents were asked about the number of distinct voices they heard, the number of distinct environmental sounds, and the number of distinct places the sounds came from. The median in the data was five voices, four environmental sounds, and four distinct sources. There was some deviation in these numbers, most in the number of sound sources. The correct answers were six voices, four environmental sounds, and six sources, and many respondents were not able to identify or recall the number of sounds heard. It is also interesting to notice that as many as 27 out of 42 were not sure or did not believe that the sounds they heard were dependent on location, although most respondents became aware of their influence on the change in sounds as they moved around.

No listening event has the same chronology of movements in the grid, and this means that each person hears the content in a unique way. Some respondents wanted to stop walking when listening attentively, and understood that in most places they would hear one end of a phone conversation, and listen to the voice and story of one of the characters. Inversely, moving around means that you pay less attention to what we call the “point of dialogue” elements, and become more engaged in zones and the station environment.

Interviewee 3 did not feel “centered” in the narrative space, and this was a problem for her. “I did feel like I was there, and it was almost like I wanted to turn when someone talked to one of the sides. At the same time it was a little difficult to feel where one was located in the space. Maybe I would like a more distinct feeling of moving towards something, or that I was more clearly positioned in the space.” Like this user, other interviewees also wanted to be more “centered” in the storyworld. They wanted to have a better sense of their own position in the virtual sound space, and to have better feedback on their movements within it. This would be possible with a more complex grid, and even more careful audio mixing.

Several interviewees noticed the “zone” as a produced phenomenon in the experience. “It was not as if I listened to a lecture of 10 minutes while walking senselessly around in a circle, rather it was pre-determined in that specific areas express specific sounds” (interviewee 2). Interviewees also said that the zones should have been easier to identify, with more characteristic sounds, and thereby also better demarcations between them. Zones and points of dialogue could have been mixed more seamlessly – so that dialogues receded and increased more realistically in relation to the zone sounds.

We asked a number of questions about the feeling of being immersed. Our analysis finds that none of the answers correlates to the variables age, gender, education, and sound media experience, and also not to each other. But an index variable (IMMERSION) summarizing all these 7 questions is in fact correlated weakly to the version used (p = 0.029). The number of distinct movements (counting how often there were changes in nearest beacon = MOVES) correlates strongly to the version used (p = 0.001), and the users of the advanced version B moved around more than the users of the simple version A. There is a very strong correlation between time spent by the user and the number of moves. But there is no correlation between time and version. This implies that those who used version B in fact moved around more in the same amount of time, and by implication they moved faster around in the enclosed space.
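The MOVES variable, defined above as the number of changes in nearest beacon, can be computed from a movement log along the following lines. The sketch assumes the log is a chronological list of nearest-beacon identifiers; the actual log format of the mobile software is not described in this article, and the function names are illustrative.

```python
def count_moves(nearest_beacon_log: list[str]) -> int:
    """Count how often the nearest beacon changes between consecutive samples."""
    return sum(
        1
        for previous, current in zip(nearest_beacon_log, nearest_beacon_log[1:])
        if previous != current
    )


def moves_per_minute(nearest_beacon_log: list[str], duration_seconds: float) -> float:
    """Movement rate, useful when comparing sessions of different length."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return count_moves(nearest_beacon_log) * 60.0 / duration_seconds
```

A rate of this kind makes the comparison in the text explicit: if two listeners spend the same time in the space, the one with the higher MOVES count has the higher rate of movement between beacons.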

There is no observable connection between the respondents’ physical behavior and how they reportedly felt about the experience. The MOVES count could potentially be a proxy for immersion and if we accept that, we may conclude that the advanced stereo version B creates a more immersive experience. However, it is also interesting that IMMERSION and MOVES do not correlate, which again may mean that these variables measure different properties of the situation. IMMERSION is a numerical interpretation of a survey-based experience report, whereas MOVES is an objective measure of movement in the space.

Interviewee 2 wants to have a pause button, so he can answer the phone during the listening situation. But he immediately senses a dilemma. “What if I meet an old friend? I stop playing and keep walking alongside him, but I can’t just continue to listen 400 meters down in the street. You have to be aware that the content is bound to places. If you want to have a pause, you also have to stop walking.” Interviewee 5 says almost the same thing: “Let’s say you are out listening to content on Auditor. You are going to and from school, or something, and this means you would not get the whole story.” Our interviewees point out several such puzzling problems in “The Railroad Dialogues.”

It’s Not a Story: Responses About the Content

Interviewees were overall quite confused by the story, and had difficulties staying engaged in it. Interviewee 5 says: “At first I thought the different dialogues were related, but when you get little pieces of different voices saying things that don’t connect, it is also more difficult to remember anything. There are no hooks to hang it on. You can’t build on something that was said before.” Another interviewee suggested that it would be a good idea to have a suspenseful crime story emerge from the dialogues.

We asked what the informants thought about the possibility of influencing the playback of sounds by walking around in the enclosed space, and we gave them six concepts with which to describe it. The respondents found the option confusing but not frustrating, and they had a sense of doing something exciting and fun (see Figure 7).

Interviewees did not want to call this a story. When prompted, they called it “an ambient environment,” “sound installation,” or “art.” Interviewee 9 interestingly said “It is less of a story than a situation. I had to try to take in the whole situation as I listened.” This fits well with the three layers of the soundtrack, which cultivate environmental presence more than a crime plot or anything like that. Interviewee 9 goes on to say that “It would be very generous of me to call it a story because I didn’t know what order to walk in in order to get an impression of it as a story. There was no character introductions, no development of a story, no connection between where I walked and what happened.” Interviewee 2 says “It didn’t strike me that this was a story until I had walked around the area a few times and noticed that the key topic that turns up again was travelling, and that the sounds were from a singular place that didn’t change. It’s not like we are taken from war torn Kosovo in one zone to a quiet weekend in the mountains in the other. The sound theme was consistently from a traveler’s place in an urban environment with lots of people talking on the phone.”

Figure 7. The Different Feelings Reported From Being Able to Influence the Story. The Y Axis Shows the Number of Respondents out of 42 Who Reported the Different Feelings.

Clearly, the experimental content is experienced as too spatially fragmented to feel like a traditional story, with too few links between the points of dialogue.

I’m Eavesdropping: Responses to the Speakers

The degree of immersion in the content also depends on whether listeners feel a connection to, or identify with, the characters and their personalities. It is well known that the timbre of people’s voices, their tone of voice, pauses, and other personal characteristics influence the mood and engagement of listeners. It is interesting to note that “The Railroad Dialogues” contains no direct address to the listener, since the speakers are all addressing interlocutors on the phone. The implied listener “overhears” half-conversations among arbitrary people in the railroad station, and in this sense is not positioned in a warm, engaging social space such as that of a morning show on the radio.

“The Railroad Dialogues” consists of six dialogues spread out across the grid. Interviewee 7 summarizes the cast of characters quite well: “The continuity was gossip. I overheard people when walking past. It was messages from person to person, whether they were informing each other or just talking. There were so many different voices and dialects, and I hardly noticed that the topics changed.”

Interviewee 3 told us that the characters have a tendency to disappear in the rich ambience. “I was more concerned with the experience than what they were saying. I didn’t feel like I was supposed to make a story out of it.” One interviewee suggested that the characters on the phone should have much stronger personal characteristics so that they stick, for example being very angry or being a little child of 5 years. Interviewee 3 nevertheless felt a certain level of identification with the speakers. “They talked with strong conviction, some quietly and others loudly and with enthusiasm, so in a way I felt I learnt to know them a little bit.” Even so, Interviewee 3 did not pay close attention to the content of the dialogue. “I was more concerned with the experience than what they actually said.”

Interviewee 4 finds it difficult to engage with the speakers’ stories. “It wasn’t so easy to notice personalities. I heard a girl of about 20 talking about a drunk night out that didn’t end well, and I thought ‘I don’t bother to listen more to this,’ because such stories don’t interest me. One of the characters would then be ‘annoying young girl talking about annoying things.’ There was another character too; a man who talked about a road trip. I imagine him leaning back in an armchair, very relaxed and jovial, and talking loudly in the phone about the trip.”

The interviewees seem to say that it would be good to be invited into a more emotional and social sphere. Interviewee 8 suggests that one way of increasing the sense of personality would be to create more “thematic dialogues,” so that the listener can walk from a zone with work topics into another where school, the bus, a party, or another social situation is thematized.

Experience of the Prototype

In this section we analyze the respondents’ reactions to Auditor’s information system. We address three aspects of the technology: 1) the quality of the media interface; 2) ideas about genres and topics that would suit Auditor; and 3) locations where Auditor could be installed.

Media Interface

We observed a positive attitude towards the non-visual interface and the use of noise-cancelling headphones. Interviewee 8 says “I am a big fan of noise-cancelling headphones. You get really into it instead of being aware of all the other things going on around you. My concentration is automatically directed to what I listen to.” We asked if it would be better to have a graphical display with more control buttons, but overall the interviewees did not favor this.

From the quantitative analysis we found that a positive answer to the question of whether the user would like this service in their everyday surroundings correlates with both MOVES (p = 0.008) and the IMMERSION index variable (p = 0.016). This means that those who have a richer experience using this technology also see greater potential for it.

Genres and Topics

The interviewees were asked to imagine useful purposes for a locative medium like Auditor. They mentioned genres that already enjoy large markets for location-based services, such as information apps for tourists and museum visitors. Tourists could learn about the history of Bergen in important public places, or museums could use it to present their artifacts, buildings, etc. One interviewee suggested that the university could use it as a “cool” way to help students memorize course literature (mnemotechnics).

In the questionnaire, we asked the testers to select suitable content genres for the Auditor medium from a list, and found the distribution shown in Figure 8.

Respondents associate this technology mostly with facts, probably because of the tourism and museum apps that are commonly seen. Traditional radio genres, like radio plays, documentaries, and music, were prioritized higher than news, while the most talkative genres were considered least suitable. We are interested in the potential for news, and Interviewee 8 was able to imagine a radio station that used locative principles to present facts and news. “In a special area you get special facts, like news updates. And there could be live streams from events like an opera or a football match.”

Figure 8. The Suitability of Different Genres According to Respondents. The Y Axis Shows the Number of Respondents out of 42 who Considered a Particular Genre to be Suitable.

There were several ideas for what can only be described as “new genres,” especially in the field of Augmented Reality. Here the young people’s high media literacy comes into play, with descriptions borrowing from computer games, Virtual Reality, and 3D references. Interviewee 9 imagined a genre where the real sounds in your environment are augmented by completely different soundscapes. “The Bergen light rail could be full of people when it is almost empty, and vice versa. On Friday night there is lots of stress, with people shouting, crowds mingling, and then you want it to be quiet in an interesting way. Or the carriage is almost empty, and the Auditor presents virtual chit-chat or jokes from the people sitting around you.” Interviewee 9 is very creative, and also suggested a crime story for sonic AR taking place on the night train from Bergen to Oslo.

Suitable Locations

Once respondents had learned that the Auditor consists of a physical grid of Bluetooth beacons distributed in a limited area, we wanted to know where they envisioned such a system being installed. Figure 9 shows the distribution across a list of six choices.

Figure 9. Where Would be Suitable Places for This Kind of Technology. The Y Axis Shows the Number of Respondents out of 42 who Found the Place Suitable.

Public areas like squares and public transportation systems scored highest. This seems reasonable in light of ordinary expectations of what a medium is. Interviewee 9 says: “The Bergen light rail would be an interesting place. You could make different situations to eavesdrop on, for example having it completely crowded with football supporters, or somebody telling a joke behind you.”

Conclusion: A Way Forward for Radio

In this project we have designed and tested the Auditor, a prototype of locative radio with original sound content. Summing up the evaluation, it seems that our respondents are comfortable using the headphones and the app interface, and that the medium dimension is therefore rather successful. Our informants enjoy the noise-cancelling interface; they feel immersed in the sounds and can imagine a variety of topics and locations for the Auditor to be used. When it comes to the three-layered locative content, however, the informants have a more mixed reaction. It felt “really weird” to have to walk around in order to trigger and maintain the story development, and they felt disoriented and emotionally isolated from the speakers in the soundscape. Some of these issues are directly related to the specific content of “The Railroad Dialogues,” and can easily be avoided if new content production ensues. Despite the mixed response, this experiment has disclosed a range of emerging principles for locative communication in sound. The project confirms that there is narrative and immersive potential for a locative medium that could supplement radio in the future, especially in urban areas.

We believe that media design projects of this type can have a societal impact. Given the current business environment in the Nordic countries, it seems that only public service radio could take responsibility for a comprehensive exploration of these possibilities. Public service radio can establish locative sound media in suitable locations and experiment with how to address large numbers of people, smaller groups, or individuals. Locative sound media can in particular address young listeners in their favorite places through their preferred interfaces. Young people listen while they walk and relax, and they typically listen on smartphones with headphones. Traditional radio does not reach this group very well at present, and services such as Spotify, iTunes, and other online services dominate in this listener segment.

We envision that in the future radio listeners will use a locative app built on the three-level geographical grid for sound communication presented in this article. At the macro level, listeners can hear the latest national or local news, and the same news stories are transmitted to everyone in the same way. Furthermore, there are music streams that can be adapted to the users’ preferences. The music can be selected at the micro level using personalized playlists or at the meso level playing music curated by others. Finally, at both the micro and meso level, there can be content elements that are initiated by the locations the users move past, as with “The Railroad Dialogues.” We believe that such a three-level grid for radio is also conducive to innovation in advertising, as business models can be explored for geographically diversified advertisements. In sum, a complex sound media mix can be achieved, driven partly by one-way broadcasting, partly by the users’ personal preferences, and partly by responding to the places where the listeners are located or moving through.
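The logic of such a mix can be sketched in a few lines of code. The following is a minimal illustration under stated assumptions and does not describe an existing Auditor feature: the names `Listener`, `build_mix`, `macro_news`, `meso_curated`, and `zone_content` are hypothetical, and the rule that a personal playlist takes priority over curated music is one possible design choice among several.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Listener:
    # Micro level: the listener's own playlist (may be empty).
    playlist: List[str] = field(default_factory=list)
    # The location zone the listener is currently in, if any.
    zone: Optional[str] = None


def build_mix(
    listener: Listener,
    macro_news: List[str],
    meso_curated: List[str],
    zone_content: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Return (source, item) pairs for one listener's sound mix."""
    # Macro level: the same news items are sent to every listener.
    mix = [("macro", item) for item in macro_news]

    # Music: use the personal playlist if there is one (micro level),
    # otherwise fall back to music curated by others (meso level).
    if listener.playlist:
        mix += [("micro", track) for track in listener.playlist]
    else:
        mix += [("meso", track) for track in meso_curated]

    # Location-initiated content: added only when the listener is inside
    # a zone that has content attached to it.
    if listener.zone in zone_content:
        mix.append(("location", zone_content[listener.zone]))
    return mix


# Illustrative values only.
walker = Listener(playlist=[], zone="platform")
print(build_mix(walker, ["national news"], ["curated jazz"],
                {"platform": "railroad dialogue"}))
```

The sketch keeps the three sources separate until the final list is assembled, so that each of them (broadcast, preference, and place) could be weighted or scheduled differently in a fuller design.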

However, it is likely that broadcasters will consider these possibilities too ambitious or too radical to be allocated large resources. Brian Winston (1998) refers to the “law of suppression of radical potential,” which implies that the main resources are likely to continue to be given to whatever is already perceived as the core activity.

References

Aam, P. (2013). Fjernsynsforskaren – fra kritikar til innovatør. Som døme: mediedesign av interaktiv journalistikk med levande bilde. Trondheim, Norway: NTNU doctoral thesis publication.

Bull, M. (2007). Sound moves: iPod culture and urban experience. London, UK: Routledge.

Burns, J., & Sawyer, P. (2010). The portable music player as a defense mechanism. Journal of Radio & Audio Media, 17, 98–108.

Chatman, S. (1978). Story and discourse: Narrative structure in fiction and film. Ithaca, NY: Cornell University Press.

Fagerjord, A. (2015). Humanist evaluation methods in locative media design. Journal of Media Innovations, 2, 107–122.

Gripsrud, J. (2010). Television in the digital public sphere. In J. Gripsrud (Ed.), Relocating television: Television in the digital context (pp. 3–26). Abingdon, UK: Routledge.

Groning, S. (2010). From “a box in the theater of the world” to “the world as your living room”: Cellular phones, television and mobile privatization. New Media & Society, 12, 1331–1347.

Hagood, M. (2011). Quiet comfort: Noise, otherness, and the mobile production of personal space. American Quarterly, 63, 573–589.

Hevner, A., & Chatterjee, S. (2010). Design research in information systems: Theory and practice. New York, NY: Springer.

Hoem, J. (2009). Personal publishing environments. Trondheim, Norway: NTNU doctoral thesis publication.

Iser, W. (1991). The act of reading. Baltimore, MD: Johns Hopkins University Press.

Jauert, P., & Lowe, G. F. (2005). Public service broadcasting for social and cultural citizenship: Renewing the enlightenment mission. In G. F. Lowe & P. Jauert (Eds.), Cultural dilemmas in public service broadcasting (pp. 15–32). Gothenburg, Sweden: Nordicom.

Liestøl, G. (2009). Situated simulations: A prototyped augmented reality genre for learning on the iPhone. International Journal of Interactive Mobile Technologies, 24–28.

Løvlie, A. S. (2011). Locative literature: Experiences with the textopia system. International Journal of Arts and Technology, 4, 234–248.

Nyre, L. (2014). Media design method: Combining media studies with design science to make new media. Journal of Media Innovations, 1.

Nyre, L. (2015a). Designing the Amplifon: A locative sound medium to supplement DAB radio. Journal of Media Innovations, 2, 58–73.

Nyre, L. (2015b). Urban headphone listening and the situational fit of music, radio and podcasting. Journal of Radio & Audio Media, 22, 279–298.

Nyre, L., Ribeiro, J., & Tessem, B. (2017). Academic prototypes can lead to innovation in journalism. Journal of Media Innovations, 4.

Øie, K. V. (2013). Location sensitivity in locative journalism: An empirical study of experiences while producing locative journalism. Continuum, 27, 558–571.

Oppegaard, B. (2015). Mobility matters: Classifying locative mobile apps through an affordances approach. In M. Aguado, C. Feijóo, & I. Martínez (Eds.), Emerging perspectives on the mobile content evolution (pp. 203–222). Hershey, PA: IGI-Global Press.

Panetta, F. (2012). King’s Cross, London – Streetstories app for iPhone and Android. https://www.theguardian.com/help/insideguardian/2012/mar/21/kings-cross-london-streetstories-app

Simun, M. (2009). My music, my world: Using the MP3 player to shape experience in London. New Media & Society, 11, 921–941.

Stockfelt, O. (2004). Adequate modes of listening. In C. Cox & D. Warner (Eds.), Audio culture: Readings in modern music. New York, NY: Continuum.

Tessem, B., Bjørnestad, S., Chen, W., & Nyre, L. (2015). Word cloud visualization of locative information. Journal of Location Based Services, 9, 254–272.

Toffler, A. (1980). The third wave. New York, NY: Bantam.

van Leeuwen, T. (1999). Speech, music, sound. London, UK: Palgrave.

Winston, B. (1998). Media technology and society: A history from the telegraph to the internet. London, UK: Routledge.