
Sound Media: Part I – The present time

Theoretical introduction to sound media

Part I – The present time

2. The acoustic computer: Nervous experiments with sound media
3. Synthetic music: Digital recording in great detail
4. The mobile public: Journalism for urban navigators
5. Phone radio: Personality journalism in voice alone
6. Loudspeaker living: Pop music is everywhere

Part II – Backwards history


2. The acoustic computer – Nervous experiments with sound media

The computer has rich opportunities for experiments in journalism and music, and my history must begin here. It is truly interesting that much of this experimentation takes place among ordinary people, at their home computer. Innovation takes place not just in the celebrated media companies such as the BBC, and not just in the laboratories of global corporations such as Microsoft, IBM or Xerox.

I have selected some of the innovations that have been made in sound media due to the computer, and will analyse them in detail. All the examples were found on the internet, through a process of browsing and searching. There is a vocal outburst from a Tasmanian headphone user on YouTube in 2007, there is a little song from a group called God vs. the Internet, which was issued on the music-publishing site Acidplanet in 2005, there is a professional radio reportage from BBC Radio 4 in 2004, and finally there is a podcast from the US media site This Week in Tech in 2006. But first I will lay out the technological background in some detail.

The multimedia landscape
Already in the 1970s the computer was so central to Western societies that the notion of the information society had taken hold in public (Briggs and Burke 2002: 260). Alongside the computer the internet emerged as a military communication structure built to withstand a nuclear attack from the Soviet Union, but it was not taken up by the general public in the same way as the PC. The internet was really introduced to the public only in 1993, with the emergence of the world wide web (Gauntlett 2000; Miller and Slater 2000; Herman and Swiss 2000). The combination of the computer and the internet introduced an entirely new communication infrastructure into people’s lives.

The internet is a multimedium. It is a collection of cultural achievements that constantly mix with new interfaces. Among the old media that the internet emulates are letters in the post (email), the typewriter (word processing), newspapers (online newspapers), radio (web radio), television (webTV) and file cabinets, not to speak of all the commercial industries that established a strategic presence on the internet during the 1990s, and thereby created the online bookshop, the online library and so on. This diagnosis is well known, and the process has been described as ‘remediation’ (Bolter and Grusin 1999), ‘parasitic media’ (Williams [1975] 1990) and ‘rear-view mirrorism’ (McLuhan [1964] 1994). Even now, thirty years after the personal computer was introduced and fifteen years after the world wide web became commonplace, the parasitic development of new media continues.

As I stated above, my analysis relates quite particularly to internet experiments among ordinary people. An example can be the uploading of private photographs to Flickr, where people have found the strangest new ways of organizing and presenting photographs to each other. Indeed, a fair number among the population in Western countries are regularly trying out ways of using computer interfaces and software, at night, after work and during the weekends (Rheingold 2002; Manovich 2001; Turkle 1997). And it is not even necessary to know machine language to do so, because software nowadays always has a user-friendly interface where all functions are explained and can accommodate your preferred combinations. There is a solid dose of entrepreneurship lurking in the domestic sphere, and there is a genuine desire to contribute to the better functionality and more meaningful content on the internet (Delys and Foley 2006; Nyre 2007a). If you are clever you can even invent a new killer application, as American teenagers did with Google and Napster.

Domestic life also has much else to offer in the way of mass media. For relaxation, the average Western household has flat-screen TV, perhaps also a surround-sound system, not to forget the good old stereo set. These media are all immersive; the users lie back in their easy chair and surround themselves with the sounds and images. People also have portable equipment for media consumption, such as the car stereo and radio, and wearable equipment such as the iPod or Walkman. The strange acoustic space of the mobile phone is also relatively new. All these media create what Todd Gitlin calls a ‘torrent of images and sounds that overwhelms our lives’ (Gitlin 2002).

Figure 2.1: Timeline of computer sound.

Figure 2.1 shows the audio platforms that I will discuss in this chapter (see also Nyre and Ala-Fossi 2008). Notice that the newest technologies are at the top. Below the line are the two important communication technologies on which the five sound platforms are built – namely the personal computer and the world wide web. It is easy to see that all the developments in question are quite recent, since none of the interfaces for sound on the internet were developed until the early 1990s.

The late arrival of sound on the computer implies that the sound interfaces are anchored in the graphical user interface on the screen and the hand-finger interfaces on keyboards and mice. It is impossible to locate a sound file and play it on the computer without using these basic interfaces. In the future it seems that sound will invariably be embedded in a textual-graphical-visual mix.

User-publishing sites, the newest platform on the list, are a splendid example of the textual-graphical-visual mix. The system is for videos, but it relies strongly on sound communication in the form of music and speech, and also on graphics, flash animations and written text. YouTube had its breakthrough in 2006, just a year after podcasting had been the new and hot platform. In the early 2000s the portable iPod and mp3 player were introduced, and people’s use of computerized sound soared. The mp3 format had been introduced in the 1990s, before file sharing and portable players, and people could start copying their LPs and CDs to the computer and manage their music collection entirely on this new platform. The first commercially viable platform for sound on the internet was streaming audio, from the early 1990s, and it led traditional FM and AM stations to start streaming their station flow on the web (Simpson 1998).

Making contact with the public
From the 1980s to the late 1990s people in the Western world used modems to connect from their home computers to the internet. After they had turned on the computer, they would painstakingly log on to the internet by calling up the internet service provider (ISP) on a phone number. Now, with broadband connection, the computer is automatically logged onto the internet. But in order to emphasize the live character of the internet, the modem hook-up is a good place to start. The modem hook-up has an acoustics of its own, the strange beeping and whining noises while the modem is trying to synchronize the contact and bring people online. You can hear what it sounds like on track 5.

Track 5: Modem Sounds (0:27).

This is a good example of what I call equipment acoustics. The sound does not have an explicit purpose, but it nevertheless means a lot. It resembles computer game sound, and it also resembles punching the dials of the telephone and waiting for the connection. We can imagine the absolutely live transmission at the speed of light – 300,000 kilometres per second. The modem sounds have a crucial function but no cultural meaning, while techno sounds by artists such as Kraftwerk and Autechre are equally technological on the surface, but have both melody and rhythm, and are rich in cultural meaning.

John Naughton (1999: 16) argues that the wire snaking out of the back of the machine to the modem has changed computing beyond recognition, because it has transformed the computer into a communicating device. The modem makes it possible for the user to hook up with the public in a broad sense. Remembering that the internet consists entirely of modem and broadband connections, it is clear that these connections make the computer into a live medium. At least technically speaking it has the same temporality as radio, television, telephony, and satellite communication, and can contribute to the public sphere in the same way. When the connection is on, we can dispatch messages and read incoming mail, news, etc., but when it is off we have no contact with the wider public, and we must repeat the material we have already stored on our computers.

The fact that the internet is a live medium is an important feature of its public success. The domestic user can monitor public events as they unfold, and the internet also makes it possible for users to cultivate a strictly personal circle of communication, for example on Facebook. The social value of instant connection with others is great, and most people are curious to know what their communication companions have been up to since the last time they were in contact. They can keep track of developments in their field of interest month after month and year after year.

Bolter and Grusin (1999: 197) argue that, although almost everything changes on the internet, one thing remains the same, and that is ‘the promise of immediacy through the flexibility and liveness of the Web’s networked communication’. The stability of live interaction on the internet strengthens it tremendously as a social technology. Online interactive communities can gather from anywhere in the world and engage in what they presume to be a stable collectivity. The benefits of global connection were pointed out by Joseph Licklider et al. in 1968: ‘They will be communities not of common location, but of common interest. In each field, the overall community of interest will be large enough to support a comprehensive system of field-oriented programs and data’ (quoted in Rheingold [1985] 2000: 219-20). For example, the community of music lovers is large enough to support a great range of sub-communities, such as the fan sites for particular artists and sites dedicated to specific musical styles.

Outburst from Blunty3000
On the internet listeners and producers seem to consist of the same kind of people, instead of being neatly differentiated as active producers and passive listeners. If you have something to say in public, whether for personal or political reasons, the internet is at your fingertips. User-publishing sites present the user with more opportunities for expressing themselves in private and in public, and at lower cost of doing so than before. You can, for example, produce a video with software bought at the local computer store, and broadcast yourself on a user-generated website such as YouTube. There is also a general tendency for radio and television stations to capitalize on this engagement through reality TV with interactive websites, and email- and SMS-driven television (Livingstone 1999; Siapera 2004; Hill 2005; Enli 2007).

Youths with laptops at school. Illustration: Atle Skorstad.

The first case study in this chapter is a video published on YouTube by a man in Tasmania. This is an example of public expression without editorial screening, something that was almost unheard of thirty years ago. Nobody else can stop you from publishing your stuff, although if it is considered harmful by the providers it will soon be removed from the site. Blunty3000 posted a video commentary on his YouTube area in April 2007. He sits in front of his webcam and argues vehemently that people who wear headphones in public should be left in peace, but all he does for visual effects is wave his arms and hold up a pair of headphones. Therefore the recording is fully comprehensible without the visual feed.

Track 6: YouTube: Blunty3000, 2007 (1:06).

You know, when you see someone looking like this in public, you see a person like me – remarkably like me, more like, maybe look, maybe look exactly like me – and wearing these things on my ears, and I’m in public, and I’m in a store, and I’m looking at a gang cover or something like that, or I’m in a frigging elevator. I mean, that’s the international sign for ‘I’m isolating myself from the rest of the world so I don’t have to talk to you.’ I want myself away, I’m listening to a podcast, or I’m listening to music or I’m listening to the sounds of baby seals being clubbed to death because it makes me giggle! Whatever! I’ve got headphones on, don’t bother me; don’t talk to me. You know I can’t hear you; I’ve got frigging headphones on. These things – if you see a person in the street wearing these things – consider them a cloak of invisibility. Don’t talk to that person, don’t approach that person!

Blunty3000 presents a tirade against people who disturb headphone users. He is loud and rude and sarcastic, and sounds like the internet version of a cowboy, shooting at what he wants to shoot at, and abiding by his own laws. Blunty3000’s behaviour resembles that of a stand-up comedian who plays a carefully rehearsed role, and indeed his exaggerated frustration and shouts suggest that this is a rehearsed monologue. In this sense it is quite professional after all. Five thousand visitors had heard this recording by the end of 2007, and some of them may even have thought about what he said afterwards.

Technically, it is a very simple production. A high-quality microphone picks up his voice in a domestic room, and it is recorded with audio software. In all likelihood this recording is pre-produced, which means that Blunty3000 performed his tirade in one take, and uploaded it to his space on YouTube without any editing other than starting and stopping the tape. His behaviour to the microphone is quite personal, at least in the sense that the indignation is his own. He does not purport to speak for anybody but himself, and listeners can hold him personally responsible for his words and actions. Although formally it is YouTube which publishes the material, people who listen to this performance will relate to Blunty3000 as the editor, journalist and technician, all in one person. Blunty3000 has published a number of monologues on YouTube, and his confrontational style has earned him a long list of derogatory comments from other YouTube users.

YouTube demonstrates that people have acquired techniques for public expression that were previously restricted to professionals. In particular, people are learning how to present themselves effectively in a public setting, using microphones, cameras and editing software to great effect. Thousands of people are in the process of developing rhetorical techniques that may in effect become new media, since journalists and other media professionals will ultimately adopt many of them for programme purposes. This parasitic activity is demonstrated well by news websites that now put great value on discussion forums, personal video submissions and photographs taken by ordinary people with camera phones.

The crucial novelty that makes people into journalists or, better, micro-editorial production units is the easy access to what is in principle a public sphere. It is very easy to publish your stuff on the internet. Maybe nobody bothers to listen, but you can in any case make yourself available.

Premature publishing
Young people in the 2000s are media savvy. They grow up with expressive interfaces such as microphones and cameras, not just loudspeakers and screens. Amateur publishing can also be found among music lovers, and many young people meddle with recording equipment, where they create more or less attention-worthy music. This can also consist of mash-ups, where people sample and edit works by other artists, and modify their original intentions for their own humorous or artistic purposes. If you are making a home movie you can import your favourite recorded music into the software and edit it to become a nice-sounding soundtrack. This craft has little or nothing to do with professional recording qualities.

On Acidplanet, MySpace and other user-generated content sites, hundreds and thousands of files are accessible at a click. Most of these files can be thought of as demos, although professional artists use MySpace in particular as an advertising medium for their music. In the analogue era a demo tape was something that aspiring artists brought to a record company, and everybody knew it was of poor quality and would be re-recorded in a professional way if the record company was interested. Clearly the process of publishing music through file sharing and websites is parallel to the analogue demo-tape process, except that it is easier to produce the music with high quality and possible to publish it on a global level. Upload it to the internet, and it is there for everyone in the world to hear, in principle.

Steve Jones (2000: 217) argues that ‘recording sound matters less and less, and distributing it matters more and more’ (see also Jones 2002). The next case study involves a teenager who makes a music recording at home and distributes it on Acidplanet. He composes a melody and lyrics, and he invites a group of friends to accompany him. This has been a typical teenage thing to do ever since Bob Dylan first inspired youngsters to write their own material in the early 1960s. The band is called God vs. the Internet, and they sing a song with religious undertones.

Track 7: Acidplanet: God vs. the Internet, 2005 (0:48).

Yeah. Here we go.
May the circle be unbroken,
My bottle bye and bye,
I found Jesus taking all my troubles,
Now he’ll walk you side by side.
And I feel alright.
Together, together.
I feel just fine.
Oh Sweet Jesus.
Ha, ha, ha.


Again the production technique is very simple. Several microphones are rigged to pick up several sound sources in a controlled studio environment. There are perhaps four persons performing. A man sings vocals, somebody plays acoustic guitar, there is a synthesizer (perhaps overdubbed), a trumpet and a tambourine. The acoustic architecture is simple: it has slight reverberation that resembles a domestic room, like a den, a teenage bedroom, or perhaps an office. There are no professional production values in this recording, no real balancing or mixing, and very bad audio quality.

Nevertheless, the melody of ‘May the Circle be Unbroken’ is beautiful, and the trumpet sounds especially vulnerable. There is a certain helpless charm to the song. It seems like the band addresses other young people, who are presumably more relaxed and optimistic than the adults, and the combination of ironic distance and sincerity in the lyrics might indeed appeal to teenagers. At heart this song illustrates the amateur enthusiasm that finds regular expression on all kinds of user-generated sites. But with respect, the threshold for publication is low in this case.

Notice that on the web listeners can often talk back to the producers. In October 2006 the following comment was posted on God vs. the Internet’s area on Acidplanet by a person called Paul D. Richardson: ‘Sorry if this is harsh, but that was the worst thing I’ve ever heard. I had to turn my speakers right up just to hear it, and when I did what I heard would have made the blues masters roll in the grave.’

The BBC’s acoustic authority
Now there will be a stark contrast to amateur journalism and amateur music. The professional productions of broadcasters and the music industry have been under siege by the internet since the early 1990s (there are many analyses of this convergence; see for example Lowe and Jauert 2005; Leandros 2006; Kretschmer et al. 2001). Most notably, their traditional forms are being challenged by the amateur practices I have just described. The problem is that the internet’s platforms are radically more interactive than the recording and broadcasting media, and a hundred years of asymmetrical cultural techniques have to be redirected.

Can public service broadcasting still offer something that is exclusive? It seems that truly professional sound journalism is the only thing that public broadcasting services still serve up as an exclusive product. For many decades public service broadcasting was the hallmark of quality journalism in Western countries. And when it comes to sound media, the stamp of quality was the compact news bulletin, investigative reportage, dramatic documentary programmes, and not least expensive programme formats such as radio plays (which you will never find on the internet other than those from radio stations). High-end radio production gave public service broadcasting an authoritative presence in the public life of the West, and it continues on the internet (Jauert and Lowe 2005).

If you enter the BBC’s huge internet portal looking for high-quality radio programmes, you will be satisfied. The next case study is from BBC Radio 4, which brands itself as ‘intelligent speech’. It is a thirty-minute science programme called ‘Acoustic Shadows’, which deals with the varied art of acoustic design. The blurb on the website reads: ‘From the most reverberant room in the world to a chamber where sounds die the moment they almost leave your mouth, Robert Sandall takes a journey into the world of acoustics – its origins, its people and some of its amazing soundscapes.’ When it comes to production values this programme is dramatically different from Blunty3000 and God vs. the Internet.

Track 8: BBC Radio Four: Acoustic Shadows, 2004 (2:02).

[Recordings of concert hall acoustics]
[Car door slams] – Okay, we’re pretty near the Indian Hill site now, we’ve driven about two hours from San Diego, we’re out in the middle of the desert. It’s hot. Watch out for rattlesnakes!
[Acoustic guitar – Ry Cooder style]
– Steve Waller is a sound explorer in a literal sense. His field of research is the rapidly growing one of acoustic archaeology, which involves him travelling the world studying the connection between ancient rock and cave art and acoustics. Before they got the paints out Steve believes our ancestors selected the sites they wanted to decorate for their potential as natural echo chambers, a clear case of sound before vision.
– We’re standing at the bottom of this mountain that’s made out of basically house-size boulders. And in it is a fire-blackened cave that has Indian pictographs, which are basically painted rock art.
– Steve, we’re heading for that small opening a hundred feet further up?
– Oh, yeah, it’s one spot which they chose to decorate. That is where the echo’s coming from. [Dahh!] It’s as if the sound is coming right out of the mouth of that cave. If you think back to when the ancient peoples thought that echoes were due to spirits speaking back, you can see that it’s as if the rock is speaking to you, as if voices are calling out of the rock, and they’re calling out right from that place on the side of the hill where they chose to decorate. [Dahh!]


The programme is concerned with acoustic archaeology, and the reporter Robert Sandall has made recordings of indoors and outdoors acoustics which he uses rhetorically to demonstrate what acoustic archaeology is about. The listeners can sense the rocks and cliffs that the speakers walk between, and the reverberant ‘dahh!’ informs us just as efficiently about the topic as the speech by the two men.

The programme is post-produced, with careful editing together of three different types of sounds: the voice-over and outdoors speech; the environmental sounds of cars, walking, shouting in a reverberant space; and the guitar music. Seventy years of competence-building in radio journalism lies behind this reportage, and we can hear the accumulated skills of creating a seamless, well-dramatized entity out of a series of raw materials (see for example Herbert 2000: 193ff.; and McLeish 1999: 257).

When it comes to the protagonists’ way of speaking this reportage also conforms perfectly to the demands of classical radio journalism. The BBC journalist reads from a well-prepared script, and in this type of address the speaker is expected to function as a skillful animator of the facts and explanations contained in the script. Although the reading should be vivid and lively, there should be as few traces of the journalist’s personality as possible. In contrast, the interviewee should sound as if he is improvising his speech in a personal and intimate way, since he is after all not a professional journalist. But still, the behaviour of the interviewee should be harmonized with the journalist’s speech. At the end of the excerpt, when Waller says that it is as if ‘voices are calling out of the rock’, he speaks with exactly the kind of enthusiasm that the journalist needs to complement his own reading parts. Both speakers were well aware that what they said at the microphone would be edited before it was put on air, and this made their behaviour relaxed and quite natural-sounding.

There are two professional qualities here that are often lacking in amateur recordings on the internet: the smooth, inaudible editing of multiple strands of sound, and the seemingly effortless and highly informative speech. My point is that high-quality reportage is the hallmark of public service institutions on the internet, while user-published content is made with much less sophisticated production techniques, and would not readily be taken up by public service institutions. This is not a surprising division of labour. The big broadcasting institutions have created professional journalism for over seventy years, and this long-standing tradition keeps journalism from truly resembling the amateur initiatives on the internet.

Public service broadcasting is often seen as a protector of democratic values and as the narrator of personal and social stories with relevance for the citizens (Carpentier 2005: 208; Winston 2005: 251ff.; Williams [1975] 1990: 32ff.). In having such important functions journalism rises above the communication that ordinary citizens can effect between themselves. Journalists work in a well-defined profession with trade unions and interest organizations, and they possess complex expressive skills involving writing, camera work, styles of speaking and moving around, editing, checking sources, complying with ethical guidelines, etc. In a cultural sense public service broadcasting will always consist of one-way communication, with a centralized editorial organization distributing their carefully made product to the masses. The BBC’s website demonstrates that this asymmetrical relationship works fine also on the internet.

Podcast frenzy!
In a matter of a few years from 2005, podcasting has become a standard option for listeners (Levy 2006: 227ff.). By clicking on the ‘subscribe’ button on a website, listeners can regularly receive fresh instalments of their chosen audio or video programmes. I will go into the technical details of podcasting in a later section; here I will attend to the production values of podcasting.

The next case study is from a podcast-only service called TWIT (the acronym stands for ‘This Week in Technology’). The company produces its own original podcasts, and this makes it different from many of the podcasting services which distribute standard radio or television programmes on just another platform. I have selected a podcast that TWIT made about an event called the Podcast Expo in California 2006; it was made during the buzz and expectations of the big conference, and everybody is wandering around, testing, buying and selling podcast products.

There is an upbeat rock jingle at the beginning, and professional voices read the headings in a neutral style, sounding mechanical in much the same way as the voices that say ‘Mind the gap’ at underground stations. ‘Netcasts you love’ (a man). ‘From people you trust’ (a woman). ‘This is TWIT’ (a man-woman duet). This soulless but informative way of speaking is a classical feature of American-style broadcasting. There are also several sponsored messages and when the actual programme begins it resembles talk radio quite a lot. The similarity to the production values of American commercial radio is quite striking.

Track 9: This Week in Tech: Podcast Expo, 2006 (2:31).

– And we’re live at Podcast Expo [– Yoohoo], I could say Netcast Expo, Doug Kay is here from IT Conversations, he’s gonna hand his tiara off to me a little later on. [– Absolutely] Last year’s podcast person of the year. And if you should fail to, it could, succeed in your duties, you’ll be, I’ll be runner up [– Okay, thank you, yes] – Actually, Doug has a big announcement, so we’ll get to that in just a second. Also with me Steve Gibson, another TWIT from Security Now, and sitting next to Steve Gibson, the great Scott Warren from Mac Great Weekly in the Eyelifezone [– Hi everybody], he’s also a great aperture expert, and what we’re gonna do today is talk to a lot of podcasters, as many as we can in half an hour before they take this stage away from us. Podcast Expo is, this is the gathering of the tribes for podcasting, the second year we’ve done it, about 25 hundred people, we’ve taken over a small part of the Ontario Convention Center, a lot of booths showing podcast software, podcast hardware. Broadcasters General Store is here and thanks to them we’ve got audio on this podcast, they let us [– Which is really handy for a podcast]. It certainly makes a difference, they let us their Alesis Multimix 8 Firewire.

TWIT produces mainly live-on-tape events, which are cheap and simple to create. In this case there is a row of industry men on stage, they are introduced by the host, and the host talks to them all in the course of the programme. This is an example of resounding studio acoustics. Several microphones are rigged on a stage, and the performance takes place in front of a live audience. The programme is mixed to pick up the performers and the audience reactions, and also to convey the size of the hall and its atmospherics. It all sounds authentically like the Ontario Convention Center in Los Angeles.

Podcasting is not a live medium like web radio, and among other things this implies that the raw material for a podcast can be heavily edited before it is launched to the public. Since the producers are well aware of this while taping the show, the mood of podcast programmes can be more relaxed and happy-go-lucky than traditional radio. But there are lots of similarities with radio, as I have already suggested. For example, podcasting mainly communicates in the form of twenty- to thirty-minute programmes, which happens to be the typical length of a traditional radio programme. Some podcasters make three- or four-minute installments, which is the typical length of a radio commentary piece.

Listening to the message of the TWIT podcast, I have to say that it sounds quite partisan. By reporting so enthusiastically from the Podcast Expo, TWIT promotes its own platform every minute of the way. The speakers want recognition of podcasting as a medium, and this podcast is a good example of the intimate connection between equipment manufacturers and editorial production. This is not critical journalism; it is an expression of a common interest in expanding the market that is quite typical of new internet media. The podcasters try to sell equipment and programmes, and perhaps even establish a new medium with social practices of its own.

The Podcast Expo illustrates the driving forces of technological innovation in the media. The modern mass media are built on competitive lab experiments in the military-industrial complex and commercial enterprises, and the motivation is basically the same in the podcast industry. There is an intense pursuit of better functionality and greater efficiency and more diverse areas of use. Companies such as Microsoft, Apple, Nokia and Google try to create the next killer application, like the iPod was. You can ask who will win the competition, but the truth is that the competition will never end; it will only change into being about something else.

I will end this section on a critical note. It is important to consider that the optimistic moods of advertisers, PR companies and the broadcasting stations may be purposefully unrealistic. In the article ‘The Mythos of the Electronic Revolution’, James Carey and John Quirk argue that there is an idealizing rhetoric embedded in the very fabric of electronic communication, which they call ‘the rhetoric of the electrical sublime’. This is, they say, an ethos ‘that identifies electricity and electrical power, electronics and cybernetics, computers and information with a new birth of community, decentralization, ecological balance, and social harmony’ (Carey and Quirk 1989: 114). In their view technological life includes a clever ideological and commercial staging of roles for people to believe in, where the various appliances are seen as necessary for succeeding in one’s life involvements. Carey and Quirk refer to this as an ethos that goes like this: ‘Everyman a prophet with his own machine to keep him in control’ (ibid.: 117).

The computer sound medium, 2008
Now I will turn to a more precise analysis of the medium I am talking about: the ‘computer with internet’ medium. Regarding sound production, the computer medium is quite symmetrical in a technical sense. It is truly the same equipment that is being used by the important BBC journalists and the amateur musicians. Both of them can, for example, edit and mix their production on their laptop in the evening. Since all parties can in principle publish and distribute messages, it is difficult to say that it is a linear medium with production at one end and reception at the other.

But although the medium is quite symmetrical in access and opportunities, there is a big difference between the professional training of journalists and musicians, on the one hand, and the lack of it among ordinary users, on the other. As I have suggested earlier in the chapter, we can hear this by comparing the BBC’s ‘Acoustic Shadows’ with the shouting monologue by Blunty3000 on YouTube.

Figure 2.2: Model of computer sound media.

Figure 2.2 shows the technical environment of sound communication on the internet. The mixing board at the production end symbolizes the great creative control that producers and journalists still typically have in comparison with listeners. The mp3 player at the reception end symbolizes the flexible ways in which listeners can use the sounds from the internet (and computer). Notice that for both parties almost all the vital functions are contained on the computer in the form of software. The figure shows that there are essentially two platforms: the personal computer and the network between those computers. I have argued that, when it is hooked to the internet, the computer is a medium of live distribution. But the issue is more complex. The computer has a storage and playback feature that always works, whether there is a connection to the internet or not, and this is simply the computer itself. In addition there is the live connection that can be turned on or shut off by the user.

The computer is sometimes called an Überbox, which refers to its tendency to contain all kinds of other media. And, indeed, all the traditional interfaces of sound media are clustered around the computer, for example mixing boards, sound-proof studios, LP turntables, video cameras, microphones and synthesizers, and all these familiar interfaces could have been drawn up in the figure. This clustering is possible because the computer platform is what Alan Turing called a ‘universal computing machine’ (Rheingold [1985] 2000: 45ff.) that can run any arbitrary but well-formed sequence of instructions. It can, for example, encode and decode written messages, or sample and store photos, sound and video. Whatever it is, the computer can process it according to its basic algorithms, which rely on the numbers 0 and 1 in endless combinations. The rich acoustic and visual experiments at the computer are translated into binary signals during production and are converted back to the human realm during playback.
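The translation into binary signals and back can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rate, the 16-bit sample size and the function names are chosen for the example and are not taken from any particular recording system.

```python
import math
import struct

SAMPLE_RATE = 8000     # samples per second (assumed for the example)
MAX_AMPLITUDE = 32767  # largest value of a signed 16-bit sample


def encode_tone(frequency_hz, duration_s):
    """Sample a sine tone and pack it as 16-bit little-endian binary data."""
    count = int(SAMPLE_RATE * duration_s)
    samples = [
        round(MAX_AMPLITUDE * math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE))
        for n in range(count)
    ]
    return struct.pack("<%dh" % count, *samples)


def decode_tone(data):
    """Unpack the binary data back into a list of sample values."""
    count = len(data) // 2
    return list(struct.unpack("<%dh" % count, data))


def as_bits(data, n_bytes=2):
    """Show the first bytes of the data as a string of zeros and ones."""
    return " ".join(format(byte, "08b") for byte in data[:n_bytes])


encoded = encode_tone(440, 0.01)  # ten milliseconds of the note A
decoded = decode_tone(encoded)
print(len(encoded), "bytes; first sample as bits:", as_bits(encoded))
```

The decoded list contains the same numbers that were packed, which is the sense in which the sound survives its passage through zeros and ones; a real playback device would then turn those numbers into a voltage for the loudspeaker.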

The digital signal carrier is so miniaturized and blackboxed compared to analogue signal carriers such as the LP that it almost seems to not exist as a material fact at all (Simpson 1998). It can be contained in all kinds of microchip devices; and it seems completely immaterial apart from the file icons that pop up on the screen when, for example, you put a memory chip in the USB port.

The most striking feature of the double platform of computers and the internet is the web of connections that it offers. It creates access between all parties that have a modem or broadband connection. Remember that the internet is based on the same principles as telephony, except that it does not transfer analogue voices, but digital information. That old network has been in operation in most countries for over a hundred years, and the wires are well established in the home. This feature to some extent explains the rapid rise of the internet. Users can also hook up to internet providers through mains electricity, satellite dishes or cable networks.

The processing capacity of the computer (with internet connection) can be considered a public resource. When you don’t use your processing power it goes to waste since, for every second, hour and day that it lies inactive, it could have been used for some calculating purpose. The processing capacity could have been lent out at night – for example, to scientists who could use it to analyse raw data from radio telescopes that scan outer space for signs of intelligent life. Some people download music and films and podcasts all through the night and stockpile enormous amounts of cultural products that they will never consume in full, but the stockpiling nevertheless serves the purpose of not letting their time-bound information resource go to waste.

I have already discussed the rhetorical potentials of web radio and podcasting. Here I will describe the two new platforms in comparison with traditional broadcast radio, in order to get their special features across as clearly as possible. Web radio and podcasting can be considered sub-platforms in the great multimedium which I call ‘computer with internet connection’. However, since this mother medium has such a stable presence, the sub-platforms can for all practical purposes be treated as self-standing media platforms (just like email software, web browsers and other appliances).

Audio streaming was a groundbreaking live technology for the internet (Priestman 2002). Notice that this is not the same as downloading, since no files are transferred to the computer. The audio streaming player made it possible to listen to the sound while receiving it, and in effect this introduced radio on the internet. As I have already noted, all websites rely on visual guidance for the listener, and web radio is no exception. The listener must search the web to find web radio stations, and the streaming feature is packaged in a rich visual environment of news, schedule information, contact information, and so on. Nevertheless, web radio re-creates that fundamental quality of broadcasting which is called ‘live at the point of transmission’ (Ellis 2000; Hendy 2000), albeit in a telephonic network instead of through terrestrial transmission. Although there is a buffering process between the streaming servers and the listener’s computer that can delay the signal by up to ten seconds, web radio in effect transfers sound in the same temporal manner that FM and AM radio has always done (Priestman 2002: 9). Web radio is a cheap form of publishing and distributing sound, and it has greatly lowered the threshold for establishing new editorial outlets. Services can, for example, be made without large initial investments for small groups scattered around the world (Coyle 2000). The internet has become the main delivery system for thousands of web-only radio operators and an important supplementary platform for practically all radio broadcasters.
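The buffering that separates streaming from downloading can be modelled roughly as follows. The sketch is a toy simulation and not the behaviour of any actual streaming player: time is counted in abstract ticks, one chunk of sound arrives per tick, and the names `stream` and `buffer_chunks` are invented for the example.

```python
from collections import deque


def stream(chunks, buffer_chunks):
    """Play chunks while they arrive, lagging behind by a small buffer.

    Returns a log of (tick, chunk) pairs showing when each chunk is played.
    Nothing is kept once it has been played, unlike a downloaded file.
    """
    buffer = deque()
    log = []
    tick = 0
    for chunk in chunks:          # one chunk arrives from the network per tick
        tick += 1
        buffer.append(chunk)
        if len(buffer) > buffer_chunks:
            log.append((tick, buffer.popleft()))
    while buffer:                 # the transmission has ended; drain the buffer
        tick += 1
        log.append((tick, buffer.popleft()))
    return log


print(stream(["a", "b", "c", "d"], buffer_chunks=2))
```

In this run the chunk that arrives at tick 1 is played at tick 3, so the listener hears the programme with a constant small delay while the transmission is still going on.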

In contrast to web radio, podcasting is not a live medium, because the message has to be completed in every facet before it is published and distributed. There is no streaming process; instead, the file is automatically downloaded to the computer, and it must be actively deleted by the user. Because podcasting relies on downloading, nobody expects the podcast programme to be interrupted, for example, by breaking news. The podcast platform has no readiness to respond to current events, except that of course a new installment can be published sooner than originally planned if some pressing event makes it opportune.

The subscription feature of podcasting is important to notice, since it is quite different from web radio’s live streaming. The listeners get their radio in the mail, so to speak. This is very different from traditional radio, which has always been characterized by the here and now of the public sphere. The listener receives new programmes on a regular basis, for example once a week, and can bring the recordings out into their everyday surroundings and play them back on the iPod (Berry 2006). All the major radio stations offer this service for most of their programmes, and it has led to an increase in listening to talk and information programmes among people who would previously not listen much to the radio. Notice that mobile listening to podcasting contrasts with the way in which people listen to streaming audio, where (in 2008 at least) they are more strictly bound to the stationary computer or the laptop. Typically, the podcast user will listen while doing something else, just like radio programmes and music have always been enjoyed.
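The ‘radio in the mail’ principle can be illustrated with a short Python sketch. It assumes the common arrangement in which a podcast is announced through an RSS feed whose items point to audio files; the chapter itself does not describe the mechanism, and the feed content, the addresses and the function name `new_episodes` are invented for the example.

```python
import xml.etree.ElementTree as ET

# A made-up feed with two episodes; real feeds carry more fields.
FEED = """<rss version="2.0"><channel>
<title>Example Show</title>
<item><title>Episode 1</title><guid>ep-1</guid>
<enclosure url="http://example.invalid/ep1.mp3" type="audio/mpeg" length="1"/></item>
<item><title>Episode 2</title><guid>ep-2</guid>
<enclosure url="http://example.invalid/ep2.mp3" type="audio/mpeg" length="1"/></item>
</channel></rss>"""


def new_episodes(feed_xml, already_downloaded):
    """Return (guid, url) pairs for episodes the subscriber does not yet have."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid")
        enclosure = item.find("enclosure")
        if guid not in already_downloaded and enclosure is not None:
            fresh.append((guid, enclosure.get("url")))
    return fresh


queue = new_episodes(FEED, already_downloaded={"ep-1"})
print(queue)
```

A subscription program would repeat this check on a schedule and fetch whatever is in the queue, which is why the listener receives complete programmes without asking for each one.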

Meet the pirate
File sharing has led to great innovations in the art of listening – innovations that keep us in control of our music (Hacker 2000). Admittedly these innovations have taken place in dubious ways. Peer-to-peer networks have long challenged the music industry by allowing music to be shared without compensation to the rights holders, and without any sales of CDs or other physical media by the record companies. This alternative music industry was introduced with sites such as Napster in the late 1990s and the Pirate Bay in the 2000s (for elaboration, see Alderman 2001; Sterne 2006; Rodman and Vanderdonckt 2006).

Music lovers can make playlists for their Walkman or iPod, just as people made mix tapes to play in the car in the analogue era. People’s playlists typically have a single song focus rather than in-depth attention to whole albums. This also goes for podcasting, where you select your favourite shows instead of listening to a live flow. This reduces the producers’ and artists’ control of the role of the individual items in a larger album context. Playlists can be organized by genre, year of release and many other variables that give the music lover greater control over their act of listening than ever before.

Since there is so much music on offer on the internet, the listener can in principle cherry-pick their music and compile the perfect record collection. Once the music is downloaded to the computer it can be organized in an audio library. These activities have increased what Paddy Scannell calls the ‘personalization of experience’. He argues that the media are ‘something that individuals can now increasingly manage and manipulate themselves through new everyday technologies of self-expression and communication’ (Scannell 2005: 141).

However, the cherry-picking by music lovers is not something new. As the historical part of this book will show, a highly sophisticated culture of listening to records had developed already by the 1930s, and it was strengthened with the arrival of stereo music on LP in the 1960s. This culture lives on among the LP and CD lovers who still listen to a whole album in one concentrated act of listening, and who have a solemn reverence for their albums.

In the context of file sharing on the internet, music lovers can build a really big music collection at low cost. This is what Jacques Attali (1985: 101) calls ‘stockpiling’. People buy more records than they can listen to, and the pirate most definitely downloads more music than he can listen to. The storage capacity of computers in the late 2000s is great, and avid music lovers can have 20,000 to 30,000 songs in their file cabinet. Presuming that an average CD holds approximately fifteen songs, this equals something like 1,300 to 2,000 CDs. Although people basically create their own private record collections, they can of course share them with anonymous others through file sharing software. The music collection is a part of the public domain for as long as the user makes it available.
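The conversion from songs to CDs is simple division, and can be checked with the figures given above. The only assumption is the one already stated in the text, fifteen songs per CD; the function name is invented for the example.

```python
SONGS_PER_CD = 15  # the average assumed in the text


def cd_equivalent(songs):
    """Return how many CDs a collection of this many songs corresponds to."""
    return songs / SONGS_PER_CD


for songs in (20_000, 30_000):
    print(songs, "songs is about", round(cd_equivalent(songs)), "CDs")
```

Twenty thousand songs come to roughly 1,333 CDs and thirty thousand to 2,000, which matches the rounded range of 1,300 to 2,000.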

Some music lovers relate to music less as an expensive and fragile commodity and more as a huge standing resource. Some feel that it is so easy to download music that the actual file is not worth caring too much about. There is no need to build up a personal collection of files if you can serve yourself online at any time, the argument goes.

The search engines of the internet bring sound recordings to hand more easily than ever before in the 130 years of sound media. Amazon and eBay are places to look for CDs to buy, while iTunes and other companies present legal music. Of course, thousands of pirate sites distribute music to anybody for free. The internet functions as a standing reserve of sound, or, more precisely, of references for sound, that the listener can choose to launch or download. Websites such as Allmusic or Rhapsody have plenty of background information about songs and artists, and there are websites specializing in transcriptions of pop lyrics. Compared to the information that can be supplied on a CD cover, the internet has a radical potential for informing the user about the cultural and historical setting of the music they listen to. Very often people search in an open and curious way, limited only by their perseverance in pursuing their interests. Notice that a search can also be conducted through visual cues on a website – pictures of artists, logos for radio stations and other graphical material. Often a quick glance tells you what you need to know.

Sound browsing is also possible. Allmusic provides excerpts of 5.5 million songs which can be chosen from standard browse and search procedures, so people can listen to a thirty-second snippet and evaluate the music before they buy or download it. Notice that this is not really an auditory search, since this would imply that the user enters an excerpt of, for example, a guitar sound, and the search engine would find all the songs with the same type of guitar sound. But the Allmusic monitor after all means that you can steer your search for musical pleasures by attending to the actual music, and not to written recommendations or information that the graphical interface alone can supply.

The computer is a miracle
It is difficult for ordinary people to learn how the computer and internet actually work. You are confident that you could explain it comprehensively if you had the time and money to educate yourself and study it at length, but since that is not possible, or not in your interest, you have to rely on the functionalities by default.

Rather than interrogating it ourselves, we are likely just to accept it all as a functional fact of life. There is a tension in this way of living with things, an uneasy or hesitant or even reluctant acceptance of the great functionalities and increased opportunities. This is trust in technology. It does not come about by conscious thought processes in every instance of contact. On the contrary, it comes about because of withdrawal from explicit thematization (Nyre 2007b). Paddy Scannell describes this notion of habituation in the context of broadcasting:

The language used to describe the invention of radio first and, later, television expressed over and over again a sense of wonder at them as marvelous things, miracles of modern science. Their magic has not vanished. It has simply been absorbed, matter-of-factly, into the fabric of ordinary daily life.

(Scannell 1996: 21)

People don’t think much about the computer’s strangeness, and don’t have time to study it in full technical detail. For most people the incomprehensible sediments into a habit and its complexity vanishes. Alfred Schutz (1970: 247) writes: ‘The miracle of all miracles is that the genuine miracles become to us an everyday occurrence.’


3. Synthetic music – Digital recording in great detail

Synthetic sound is all around. Even grandmothers listen to techno beats and weird sounds of synths, samplers and computers. Two innovations in particular have made these technological sounds possible, namely multitrack editing and the digital generation of sounds. These intricate cultural techniques have been around since the 1970s, but with the computer they have become a mainstream phenomenon (and only then, in my perspective, does a technique become truly interesting).

Three case studies will be presented, all of which in different ways demonstrate the hyper-technological character of modern music. First I will analyse a densely multitracked rock composition by the group haltKarl from 2007 and present two versions from different stages of the production process. Secondly, I analyse a completely synthetic techno beat by Autechre from 1995 where it sounds as if no microphones have been used at all. Finally I will analyse a passionate crooning performance by Beth Gibbons and Portishead from 1994. Her performance is just as human as the human voice can be, despite its being embedded in a dense flow of sampled sounds and drum machine beats.

The synthetic media landscape
Westerners in the 2000s are a media-savvy people. It seems that nothing can surprise us when it comes to hearing and understanding recorded sounds (and images). People can hear sounds from outer space, sounds from inside an ant hill, and sounds from the volcanic inferno of the earth’s core. In the context of visual media, we have seen the insides of a womb and the hurricanes on Jupiter in colour. Impressive though these representations are, they are strictly documentary. People are also used to another type of media experience: animations in sound and images that are created entirely inside the technologies, and which nobody would mistake for something that exists outside the media. When dinosaurs charge down the road in Jurassic Park (1993) there is no doubt that the images are non-documentary, and the same is the case for TV promos, commercials and programme intros that are often highly advanced when it comes to graphical effects. This type of animated reference is often used for entertainment and aesthetic effect, where there is no moral imperative to represent the world in a realistic manner. However, there is heated discussion about ethical problems in digital animation, since now the technical possibilities for manipulation are almost limitless (Kerckhove 1995; Gitlin 2002: 71ff.).

When describing the synthetic production of music I want first to connect to an ongoing cultural process that parallels the visual animations of TV and film. American and European music lovers enjoy the products of an industry with a joint creative history that goes all the way back to the cylinder phonograph in the late nineteenth century. This means that there are well-developed aesthetic sensibilities among the general public, and during the last three decades these sensibilities have shifted towards greater acceptance of technological sounds. Hip-hop, techno music, electronica – whatever trademark is put on the music, it has none of the acoustics of the concert hall, but all of the acoustics of the computer. These techniques of production have become influential on people’s tastes in general, and are used in commercials, on film soundtracks, in computer games, and so on. The music lover has turned towards textures and timbres, and considers them enjoyable in their own right.

Figure 3.1: Timeline of digital recording media.

Figure 3.1 lists the main platforms that are necessary for the creation of synthetic music. There is clearly a thematic overlap with the platforms that were discussed in chapter 2, for example regarding file sharing and mp3, but this chapter focuses more strictly on high-quality music production. Underneath the timeline I have listed visual animation media – computer games and animation formats based on software such as GIF, QuickTime, Shockwave and Flash. These visual animations can be accessed on the internet and played on a computer. It is instructive to start my discussion of synthetic sound production on the background of visual animation. To create the illusion of movement, an image is displayed on the computer screen, then quickly replaced by a new image that is similar to the previous image, but shifted slightly. Each image can be constructed entirely on the computer or be a montage of real footage and animation. This technique is identical to the way in which the illusion of movement is achieved with traditional television and motion pictures, except that every pixel must be plotted by a programmer instead of being automatically registered by a camera lens. In principle, computer sound is designed just like the computer animation of images.
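The principle of animation described here, a sequence of images that each differ slightly from the last, can be shown with a deliberately crude Python sketch. Each frame is a single row of characters rather than a screen of pixels, and the function name `make_frames` is invented for the example.

```python
def make_frames(width, steps):
    """Build a list of text frames in which a '#' moves one place per frame."""
    frames = []
    for position in range(steps):
        row = ["."] * width               # every 'pixel' is plotted explicitly
        row[position % width] = "#"       # the object, shifted slightly each time
        frames.append("".join(row))
    return frames


for frame in make_frames(width=5, steps=3):
    print(frame)
```

Shown in quick succession, the three rows would look like one mark sliding to the right. Synthetic sound can be thought of in the same way: instead of rows of pixels, the programmer plots a series of sample values that the loudspeaker plays back as continuous sound.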

One of the really big changes in music recording happened quite recently. During the 1980s the introduction of the CD started a slow but ultimately complete replacement of analogue equipment. From the 1990s there was file sharing, and people started getting used to music in the form of computer files. The change created fruitful conditions for the development of synthetic techniques.

Audio editing software made it possible to manipulate the sound with graphical interfaces, particularly in the software layout as it appears on the screen. Cut and paste editing was a very important innovation, and it started influencing music production from the early 1990s. Before that it had become widespread in word processing software such as WordPerfect and Word (Manovich 2001).

The MIDI standard for programming of music scores and instructions was the first commercially viable digital technology for synthetic sound creation in the early 1980s. MIDI stands for ‘musical instrument digital interface’, and this protocol enables items of electronic music equipment to communicate and synchronize signals with each other (Chanan 1995: 161, 357). In combination with samples of instrument sounds, a given MIDI composition can be played back in any type of sound, for example a violin, a guitar or falsetto singing. Notice also that computer games such as Pac-Man and other stand-alone devices had their own sound-generating principles before the advent of MIDI.
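What it means for MIDI to carry instructions rather than sound can be seen in the shape of a single message. The sketch below builds the three bytes of a ‘note on’ and a ‘note off’ message as they are laid out in the MIDI 1.0 standard; the function names are invented for the example, and nothing here is sent to an actual instrument.

```python
MIDDLE_C = 60  # MIDI note number for middle C


def note_on(channel, note, velocity):
    """Return the three bytes that tell an instrument to start a note."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not (0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("note and velocity must be 0-127")
    return bytes([0x90 | channel, note, velocity])


def note_off(channel, note):
    """Return the three bytes that tell an instrument to stop a note."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not 0 <= note <= 127:
        raise ValueError("note must be 0-127")
    return bytes([0x80 | channel, note, 0])


message = note_on(channel=0, note=MIDDLE_C, velocity=100)
print(message.hex())
```

The message says which key to press and how hard, but contains no sound at all. Whether the result is a violin, a guitar or a falsetto voice depends entirely on the sample or synthesizer that receives it.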

Hard-working rockers
Before I describe digital recording in detail, I will remind you that recorded music is still a physical thing, despite its fugitive digital existence. Recording ‘turned the performance of music into a material object, something you could hold in your hand, which could be bought and sold’, argues Michael Chanan (1995: 7). In addition to the potential for making money, recording allowed musicians to hear themselves as other people would hear them. They could listen very carefully to several takes before they decided which was the best one, which they then released to the public. Over the decades artists have cultivated this technique, and it is now a completely integral part of record production, as can be seen by the central location of monitor loudspeakers in the control room. The activity can be called analytic listening.

In this section I will attend to the way analytic listening influences the recording process, and my point is that it is crucial to the painstakingly careful construction of sounds, bit by bit, sample by sample, instrument part by instrument part. Musical events are first recorded straight through, and afterwards they are altered and combined with other musical events in the process of editing, to construct the desired musical rhythms, melodies and harmonies. All the time the producers listen carefully to the sound and discuss it among themselves. Layer upon layer can be added or removed according to the producers’ changing creative vision. Even low-budget recording sessions in an attic involve a tremendous modification and manipulation of the musical sounds.
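The adding and removing of layers can be pictured as plain addition of sample values. The following Python sketch is a simplification under stated assumptions: each layer is a short list of whole numbers standing in for recorded samples, and the names `mix` and `muted` are invented for the example rather than taken from any editing software.

```python
def mix(layers, muted=()):
    """Add the layers together sample by sample, skipping muted ones."""
    active = [samples for name, samples in layers.items() if name not in muted]
    length = max((len(samples) for samples in active), default=0)
    result = [0] * length
    for samples in active:
        for index, value in enumerate(samples):
            result[index] += value
    return result


session = {
    "drums": [4, 0, 4, 0],
    "bass": [1, 1, 1, 1],
    "vocals": [0, 2, 2],
}

print(mix(session))                    # all three layers together
print(mix(session, muted={"vocals"}))  # the same take with the vocals removed
```

Because the layers are stored separately and only summed at the end, any of them can be replaced or dropped without touching the others, which is what allows a song to be reworked over months or years.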

Track 10 is from a Norwegian recording in process. The group is called haltKarl, and since it has not yet released any CDs it is completely unknown to the wider public. The band will probably aspire to national fame sometime in the future. This song has actually been under development for around five years, and we hear two versions: the professional version from 2007 and a demo version from 2005.

haltKarl in the studio. Illustration: Atle Skorstad.

Track 10: haltKarl: Almost Gills, 2007 and 2005 (1:51).

Both my lungs are spoken for
Breathing water in through the pores
Hormones might get the best of me
Black men might just have bigger feet
All my girls got the same disease
And all my girls want a better deal
All my girls cry
All my girls go
Under water I can hear them.


The music is in rock style; it sounds rather like heavy metal with beach harmonies, or like a mix of the Red Hot Chili Peppers and the Beach Boys. The three-man band plays guitars, bass, drums and synths, sings solo vocals and harmonies, and uses all kinds of digital effects.

The sound is highly controlled, which is a result of the slow composition method that has been used. It is as far from a live performance as one can get, although it actually sounds much like a live performance. The clarity and simplicity of the sound belies the immense depth of layers in the song and its production process. The band owns a large array of stomp-box effects for the guitar and bass, software plug-ins, outboard synthesizers and effect processors that can modify the signal from any type of source into whatever sound is wanted. haltKarl is exploiting the powerful computer processing capabilities of relatively cheap modern-day computers to the full.

Regarding its production technique, ‘Almost Gills’ has an interesting chronology which I will go into in some detail. The song has been recorded twice, in two completely different versions that nevertheless sound much the same at first hearing. haltKarl is a determined band, and in 2005 its members were satisfied with the melody, lyrics, harmonies and detailed instrument arrangements, but not with the production qualities of the song. With the structure of the song set in stone, they started preparing a new, professional version, where all parts, instrumental as well as vocal, were rerecorded. On the soundtrack we first hear the new but still unmixed version from 2007. It was recorded in a semi-professional studio with a relatively short production period, in contrast to the 2005 version, which has a far less professional sound and was recorded in basements, cabins, living rooms and even bedrooms over a number of years.

There are several reasons for analysing the strategies of an unknown rock group from Bergen. The production techniques used are now so typical as to be almost the industry norm, despite the fact that they are extremely time-consuming. The slow construction of a song, layer by layer, is just as common in rock or hip-hop or techno, among struggling artists at the home computer as well as the big established acts in expensive recording studios such as the Record Plant in Los Angeles. The process of improving the sound can go on for months or years before the artists are satisfied. The album OK Computer by Radiohead (1997) is a case in point. In fact, this construction process is now so widespread that it is the starting point for the cultural perception of music not just among the musicians themselves, but also among ordinary people. In other words, the entire public is well skilled in enjoying sound with these highly complex characteristics.

Techno acoustics
The next case study will relate to acoustic characteristics. Artists design the acoustic properties of their recordings very carefully, and this can be called the acoustic architecture of sound media. In chapter 1 I quoted Ross Snyder ([1966] 1979: 350) referring to musicians as architects of a spatial habitat ‘in which contemporary man will live and move and have his being’.

Acoustic architecture is intuitively understandable as relevant to concert recordings and other cases where an actual space is picked up by microphones. But there are more difficult examples. The electric guitar and the Hammond organ produce sound via internal electronics; so where are these sounds located? It would be strange to say that the sounds of the electric guitar resonate somewhere inside the instrument, as if it were just like a saxophone or a drum. This problem of reference became more pronounced with the synthesizer and its programming of tone-generators. Sounds then became internal to the technology in the simple sense that they did not resonate in a real room and were not picked up by microphones. They were created wholly on the inside of the equipment and sent directly to the tape or the loudspeaker.

The music of the German group Kraftwerk is a good example of at least partly synthetic music. Mark Cunningham (1998: 281) comments on this 1980s techno sound: ‘If there was a shared emphasis among the varied styles thrown into the charts from the early to mid-Eighties it was one of unnatural sounds, created electronically by sequencers, drum machines, synthesizers and the advent of sampling.’ Few if any acoustical instruments are played on Kraftwerk’s albums; the only sounds that are definitely from the world outside digital construction are the voices of the singers.

‘Dael’ (1995) by the British duo Autechre is an example with fully synthetic acoustics. This English band is well known to techno freaks but virtually unknown beyond their hardcore audience.

Track 11: Autechre: Dael, 1995 (1:09).

[No lyrics]

It creates a very precise sound which is ‘written’ in layers of rhythmic elements speeding up and slowing down, with a dark melody behind. There are strange modulations that sound like a zipper being drawn and a repetitive rhythm, and the overall effect is metallic, sharp and cold. None of the sounds are analogue, and there are no vocals or analogue instruments such as a guitar or trumpet. Notice that the analysis of what the band has done to create the sound can only be based on speculation, since there are so many ways to make music with digital software (and MIDI plug-ins). It is really impossible for anyone to establish the production process of this type of music.

This has something to do with the fact that synths and MIDI signals have a purely technological acoustics; and this in turn implies that the recognition of spaces and sounds among music lovers is restricted to their knowledge of synthetic instruments. Notice that MIDI carries instructions written in a computer language that all modern synthesizers, drum machines and other digital processors can understand (Honeybone et al. 1995: 23). The proliferation of MIDI implies that popular music in general has ventured into a synthetic timbre-management. The activities at the mixing board and the computer can now be considered as important as the sounds of traditional musical instruments.

This development has concerned music lovers for several decades already. In 1990 Andrew Goodwin claimed that, with sampling and other digital developments, the authority of the artistic statement has been reduced: ‘The most significant result of the recent innovations in pop production lies in the progressive removal of any immanent criteria for distinguishing between human and automated performance. Associated with this there is of course a crisis of authorship’ (1990: 263). As I have suggested, it is difficult to identify the performers, and to point out what their musical accomplishments actually consist of. Goodwin claimed that sequencing and sampling technologies have cast such doubts upon our knowledge about just who is (or is not) playing what that some bands place comments such as ‘no sequencers’ on album covers to retain their status as ‘real’ musicians (ibid.: 268).

In the late 2000s it is another story altogether. Music lovers are now so used to the synthetic techniques that very few would consider them unreal or inferior to musical sounds picked up with microphones in the traditional way. Indeed, the members of Autechre are not at all afraid that their status as musicians will be compromised by the synthetic way of playing music. On the contrary, their reputation in the techno community is rock solid because of the seriousness with which they approach the art of computer music.

The passionate crooner
A third fundamental feature of recorded sound, along with space and time characteristics, is the personality of the artist. In the old days, with singers such as Edith Piaf and Vera Lynn, there was a strong emotional connection between listeners and artists. But can listeners relate to persons in such an intense way when the music is so marked by synthetics and computerization?

My next case study is a trip-hop song by Portishead from 1994. Portishead is an English band well known to lovers of contemporary Western pop and rock. They were part of the Bristol sound, named after the city in the west of England where such bands as Massive Attack originated. It was characterized by ‘minimalistic arrangements, dub-influenced low-frequency basslines, samples of jazz riffs, keyboard lines or movie soundtracks, and drum loops – “breakbeats” characteristic of hip hop and rap from the ghettos of American cities during the 1970s and 1980s’ (Connell and Gibson 2003: 100). In the song ‘Glory Box’, the lead singer Beth Gibbons makes her voice stand out as particularly human against a background that is particularly technological.

Track 12: Portishead: Glory Box, 1994 (1:03).

I’m so tired, of playing
Playing with this bow and arrow
Gonna give my heart away
Leave it to the other girls to play
For I’ve been a temptress too long
Just …
Give me a reason to love you
Give me a reason to be a woman
I just wanna be a woman.

This is a lush electronic sound. The band creates a dense flow of drums, bass, synths and electric guitars, and there is an LP sound effect that makes it seem as if the music is being played on a turntable. There is also a lingering reference to several other compositions, for example Tchaikovsky’s ‘Swan Lake’ and the 1970s pop song ‘Daydream’ by Franck Pourcel’s Orchestra. This goes to show that, although ‘Glory Box’ may sound very modern and alternative because of its technical aspects, the song follows a long melodic tradition. Notice that ‘Glory Box’ contains a sample from Isaac Hayes’s ‘Ike’s Rap III’ (liner notes on the CD). This is the most direct way in which Portishead refers to other musical works, since a sample is after all a little piece of a recording made by another artist (for implications of sampling, see Chanan 1995: 161 and Goodwin 1990: 258ff.). The listener would have to be quite familiar with Isaac Hayes’s work to recognize the sample, though.

Beth Gibbons sings in a good voice, crooning just like pop artists have done since the 1930s (see chapter 9). As I have already suggested, she manages to sound like an especially vulnerable woman in this stark electronic setting. It is an outstanding case of a craft that appeared in the 1970s, where female singers such as Joni Mitchell and Kate Bush projected gentle human moods against a background of a complex rock production. By adding the LP sound Portishead signals a retrospective acknowledgement of this previous era in pop music history. My conclusion is simple: listeners can relate to singers in the digital age with as much trust as they did before. Listeners are likely to think of Beth Gibbons’s exclamation ‘I just wanna be a woman’ as an expression of her autobiographical frustrations as a woman.

However, a conservative music lover might argue that these hi-tech recordings are not as authentic as the old recordings of Edith Piaf or Louis Armstrong. Singers nowadays are assisted tremendously by voice technologies, such as modulating the voice to be in the right key, while in the past artists always had to be good to sound good. But if artists lose their credibility by manipulating their voices and putting them into a technological context, then there is very little credibility in modern music. Clearly, since the human voice can be modified just as easily as any other sound, it cannot be denied authentic qualities any more than other modified sounds. The alternative would be to suspend one’s trust in absolutely all sounds recorded after approximately 1970, since they have all been manipulated.

Among music lovers the traditional view has been that the human voice is untouchable. In the 1980s there was a certain pop design philosophy, practised for example by the Human League, that allowed all sounds apart from the human voice to be synthetically created (Cunningham 1998: 291).

There is something extra valuable about the untouched voice, it seems. Basically, a voice is personal in the sense that friends and acquaintances can recognize it as belonging to one unique individual among hundreds and thousands of other persons. In this way it points to a name, a set of personal characteristics, etc. To say that a voice is recorded means that people who know the person beforehand will recognize the recorded voice as belonging to that unique person. If the voice sounds alive and rough, with traces of whisky or drugs and the background of a nightclub coming through in the recording, the chances are that the artist will be felt to be authentically mediated. The human voice has an ethos of its own, an emotional, personal appeal that most musicians do not dare to disrupt. This conforms well to traditional ideas of ‘authenticity’ in pop and rock music.

Notice that, strictly speaking, a human voice cannot be synthetically created. With a synthetic voice the sounds have not been generated in the vocal cords and oral cavity of a flesh and blood person, and this remains the case regardless of the fact that the sound may resemble that of a human voice very much, and may even be mistaken for one if the processing is very clever. It still does not have a source reference outside the realm of digital construction.

But although a voice cannot be constructed, it can be modified. An example of this can be found on Radiohead’s Kid A (2000), which contains all kinds of humanoid sounds and blurred transitions, from the obviously human to the obviously synthetic. The increasing use of voice modification technologies makes it interesting to enquire about the limits of the voice’s trace back to a body and a personality. If the producer changes the frequencies, at what point does the sound stop referring to a unique individual? How much reverberation can a voice sustain before it becomes unrecognizable as a trace of a body? Is there a minimum duration for which the voice must be present in the mix in order for the listener to be able to relate to it as tracing a personality?

There are no definite answers to these questions, but there is a definite tendency among music lovers. During the 1990s this type of voice modification entered the mainstream of pop music, and famous artists had huge hits where their voices were heavily modified. In 1998 Cher released ‘Believe’, and took the risk of offending some listeners’ sense of authenticity because of the way her voice was manipulated. Her producer correctly presumed that her 1990s fans would be able to hear her new recording as unproblematically representing the real Cher. In an age of widespread listener sophistication such a strategy is not a great risk to take for a popular artist. Danielsen and Maaso (forthcoming) have analysed Madonna’s ‘Don’t Tell Me’ as a good example of the specifically digital sound of modern music production. Another striking example is provided by the US artist Beck, who varies his vocal style so much that it is hard to ascribe to him an identity based on voice, and he is often associated with a postmodern pop style. On the CD Midnite Vultures (1999) Beck sings in at least three different ways. He has a kind of tough guy funky voice and a slick groovy voice, and also makes use of short bursts of a hoarse screaming voice. All these are obviously processed and overdubbed, and they are sometimes superimposed on each other.

The voice is regularly modified so much that it almost doesn’t refer to a human, but becomes a sound in a strictly material sense. I will rely on Simon Frith to conclude how music lovers relate to this phenomenon. He argues that people engage songs by responding to the materiality of the body while singing: ‘Singing is a physical pleasure, and we enjoy hearing someone sing not because they are expressing something else, not because the voice represents the “person” behind it, but because the voice, as a sound in itself, has an immediate voluptuous appeal’ (1981: 164). This is the bottom line of modern music listening.

The recording medium, 2008
As in chapter 2, I will outline the infrastructure of the medium in question. If the professional studios are taken as the starting point, the recording industry is still based on a highly asymmetrical technical set-up. There are complex sound-proof studios and advanced computer systems that amateur musicians can only dream of using and that most ordinary music lovers do not even know exist. The asymmetry between production and listener techniques in effect makes up two highly advanced, but separate, sound environments. Just as there is high-quality music recording, there is high-quality domestic listening.

Figure 3.2: Model of the digital recording medium.

Figure 3.2 shows that there are two different distribution platforms in the current set-up of the recording medium. The CD distribution system relies on industrial copying of CDs, which are sold through music shops on the high street or through online services such as Amazon and shipped to the buyers in the post. Music lovers can listen to CDs in a range of settings, but the most advanced setting is still the high-quality stereo set with two loudspeakers centrally located in the living room. In addition to CD distribution, there is file transfer on the internet, either legally through services like iTunes or illegally on pirate services. While downloaded music can of course be listened to on the high-quality stereo set, listeners typically download music to an mp3 player and carry it with them in their daily life. This robust distribution platform seems to be taking over from the more vulnerable CD distribution system. Some music lovers still purchase LPs and play them on the old-fashioned turntable, but this platform is not included in the figure.

This chapter mainly revolves around synthetic sound, and this type of sound relies on digital carriers. I have stressed that recorded sound has become perfected to the extent that it need not refer to natural acoustic spaces at all in order to communicate. The track by Autechre demonstrates this in the simple sense that all sources are produced inside the computer, and can therefore be perfectly represented in the computer’s code.

This synthetic quality can be explained by making a historical contrast. In both tape and digital recording the ambient sound is picked up by microphones and converted to electromagnetic impulses. Digital media have one more conversion process, namely the sampling and encoding of the electromagnetic signal into binary digits. This can be done optically by a laser head (CD) or magnetically with a spinning disc (hard disc), or with solid state electronics (memory chips). Not least, a programmer can write a sound signal entirely in code, without any external influences (White 1997).
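The last point, that a sound signal can be written entirely in code, can be illustrated with a minimal sketch in Python. It uses only the standard library, and the choices made here (a 440 Hz sine tone, 16-bit mono samples, a tenth of a second of sound) are assumptions for the sake of illustration, not a description of how any particular artist works.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100  # samples per second, the rate known as 'CD quality'


def sine_samples(freq_hz: float, seconds: float, amplitude: float = 0.5) -> list[int]:
    """Compute 16-bit sample values for a pure tone, with no microphone involved."""
    count = round(SAMPLE_RATE * seconds)
    peak = int(amplitude * 32767)  # 32767 is the largest 16-bit sample value
    return [
        round(peak * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(count)
    ]


def to_wav_bytes(samples: list[int]) -> bytes:
    """Encode the samples as a mono 16-bit WAV file held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # two bytes, i.e. 16 bits, per sample
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


tone = sine_samples(freq_hz=440.0, seconds=0.1)
wav_data = to_wav_bytes(tone)
print(len(tone), "samples,", len(wav_data), "bytes")
```

Every number in the resulting file has been calculated rather than recorded, so the sound has no acoustic source outside the computer.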

In digital recording there is no physical contact between the microphone pickup and the code on the disc, and consequently no mechanical degeneration of audio quality. In comparison, the analogue signals of tape and LPs are like satellites in low orbit, and it is only a matter of time before they are destroyed. When rearranging a sound array on magnetic tape, the tape had to be physically cut or re-recorded to another tape deck. ‘Every time it is copied or displayed, it suffers irreversible damage. Its signs are abraded and come closer to being mere and useless things’ (Borgmann 1999: 167). It was not until the advent of digital coding that sound could be considered ‘perfectly contained’ in a storage medium.

The more often an analogue signal is sampled, the more accurate its digital representation and consequent reproduction will be. The coding standard colloquially referred to as ‘CD quality’ samples each second of analogue signal 44,100 times, and in professional recording there is typically a sampling rate of 96,000 Hz (Thompson and Thompson 2000: 321ff.). The sampling process ensures that ‘there is no discernible difference between the sound recorded in the studio and the signal reproduced on the consumer’s CD system’ (Goodwin 1990: 259). In a technical sense at least there is a stable and ‘perfect’ signal equilibrium between the producing and the listening end. In the late 1980s the digital signal of CDs was marketed with the slogan ‘Perfect Sound Forever’ (Harley 1998: 255).
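The arithmetic behind these sampling rates can be worked through in a short Python sketch. The two rates are the ones mentioned above. The bit depths and the stereo channel count are assumptions added for the example: 16-bit stereo is the usual CD format, while 24 bits is only one common choice in professional studios. The sketch also uses the sampling theorem, according to which a digital signal can represent frequencies up to half its sampling rate.

```python
def highest_frequency_hz(sample_rate: int) -> float:
    """Upper frequency limit of a sampled signal (half the sampling rate)."""
    return sample_rate / 2


def bytes_per_minute(sample_rate: int, bit_depth: int, channels: int) -> int:
    """Uncompressed data volume for one minute of audio."""
    bytes_per_sample = bit_depth // 8
    return sample_rate * bytes_per_sample * channels * 60


# Assumed formats: 16-bit stereo for CD, 24-bit stereo for studio work.
cd = bytes_per_minute(44_100, bit_depth=16, channels=2)
studio = bytes_per_minute(96_000, bit_depth=24, channels=2)

print(highest_frequency_hz(44_100))  # 22050.0
print(highest_frequency_hz(96_000))  # 48000.0
print(cd)                            # 10584000
print(studio)                        # 34560000
```

Under these assumptions a minute of CD audio takes a little over ten million bytes, and the studio format roughly three times as much.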

Sound on the screen
Musicians can create sound on synthesizers, and they are therefore liberated from the restrictions of studio acoustics. But they are also liberated from an enormous amount of wiring and apparatus that were necessary in the analogue era. Indeed, there are such a large number of plug-ins and digital devices that the entire work of production can take place in front of the computer screen. In this creative environment the features of the signal can be sculpted in enormous detail. Brian Eno has described the multitrack studio as a microscope for sound (Cunningham 1998: 334), and this description is highly relevant for contemporary audio software.

To illustrate the functionality of software audio I asked haltKarl to supply a facsimile of the composition ‘Almost Gills’ as it looks when spread out on a computer screen in Cubase. It demonstrates well the contemporary work environment for musicians.

Figure 3.3: Screen shot of composition in audio software.

At the bottom of figure 3.3 we can see the symbols for starting, stopping, pausing and winding back or forward, and for many producers they provide an immediate association with the obsolete interface of the magnetic tape machine. It is a standard strategy in graphical design to make new interfaces resemble the old ones wherever possible. Another example of this is the faders to the top right, which look very much like those on an analogue mixing board. The design also borrows structures from notational systems in acoustics with the curves that display volume and other physical characteristics. Other features that are not shown in figure 3.3 borrow the structure of musical notation systems, and the musician can plot notes directly into the interface.

To the left of the figure we can see that the tracks are organized in a series of horizontal rows (much like tables in Excel), and this is the main tool for multitrack production. Each track is an independent file, and this means, for example, that the volume, the frequency, the speed of reproduction, and all kinds of effects can be manipulated independently for that track.
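The principle of independent tracks can be sketched in a few lines of Python. This is a simplified model and not the way Cubase or any other program is actually built: each track is assumed to be a list of sample values between -1.0 and 1.0, and the only property manipulated per track is its volume. The track names and the function `mix` are hypothetical.

```python
def mix(tracks: dict[str, list[float]], gains: dict[str, float]) -> list[float]:
    """Sum the tracks sample by sample, scaling each one by its own gain."""
    length = max(len(samples) for samples in tracks.values())
    output = [0.0] * length
    for name, samples in tracks.items():
        gain = gains.get(name, 1.0)  # tracks without a setting are left unchanged
        for i, value in enumerate(samples):
            output[i] += gain * value
    # Keep the result inside the valid range to avoid digital clipping.
    return [max(-1.0, min(1.0, value)) for value in output]


tracks = {
    "drums": [0.5, 0.5, -0.5, -0.5],
    "bass": [0.25, -0.25, 0.25, -0.25],
}

# Halve the volume of the drums without touching the bass.
print(mix(tracks, gains={"drums": 0.5}))  # [0.5, 0.0, 0.0, -0.5]
```

Because every track keeps its own data and its own settings until the final sum, one of them can be changed at any stage without affecting the others.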

During the multitrack work process the software is alive with animation. The tracks that are selected for visual presentation will flow through the screen from left to right. In audio software there are many features that depend on the animated representation of sound in the graphical display. One of the most striking is the fact that each track can be zoomed into in great detail and processed accordingly. Each sound signal can be rearranged so that the chronology changes, the pitch and volume can be altered, and the recording speed can be slowed down or speeded up. All this can be done in minute detail. For each track the sound signal can be manipulated at the level of minutes, seconds, tenths of seconds, hundredths of seconds, or milliseconds.

My point is that digital sound production is anchored in computer graphics. The visual guidance has a radically creative influence, since the complex doctoring of sound in contemporary music could not have been achieved without it. The coordinate system and graphs that can be displayed on the screen have greater analytic detail than even the most skilled hands at the knobs of the tape recording machine could achieve. The combination of acute listening to the sounds and visual guidance on the screen becomes a powerful tool.

The mixing board is for many purposes embedded in this graphical interface. It has been an instrument in the recording process for several decades, and as I have stressed it has become an instrument almost in the same sense as the guitar or the piano. The mixing board is a blender, where any sound source can be channeled to an appropriate destination and all the signal streams can be manipulated independently of each other. For decades the mixing board forced technicians and musicians to develop skills in hand-ear coordination in order to work with sound. The mixing board was a huge electronic component, with hundreds of knobs and levers (see the cover of this book, where a Midas XL3 is depicted). Developments in the computer industry have greatly reduced the focus on such manual labour and increased the emphasis on hand-eye micro-movements.

Among producers there is now an expectation that they can do anything with sound on the computer. Paul Théberge says that the ‘infinite malleability of the sequenced data mitigates against the idea of a single, finished product; indeed, many musicians complain that in the studio it is difficult to know when to stop working and rearranging the sequenced material’ (Théberge 1997: 229). Since it is easy to make radical alterations even in the later stages of production, there is more decision-making than before, and recording becomes a question of initiative and the lack thereof. ‘When faced with a stunning array of possibilities, it becomes difficult to determine exactly what decision to make, what choice is the right one’, Steve Jones (1992: 153) observes.

Perfect sound
Finally I will focus on the way that music is consumed in domestic settings. Music lovers create a sweet spot in their living room by placing the loudspeakers so that they point to the same point in the room, and when they have adjusted the volume, bass and treble they lean back to enjoy the music. There is great public concern with the fidelity of sound, and it seems that CDs and the stereo system are still considered the best platform for sound reproduction. Poor sound is only a reflection of poor equipment. Less than perfect sound is somehow a result of one’s economic situation and other factors completely external to the CD medium.

But sound perfection only takes us so far. Imagine that you have just rented a car, and have to drive for a really long time. In the glove compartment you find a cassette tape by Bruce Springsteen, and you put it on knowing well that the old cassette played on a cheap car stereo will not sound anything like your CD system. But you really like Bruce Springsteen, and you are delighted at this happy coincidence. The experience of recorded music as such is more important than any individual platform for it. My point is that, when digital recording arrived, music lovers were already satisfied, and some people consider the warm and round sound of 1950s and 1960s vacuum tube amplifiers to be superior to current CD equipment, which is often considered cold and sharp-edged.

Platforms that are not related to the computer suffer a decline, and become increasingly obsolete among the new generations of listeners. The sounds created by Frank Zappa on magnetic tape with LP distribution are only present to us with their sound signals, and not with the manual techniques that were used to record and play them. If they are available only as mp3 files on the computer, they must be handled with the techniques of mp3 files on the computer. The listener does not crank up the mechanical gramophone to listen to Enrico Caruso on the internet. In this way the production and listening techniques of obsolete platforms slowly disappear. Even the CD player is fast becoming obsolete because of file transfer on the internet, although the music industry is trying to counteract this development.

On the other hand, there is no difference in the strictly auditory experience of recorded music. Let us say you download the Frank Zappa music as an mp3 file, and then you play it back through the stereo set that you got for Christmas in 1994. Although the storage and distribution platform has changed, the auditory experience in front of the loudspeaker remains more or less unchanged. The acoustic drama has been greater in earlier times, for example when stereo amplification replaced mono sound in the 1960s.


4. The mobile public – Journalism for urban navigators

Sound is very useful for mobile communication, and it is more useful than live images if you are driving a car or navigating in other ways. Urban dwellers in particular listen to the radio while they travel around the city. And there are professional news services that cater to this mobile public by constantly presenting updated information about relevant topics such as weather, traffic and public events in the city.

This chapter is concerned with the documentary realism in the programming of news stations. Their entire output is based on the implicit claim that this information can be trusted – that the navigators can continue their travel and rely on the station to present true and relevant information about crucial issues. To accomplish this big task, news stations have an efficient, centralized newsroom that is run according to strict journalistic procedures. Many people work in concert to create a constantly updated news service and a good flow in the programme. The case study for this chapter involves the news station 1010 WINS in New York City, a station that has been run according to an all-news format since the 1960s and which is one of the oldest in the world of its type. I will analyse the main news and its distinctive sound as it appeared in the year 2001.

The mobile media landscape
Before I go into this type of broadcasting in detail, I want to sketch the wider perspective. Modern life in the West is full of urban navigation. People are walking, cycling, driving a car, taking the subway, going by train, going by boat, and flying aeroplanes, and we are all going somewhere in particular. As individuals move around the city, village or countryside they can only rely on their own skills of navigation. And information that media can supply is often very useful during these acts of navigation. Wireless communication is particularly interesting as a provider of relevant information for individuals.

Figure 4.1: Timeline of wireless media.

Figure 4.1 shows that there is a wide array of wireless news media on the market. This chapter will focus on services that can be distributed on the platforms drawn in black: AM/FM radio, DAB radio and satellite radio. Although I will analyse radio journalism’s contribution, I will first describe the wider landscape in some detail.

Below the arrow I have listed a series of five media that have the potential to become potent news providers, or that have already been potent for many years, such as CNN. The advanced wireless technologies such as satellite GPS navigation are at the pinnacle of a hundred years’ worth of innovation in the transmission of signals. Although GPS has enormous editorial potential, it has not been properly exploited yet, and it has unresolved issues about protection of privacy. I will discuss the potential of GPS journalism briefly at the end of the chapter.

The good old terrestrial radio stations still dominate the mobile media when it comes to news and current affairs, basically because the free and simple reception of the signal is a continuing attraction for listeners. However, subscription-based satellite radio has been available in the USA since 2001. It is used mainly for nationally syndicated programmes, and less for the kind of all-news services that I will analyse here. DAB radio was introduced in Europe at about the same time as satellite radio in the USA, mainly as a higher quality platform for traditional content (Lax et al. 2008). Notice, however, that both satellite and DAB radio can transmit meta-data and short messages along with the sound, and, for example, music information, traffic updates and news feeds can be piped through to the small text display screen.

The mobile phone is perhaps the most versatile tool when it comes to navigation, since it is a personal medium which few people lend out to others, and the user therefore always has it close at hand. You can contact other people in a crisis, or you can browse through websites on WAP. A WAP browser provides the same basic service as a web browser, but the interface is simplified to operate within the restrictions of a mobile phone, for example the small screen and low bandwidth.

Online news sites are quite prominent in the mobile media industry in the 2000s. A website can present text and graphics of high practical value, for example maps, directions, alerts, and so on. But although the user can read websites on laptops in wireless hotspots while travelling, news sites on the internet are truly mobile only if the user has a broadband-equipped telephone serviced by a high-speed mobile network. In many Western countries there are such services which give you the entire internet on your mobile phone. Clearly these digital services can help you to navigate more efficiently through a big city and to socialize with other people in new ways (Rheingold 2002).

When considering news services it is important to mention all-news stations on television, since they are the strongest competitor for radio news. While travelling, people typically listen to the radio, but when they get home they immediately turn on the TV and forget about the radio until they leave the house again. The most famous all-news network in the world is probably CNN, which was started in the early 1980s but had its real international breakthrough during the Gulf War in 1991 (Gitlin 2002: 171). Later examples are BBC World, MSNBC and Bloomberg Television. But the idea of really quick updating of news was first established during World War II. Public demand for new information was a strong influence, and it forced the broadcasting stations both to have hourly bulletins and to become more professional at handling the news (Crisell 1997: 56). In the latter half of the twentieth century the process towards more frequent updating culminated with the 24-hour news format. In a time-bound medium this is the maximum limit.

All news all the time
1010 WINS is a 24-hour all-news radio station for the greater metropolitan area of New York. It compiles journalistic information about the following thirteen counties: Manhattan, Brooklyn, Queens, Bronx, Staten Island, Nassau County, Suffolk County, Fairfield County, Bergen County, Essex County, Long Island, Westchester County and Rockland County. This list alone tells us something about the level of detail in the station’s journalism.

Before going into detail about journalistic production, I will describe the sheer sound of this type of radio. It ticks along like a clock through the day. Every full hour there is a high-pitched beep to signal the time, and then comes a brass fanfare over a Morse code-like percussive rhythm. A male voice says: ‘All news all the time. This is 1010 WINS. You give us twenty-two minutes, we’ll give you the world.’ The ticking sound continues as the news reading commences. Notice that five different journalists speak during the first minute.

Track 13: 1010 WINS: Top of the Hour, 2001 (2:24).

– Good morning, 67 degrees at 7 o’clock, it is Wednesday May 2nd, I’m Lee Harris, and here’s what’s happening. Con-Ed reportedly wants to fire up the city’s most polluting power plant to deal with the energy crunch, but environmentalists say that move is illegal.
– Say it ain’t so, Joe Namath. Would the Jets really move to LA? This is John Montone, I’ll have fan reaction from Carlstad.
[Back to Lee Harris]
– Police and prosecutors are looking into the fatal shooting of an unarmed Newark man by an Irvington police officer who killed another man during a traffic stop four years ago. Super model Nicky Taylor remains in critical condition with severe liver damage following a car crash in Atlanta. And you probably don’t know any of them, but we are told millions of Americans are taking up yoga to deal with all the stress in their lives.
– This is Accuweather meteorologist Dr Joe Sobel. Sunshine will help temperatures get to near ninety today and tomorrow. Could be record highs.
– This is Steve Torre. The Devils win at overtime in Toronto, to grab a two-one lead in their second round series. The Yankees behind. Mike Michina shot out the twins. The Mets rally to beat the Astros at Shea.
– I’m Ron Emmeda, and Rupert Murdoch moves closer to buying Hughes Electronics from GM. And April was not a kind month for the nation’s auto makers.
[Back to Lee Harris]
– WINS Newstime 7.01. [A synth marks the time, and a keyboard rhythm ensues] Traffic and Transit on the one’s now. Sponsored by New Jersey Transit. Here’s Pete Taunello.

The excerpt ends with a long traffic report that I have not transcribed here, but it will be dealt with in detail in the next section. The most important thing to notice is the lively sound of the news flow. There is a clear rhetorical strategy behind it. The output is typically heard on car radios with poor reception, on clock radios in the bedroom, and on cheap mono radios in the kitchen. The station caters explicitly to this acoustic context, processing the presentation to sound crisp and intelligible, but at the same time coarse, solid and sturdy. The news sound is kept up hour by hour, and year by year, and only the commercials create breaks in this endless flow of news. The characteristic sound means that regular listeners can easily find the station on the AM band.

The work environment in news stations is very different from that of recording studios. Broadcast studio environments are busier and more stressful for the producers and performers alike (see Herbert 2000: 22ff.). Instead of having all the time in the world to edit and mix the sound, they are organizing a never-ending live transmission. The idea of a second take is absurd, as it has always been in live radio.

The all-news format is not built to create sustained listening throughout the day but to create frequent revisiting whenever people might need an update (Crisell 1994; Shingler and Wieringa 1998; McLeish 1999). 1010 WINS typically completes a full circle of news, traffic and weather in around twenty minutes. All-news stations present themselves as a standing reserve of updated, relevant, fresh information in a designated area, building their credibility on catering to the information needs and interests of their listeners at any time.

Since the format is trained towards news stories with short-term relevance, there is a sense of urgency and immediacy to the programming. This could be called practically live programming. The output has been scripted and read during the last ten to twenty minutes, and that is sufficient for a sense of real-time progression to be incurred.

Stations that choose to adopt this format are typically based in big cities, where the population is dense enough for it to be profitable to set up such an editorial service, and where the surroundings are actually complex enough that there is a real need for guidance among the inhabitants. The stations can draw on strong feelings of local community among their listeners and they can also cultivate this sense of community in their programmes. ‘People’s feelings about community, about territory, work and weekends, roads and traffic, memory and play, and what might be happening across town’ are seized by radio so that it can ‘map our symbolic and social environment’ (Jody Berland, quoted in Hendy 2000: 188).

The message is very important on news stations. It might involve a suspension of the alternate parking rule, the name of the newly elected mayor, or a terrorist attack; and in no case is the sound produced as a background feature. This is different from much of the contemporary radio and music output, which is often produced as a comfortable living space for listeners. The practical implications of the information form the main issue for listeners, and consequently news production consists of detecting events that take place in the public sphere, describing them according to the relevance for the listeners, and radiating them back to the public sphere. The journalists talk about the features of the city in a way that presumes their listeners are familiar with its outline, and the common reference created in this way can metaphorically be called a map of the city. The suburbs are distinguished from the city centre; the seashore, rivers, bridges and roads are outlined, and within these implicit geographical borders the political and commercial organization of the city is followed up carefully.

Figure 4.2: A map of New York

Get mobile today
The car is an extension of the person’s ability to walk, or, as McLuhan spectacularly states: ‘it is an extension of man that turns the rider into a superman’ ([1964] 1994: 221). The car is one of the few settings in which radio can get the listener’s full attention. A British driver says: ‘My best moments of radio happen in the car because I am stuck with it. Television and printed materials are out of the question and, unless I want to listen to tapes, radio becomes the central focus of my attention’ (quoted in Shingler and Wieringa 1998: 111). When driving the listener is open to new information about the geographical area he moves around in.

The car listener is a splendid example of an ‘urban navigator’ and will be my main protagonist in the next sections. A navigator is simply a pilot or a guide. This is a person who is in transit from one place to another and checks the coordinates of an unknown or at least a confusing area by the use of instruments or memory, and he either carries other people with him or travels alone. The navigator’s destination varies greatly through the day and the week, but some routes are driven very often, such as the route to work.

I will describe a scenario where a car driver who lives on Staten Island is about to go to work in Brooklyn. The navigator leaves home in a Staten Island suburb at 7 a.m. and turns on 1010 WINS. A slow drive to downtown Brooklyn lies ahead, first onto the Staten Island Expressway and then across the Verrazano-Narrows Bridge to Brooklyn. After crossing the bridge, if there isn’t too much traffic, he can drive on the Gowanus Expressway all the way to downtown Brooklyn. During the morning rush hour this stretch of 20 to 25 kilometres will take approximately one hour.

People driving from Staten Island towards Brooklyn around 7 a.m. on Wednesdays will move slowly, especially on the Gowanus Expressway. On 1010 WINS Pete Tauriello says: ‘Bumper to bumper delays in Brooklyn, Lee, on the Gowanus from the Belt right up to 38th Street, where we have a box truck broken down, blocking the left lane.’ Since there is no mention of the Verrazano Bridge, the driver can presume that traffic on this stretch moves at a normal tempo, which is a good thing if there is trouble on the Gowanus. ‘Bumper to bumper delays’ is an unhappy word for everybody who has not yet passed 38th Street. It means that the speed may have slowed down to 5 or 10 kilometres an hour, which will at worst double the drive time.

By 7.30 our driver has entered the Gowanus Expressway. Hundreds of cars are standing almost completely still because of that accident on 38th Street in Brooklyn. With a touch of compassion Tauriello informs listeners that traffic has completely stopped on the inbound Gowanus at 38th Street. As 1010 WINS moves towards 8 a.m., some listeners are so completely stuck in traffic that they do not really need more information. When traffic starts moving again they will notice without any help from the radio studio. And now a short message from our sponsors:

Track 14: Unknown artist: Get Mobilized, 2000 (1:00).

Get up, get out, it’s another day,
You’re on the road trying to make it pay,
Too many things to do and not enough time,
Don’t want to find yourself sitting in line.
Who wants to spend their life at the pump,
We’ve all got somewhere to go.
Fill her up fast and be on your way,
Get mobilized, get Mobil today.
Pick up the kids, pick up some bread,
Gotta keep moving to stay ahead,
Sometimes it feels like you live in your car,
Seems like you drive all day and never get too far.
Fill her up fast and be on your way,
Get mobilized, get Mobil today.


This little tune was put on air in the USA by Mobil Oil. The unknown band tells an encouraging story about everyday life on the road, and the melody is reminiscent of a thousand country-and-western songs. The purpose of this ad is to make listeners recognize the need to be on their way, and associate this important feature of their lives with Mobil Oil.

Advertisements such as this recommend people to act as navigators so that the oil company can earn money on petrol. The advertisement shows that the navigator is also a rhetorical phenomenon; it is a role that is staged in the public sphere in order to influence the citizen to behave according to the role, and it connects well with what James Carey calls the rhetoric of the technological sublime: ‘Everyman a prophet with his own machine to keep him in control’ (Carey and Quirk 1989: 117). People are led to believe that the greater control and power they have over their lives, the better their lives are.

In contrast to the ad, 1010 WINS’s editorial reports most definitely have use value. The traffic report may influence the pattern of traffic through the area and inspire reflections on strategic possibilities among listeners who are not in the middle of the tailback, but who know the routes. This is to say that the news station offers a pragmatic mapping of a given area. When 1010 WINS say they will give you the world, they mean the world within practical reach, which in this case is New York City. It would be absurd to syndicate this kind of programming on a national scale.

Breaking news
Suddenly the breaking news jingle disrupts the routine progression of the day. Our driver on the Gowanus Expressway instantly knows that something bad has happened. When a local station presents breaking news, the news may be directly relevant for the navigator’s next move, since it may imply that he or his family is in real danger.

Brooklyn, September 11, 2001. Illustration Atle Skorstad.

Track 15: 1010 WINS: Breaking News, 2001 (1:23).

[Synth-based heading] ‘Breaking news now on 1010 WINS.’
James Faraday: ‘This just came into our news room: A plane has crashed into the World Trade Center. Let’s get this live update from 1010 WINS correspondent Joan Fleischer. Joan, what do you see?’
Joan Fleischer: ‘Well, I’m standing on the top of my roof and I’m looking at the World Trade Center and there is a huge hole in it, and there is a fire in the building right now, huge smoke pouring out of it and things are falling from the building itself […]’
James Faraday: ‘All right, 1010 WINS correspondent Joan Fleischer on the scene in Lower Manhattan … er … any emergency personnel on the scene as of yet, do you see?’
Joan Fleischer: ‘I can’t see anybody but I hear the fire trucks … and … I heard the plane very close to the top of the buildings. I looked outside and I saw it hit and it exploded immediately.’
James Faraday: ‘Did you manage to see what kind of plane it was?’
Joan Fleischer: ‘I couldn’t tell, it looked like a smaller plane but I couldn’t tell, no I’m not really sure. I would say it wasn’t a huge jet, but it was a plane that sounded like it was a fighter jet overhead and then I saw it explode close to the building.’
James Faraday: ‘Are you able to see any wreckage on the ground from where you stand?’
Joan Fleischer: ‘No, I’m too high up.’

We hear the very first attempt to describe the fact that a passenger plane has crashed into the World Trade Center, before anybody had really understood what was happening. The anchor is located in Midtown Manhattan where the studios are, while the witness, Joan Fleischer, speaks on a landline telephone from her apartment in Lower Manhattan. Her apartment overlooks the Twin Towers, and she is in the perfect position to describe what happens. Our driver in Brooklyn also has a perfect view, but from a much greater distance.

I will analyse the technique of journalistic eyewitnessing at length. The radio woman and the car driver are both witnesses to the events, but in two different ways. There are two meanings to the notion of witness; seeing something in a passive way, and saying something about what you see (Peters 2001: 709). Joan Fleischer is thrown into the latter role, and is charged with the responsibility of finding the words to describe this unexpected and harrowing event.

As the scale of the disaster dawned on people, the entire city ground to a halt. The moving public of New York was immobilized by traffic jams and overload in all communication systems. Drivers would be even more completely stuck in their cars when the mobile system crashed, with nothing better to do than listen to the radio news and ask themselves what will happen next.

As we all know, the September 11 events became global news in an instant, and all news companies in the world reorganized their schedules to capture what was going on in New York. The same is true for 1010 WINS. Along with other local news stations they were in a good position to report from the disaster. They had several roaming reporters who had to record on tape and drive back to the studio to publish them on air, because the wireless networks for telephony were down. John Montone at one point reported through a public telephone booth, playing his cassette recording close to the mouthpiece.

In New York on this particular day there were thousands of witnesses, but since the mobile phone network was blown out very few could describe what they experienced there and then. If we compare their situation with that of Joan Fleischer at 1010 WINS, it is clear that most of the listeners in the greater metropolitan area were at a safer distance. Indeed, most New Yorkers would have no first-hand experience of the disaster, but only hear about it on radio and TV. John Ellis (2000: 11) points out two typical qualities of this experience: powerlessness and safety. The listeners on September 11 were powerless in the face of the attack, but could feel relatively safe (except that there was fear of more attacks in other locations).

But even at a distance they would be profoundly touched by what they saw. Paddy Scannell (1996: 101) distinguishes between witness and victim in a way that helps me to get the point across. The witness might have seen the whole thing first hand, but is nevertheless at a certain distance from the event. The victim, on the other hand, is at the centre of events in a far more volatile way. ‘It’s the difference between direct and indirect involvement, between something happening to oneself and seeing something happen to some others.’ The implications of being a victim fitted the urban navigators in New York City during September 11 quite well. The victim engages in a self-oriented behaviour where he feels sorrowful or angry in and for himself (ibid.). This is how the drivers on the Gowanus Expressway and other New York arteries must have felt after some hours of gridlock, with new and disturbing events constantly being reported on the radio.

Joan Fleischer is an administrator at the radio station, but as soon as the newsroom is made aware of her ideal location she steps in as an eyewitness. The authority of her report relies on the timbres of her voice. In her shock and dismay we have proof that this is actually happening, or at least the listener would have to be very suspicious if they were to claim that her eyewitness reports are staged.

In rhetorical terms, the eyewitness’s credibility relies on making the listener believe that the speaker has direct access to an aspect of the world as it is currently developing. The eyewitness makes the listener believe that what is said is actual and real. This classical notion of documentary realism comes across in Harold Mendelsohn’s 1962 claim that live news and information programmes ‘allow the listener to participate vicariously in the great events of the day’, and that the listener identifies more strongly with the speaker ‘merely by virtue of having been a witness to the same happenings’ (quoted in Fornatale and Mills 1980: xvii).

The next case study displays eyewitnessing at its most extreme. The listener is likely to be touched not only by the physical drama of the events that are described, but also by the psychological drama of the reporter’s effort to describe it.

Track 16: 1010 WINS: The South Tower Collapses, 2001 (2:16).

Lee Harris: ‘… called it quit while they were ahead there. And all of this, the World …’
Joan: ‘Oh wait, oh my God there is … [scream] … oh my God, the building fell!!! Are you there? The building just fell!’
Lee: ‘Which, which building?’
Joan: ‘Oh my God! The south building fell, the south building just crumbled from the top [shaky voice] ohh my God, the building just fell! The entire World Trade Center, the south building, just fell. I just saw the whole thing. Oh my God! oh my God! I can’t see anything, but the whole thing went down. Oh my God! Ohh, I saw the building crumble, it’s all the way down. I can’t see at what point it is still standing. Oh my God, ohhhhh [several seconds of gasps and sighs]’
Lee: [shaky voice] ‘If what Joan is saying is true, the tower has just collapsed …’ [Beeping sound: time signal at 10 o’clock] ‘This is 1010 WINS …’ [beeping sound]
Joan: ‘Ohh my God, the building just crumbled.’
Lee: ‘This is 1010 WINS, New York. One of the [Joan: ohh … completely down] two towers, the south tower of the World Trade Center, has just … crumbled, collapsed in a pile of dust [Joan: Ohh], this approximately one hour after it was hit by an aircraft.’
Joan: ‘… the second building … the second building completely down. [Joan speaks to a man] … you heard the … me too!’
Lee: ‘Uhh … a situation that started bad just gets worse and worse and worse. The World Trade Center, south tower, that was hit by a plane and wrecked by an explosion, approximately an hour ago, has totally collapsed. The north tower is still standing, but the World Trade Center south tower has collapsed. Let’s go live to CNN for a moment.’


Notice the effect of the outdoors acoustics that is picked up by Joan Fleischer’s mouthpiece. There is a frantic wailing of fire engines, and clearly something very dramatic must be happening for so many to be in the same spot with their sirens going at the same time. This is a good example of how environmental sounds lend documentary authority to a news report.

Joan Fleischer’s eyewitness account of the event is highly engaging. As suggested, the way she utters her words is in itself a symptom of something dramatic. The hesitation and stammering is completely different from the news reporter’s regular way of speaking on air. She is emotional, hurt and shocked: all that a news journalist is supposedly not. Goffman (1981: 223) says that the traditional news reader’s role ‘requires the performer to set aside all other claims upon himself except that of presenting the script smoothly’. But this type of neutral address is impossible for Joan Fleischer, and her emotional reaction is of course exactly what makes the eyewitness report so convincing.

Lee Harris is safe in 1010 WINS’s studios in Midtown Manhattan, but he nevertheless struggles to cope with the situation. He is almost unable to speak coherently, and he sounds quite distressed. When he says ‘a situation that started bad just gets worse and worse and worse’, we can easily hear how shaken he is. After summing up the disaster he says ‘Let’s go live to CNN for a moment’, and it sounds as if he is in desperate need of time out.

How does his behaviour compare to the journalistic procedures? It is a rule of radio journalism that the news reader should sound lively, confident and sensitive so that he becomes even more credible as a source of information. ‘The overall process should give the listener the impression that the broadcaster’s talking to him rather than reading at him. It’s prepared, of course, but it should sound spontaneous’ (McLeish 1999: 65). Lee Harris demonstrates that in breaking news situations the speech can actually be spontaneous all the way down, and this makes the journalism highly credible. Sometimes, as the journalistic procedures break down completely, the trustworthiness of the production skyrockets.

The radio medium, 2008
It may seem strange that the medium of radio is not discussed until chapter 4 in a book about sound media. However, the computer with its internet connection and enormous possibilities for digital manipulation had to be presented first, because computers also influence the production values of radio in our time as well as the listeners’ media behaviour. But radio is still a self-sufficient medium with the same strengths and weaknesses as it has always had.

There are four different platforms involved in the medium of 2008, all with their own gigantic distribution system. These are terrestrial transmission, satellite transmission, mobile networks and the internet. I will focus on the two platforms that are most directly important for traditional radio services, namely terrestrial and satellite transmission (the internet’s functionality was discussed in chapter 2, and the mobile phone will be discussed in chapter 5).

Figure 4.3: Model of the wireless sound medium.

Figure 4.3 shows the basic interfaces and platforms in modern radio. Programmes are created with microphones, mixing boards and computers. The signal goes to the transmission stations, where it is radiated out through satellite systems in the sky or radio towers on the ground. Imagine the great distance between the satellite in the sky and the radio receiver on the ground, and you see the scale of the operation. The signal is received on quite small and handy receivers, which are either portable (in the car, for example) or wearable on the listener’s body. One of the key features of both satellite and terrestrial transmission is therefore that the listeners can roam around freely in the environment without having any reception problems. It is the same thing with mobile telephony, since this also relies on signal transport through the air.

Interestingly, the atmosphere around the earth is a most important carrier of information. It is used for military and commercial communication, and people are in fact constantly exposed to ultra high-frequency signals that criss-cross our bodies at the speed of light.

Terrestrial radio could also be called ground broadcasting. Whether they carry digital signals such as DAB or analogue signals for FM and AM, the waves spread in a line-of-sight fashion into the natural environment. The signals go straight ahead, and do not curve around hills and mountains; nor do they conduct along the ground as lower frequency waves do. There is no way to pick up American broadcasting in Australia late at night (except by web radio through the internet). It should be noted that ground broadcasting is therefore very precise and localized, and if a station has only one transmitter in a hilly area many people will fall outside the zone of reception. If a station has national coverage it is only because dozens of transmitters and hundreds of signal repeaters have been built all around the country. AM and FM broadcasting is especially suitable for local and regional media outlets. Terrestrial radio will be analysed at length in later chapters.

Satellite radio could also be called sky broadcasting. When the signal leaves the radio station it goes to an up-link antenna which maintains a link with the satellite, and is fed with new information all the time. Geostationary satellites have a very high data capacity, and can offer hundreds of stations simultaneously to potentially millions of people. The satellite orbits in the direction of the Earth’s rotation, at an altitude of approximately 35,786 kilometres. For reception equipment on the ground a satellite with geostationary orbit always appears in the same location in the sky, and satellite is therefore a very practical platform for transmitting signals to the same geographical area over a long period of time.
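The altitude quoted above can be checked with a short calculation. The following is a minimal sketch, assuming a perfectly circular orbit whose period equals one rotation of the Earth; the constants are standard reference values rather than figures taken from this chapter, and the function name is chosen for illustration only.

```python
import math

# Standard reference constants (assumptions, not taken from the chapter).
GM_EARTH = 3.986004418e14   # Earth's gravitational parameter, m^3/s^2
SIDEREAL_DAY = 86164.0905   # one rotation of the Earth, in seconds
EARTH_RADIUS_KM = 6378.137  # equatorial radius, in kilometres


def geostationary_altitude_km(period_s: float = SIDEREAL_DAY) -> float:
    """Altitude above the equator of a circular orbit with the given period."""
    # Kepler's third law for a circular orbit: r^3 = GM * T^2 / (4 * pi^2)
    orbit_radius_m = (GM_EARTH * period_s ** 2 / (4 * math.pi ** 2)) ** (1 / 3)
    # Convert metres to kilometres and subtract the Earth's radius to get
    # the height above the ground rather than the distance from the centre.
    return orbit_radius_m / 1000 - EARTH_RADIUS_KM


if __name__ == "__main__":
    print(f"{geostationary_altitude_km():.0f} km")
```

Under these assumptions the result agrees with the figure of approximately 35,786 kilometres to within about a kilometre. Only at this one altitude does the satellite keep pace with the ground below it, which is why it appears fixed in the sky.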

Satellite reception allows a listener to roam across an entire continent, listening to the same audio programming anywhere they go. The antenna must have a clear view to the satellite, and in areas where tall buildings, bridges, or even parking garages obscure the signal, repeaters can be placed to improve it. The receiver interface resembles the TV receiver in many regards. There can be a recording functionality, programme pause and resume, and a graphical display of available services. An important difference is that there is no satellite dish with a fixed position, and the reception is therefore much more mobile.

The receiver for terrestrial radio looks much the same as that for satellite radio. This is especially true for DAB. Both satellite and DAB signals can carry radio text or dynamic label segments from the station, and can give journalistic information such as song titles, music type and news or traffic updates, and of course also commercial messages. Advance programme guides can also be transmitted. A similar feature also exists for FM in the form of the radio data system (RDS), which can transmit the station’s ID and also retune the car receiver to stations that have regular traffic bulletins. These features are not very spectacular compared to those of a mobile phone, and it seems clear that much more experimentation is needed to achieve true innovation in terrestrial broadcasting.

GPS journalism
I will end this chapter with a comment on GPS, which may be the thing of the future regarding news and information services for urban navigators. The Global Positioning System (GPS) is based on a gigantic infrastructure of satellites in outer space. There is a constellation of at least twenty-four medium earth orbit satellites that transmit precise microwave signals, and the system enables a GPS receiver to determine its local time, location, speed of movement and direction of movement. This information is produced by triangulating the position and time of several satellites.
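The principle of fixing a position from several known transmitters can be illustrated in miniature. The following is a simplified sketch, not how an actual receiver is implemented: it works on a flat two-dimensional plane with three hypothetical beacons, and it assumes the distances to them are already known. A real GPS receiver has to infer those distances from signal travel times, works in three dimensions, and must also solve for the error in its own clock, which is why it needs at least four satellites. Because the method rests on distances rather than angles, it is strictly speaking a form of trilateration. The name `locate_2d` is illustrative only.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def locate_2d(beacons: Sequence[Point], distances: Sequence[float]) -> Point:
    """Find the point whose distances to three known beacons are given."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = distances
    # Each beacon defines a circle around itself. Subtracting the first
    # circle equation from the other two removes the squared unknowns and
    # leaves two linear equations of the form a*x + b*y = c.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1 ** 2 - d2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1 ** 2 - d3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    # Solve the 2x2 linear system with Cramer's rule.
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("beacons must not lie on a single line")
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)
```

With beacons at (0, 0), (10, 0) and (0, 10), a listener standing at (3, 4) is 5 units from the first beacon, and feeding the three distances back into `locate_2d` recovers that position.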

The ability to determine the receiver’s absolute location allows GPS receivers to act as a surveying tool or as an aid to navigation for ships, cars, freight lorries, aeroplanes, and so on. Much of this communication is completely private, and there is little or no coordination of people via GPS. Museums and zoological gardens can use location data to present information about artworks or caged animals. The visitors carry a hand-held device and when they get near to the object in question a pre-recorded tape or video starts playing automatically; if they leave the zone the recording will stop. If the focus were changed to a more direct experimentation with social coordination, GPS journalism might be a rhetorical technique with great potential for the future.
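The museum scenario amounts to a simple zone check on each position update. The sketch below is one possible way to express it, assuming the device reports latitude and longitude in degrees and that each exhibit is given a circular zone with a radius in metres; the function names, the radius and the returned labels are all hypothetical, not taken from any actual guide system.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def playback_action(visitor, exhibit, radius_m, is_playing):
    """Return 'start', 'stop' or None for a single position update.

    visitor and exhibit are (latitude, longitude) pairs in degrees.
    """
    inside = distance_m(*visitor, *exhibit) <= radius_m
    if inside and not is_playing:
        return "start"   # visitor has just walked into the zone
    if not inside and is_playing:
        return "stop"    # visitor has just left the zone
    return None          # nothing changes
```

A station experimenting with location-based news could use the same check with much larger zones, for instance a neighbourhood or a stretch of expressway instead of a single exhibit.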

Technically speaking there could be a GPS chip in every new radio receiver. Notice the resources that would be put at the disposal of journalism: the time, location, speed of movement and direction of movement of every single listener, registered continually. The journalistic potential of such a databank is enormous. Of course such a resource could also be exploited cynically for purposes of surveillance or blackmail.

There are no widespread journalistic techniques in this field yet, and no established listener techniques either. But the technology harbours features that would greatly empower the urban navigator. Imagine that you could log into a station community where all the topics and all the information are organized to assist you at your present location, and new batches of information are presented when you move into a new location. The listeners are likely to feel an even stronger identification with the community of which they are part than they do with what stations such as 1010 WINS are able to do now. With GPS journalism the station can administer the busy life of urban navigators in a safe and controlled way, at least during the calm period until the next breaking news turns everything upside down.


5. Phone radio – Personality journalism in voice alone

Talk radio is a mixture of voices from the phone and voices from the radio studio. Radio has always had a bias towards personalities, and now it is stronger than ever. The radio industry thrives on the opinions, anecdotes, jokes and confessions of ordinary people.

This chapter is concerned with the centralized, editorial procedures of professional talk stations, and in particular focuses on what could be called personality journalism. I will analyse not just how journalists present themselves to the public, but also how lay participants do it. There are three case studies: an old lady in London calls a current affairs show on LBC Talk in 2004; two young girls from Crewe (UK) call to participate in a quiz on BBC Radio 1 in 2001; and a depressed husband calls the psychiatrist Dr Joy Browne on 710 WOR in New York. Their behaviour was diagnosed long ago: McLuhan ([1964] 1994: 299) wrote that ‘radio affects most people intimately, person-to-person, offering a world of unspoken communication between the writer-speaker and the listener’.

The talk media landscape
Talking is a fundamental means of communication between people. It is among the first communication skills we learn, and is already well developed at the age of five. My point is that talking is not an expert activity like computer programming, news journalism or record production, where ordinary people often cannot contribute in a valuable way.

On the contrary, speech is one of the forms of communication that is truly democratic and inclusive. But the media have historically cultivated professional, normalized speech of the type that can be called ‘recital’ and ‘role play’, and which is not very personal. However, with the emergence of the mobile phone and all kinds of reality shows on television since the 1990s, there has been a rhetorical turn towards ordinary talk and behaviour on radio also. This tolerance for ordinary communication is much stronger than, for example, in the 1960s, when the phone had no particular role at all in radio programmes, and everything had to be spoken according to received pronunciation.

There are reality shows and singing contests, and quiz shows where anybody can sign up. For at least ten years the general public has been trained in singing, dancing, eating, arguing and displaying all kinds of (traditionally) private behaviour in public. Radio has a modest role in this huge endeavour, but none the less a vital one (see Scannell 1991). Radio has mobilized ordinary people, predominantly on the mobile phone, which gives a particularly direct and intimate contact between the private and the public. The sheer volume of talk radio is amazing. There are hundreds of dedicated talk radio stations scattered locally around in the USA and Europe. Phone-ins are among the cheapest forms of programme production, since there is no costly reporting and editing and no expensive travelling for the crew.

Figure 5.1: Timeline of talk media.

Figure 5.1 shows how stable the media involved in talk radio have been during the last thirty to forty years. The programmes are distributed either on AM/FM, DAB or satellite radio (see chapter 4), which I conceive of in general terms in this chapter as ‘radio’. For the radio industry the only significant change in the context of talk radio is the introduction of the mobile phone among ordinary people.

Under the timeline I have listed three media set-ups that also convey lots of talk. Of course talk shows on television have existed so long that they influence conversations all across the nation, and the telephone medium is so obviously a part of our lives that we don’t even think about it as a medium. The final category is internet phone, which involves networks of voice communication provided by companies such as Skype and MSN. Callers can congregate in a number of ways; for example, they can make conference calls, send chat messages, or go to one-on-one mode and talk in private.

It is important to notice that the mobile phone is a personal medium in comparison with the house telephone, which is a shared domestic medium. The mobile phone has been studied extensively in recent years (see, for example, Katz and Aakhus 2002; Ling 2004; Mercer 2006; Katz 2006). There is much greater room for privacy and intimacy from a speaking device that people carry with them at all times and only hesitantly lend to anybody. In less than twenty years the mobile telephone has gone from being a rare and expensive piece of equipment used by the business elite to a pervasive, low-cost personal item. In many households the individual phone has completely replaced the traditional landline telephone.

There was little or no talk radio in the strict sense before the 1970s, and then mainly in the USA. But there was an upsurge in the USA after the repeal of the ‘fairness doctrine’ in 1987. This doctrine was enforced by the Federal Communications Commission and required that stations provide free air time for responses to any controversial opinions that were broadcast, and the repeal provided the opportunity for partisan programming that had not previously existed (Douglas 2002: 491). Talk radio in the USA gained ground after this time, and in many other countries there is also a strong tradition of tabloid as well as political talk radio (see, for example, Ross 2004; O’Sullivan 2005; Tolson 2006). If we look all the way back to the 1950s, radio was not telephone-driven at all. In the golden age of radio there were lots of radio personalities, but they were all professional and had rehearsed every second of their performance in advance. It was all more formal, more exclusive, and more marked by role play.

The lively sound of voices
The first case study is a morning talk show on LBC. Nick Ferrari’s morning show is a staple of the station’s weekday schedule. There is no music, only talk, jingles, commercials and news at the top of the hour. At the start of this particular programme Ferrari asked what the listeners thought about the erection of mobile phone masts in their neighbourhoods. First we hear a jingle in which a woman praises Nick Ferrari in no uncertain terms and leaves the impression that the host is something of a flirt. The voices of the two speakers are cheerful and energetic, and seem perfectly suited to the start of yet another busy day in London. Their everyday voices are full of pitch changes and lively intonations. This form of talk sounds nothing like radio did in the golden age.

Track 17: LBC London: Nick Ferrari with Rosemary, 2004 (3:05).

– Rosemary’s in Kew. Hello Rosemary.
– Hello.
– Morning.
– Morning.
– How are we today?
– I’ve got a crick in my neck today.
– We need you to jump up and down if England win.
– What?
– You weren’t listening, were you?
– No.
– You know, have you ever heard of John in Bellingham, one of the callers?
– Yes.
– He’s out of the prison now, which is good. And he has this rather unhealthy obsession that, if England win tonight, he wants to imagine you jumping up and down with glee.
– He needn’t bother, cause I don’t watch that bloody football!
– You didn’t let me finish. Jumping up and down with glee, naked as the day you were born!
– I should think so!
– Ha ha ha.
– I wouldn’t like the sight myself.
– Ha ha ha. Bless you Rose, ohh, you are lovely. Now, what’s wrong with you and phone masts?
– Well, we’ve had one put up the top of our road.
– Seven sevens, Rosemary.
– Pardon.
– Seven sevens. Come on.
– I don’t know.
– I’m trying to keep your brain sharp [snapping his fingers].
– Don’t do this to me in the morning!
– Ha. It’s forty-nine. You wrote them out in a little handbook a few months back.
– I know, but I haven’t got it here.
– All right. Keep going. I’ll just occasionally throw questions at you. Sorry, go on.
– We’ve had one put up the top our road, which faces the high road, and our road was …
– Capital of Spain?
– Madrid.
– Yea. That’s good. Just to keep your brain active. It’s like an NHS sponsored scheme I’m doing for you. Go on.
– Anyway this thing went up, they closed up the road and the blessed crane came. And I always take the local papers and read the notices, public notices, so I know …
– Three pounds seventy-five in old money?
– What?
– Three pounds seventy-five in old money?
– Oh do leave off!


And so it continues. Nick Ferrari is the master of ceremonies, and one of his specialities is to play tricks on his callers. He seems to be free of all expressive regimes, and just going wherever his next whim takes him. In contrast, Rosemary from Kew doesn’t know the tricks of Ferrari’s show very well. She is nonplussed by his playful, whimsical behaviour, and during the conversation we can hear how there is a tuning up of the relationship between the professional and amateur personalities, so that in the end they are on the same level.

Rosemary did not call in to Ferrari’s show to play games; she called following his invitation to talk about mobile masts. It turns out that she campaigned successfully against a mast near her local school, and the telecom company actually had to tear it down. But Ferrari is not really interested in phone masts; he is more interested in confusion among the callers. It is enjoyable for listeners to hear people who are nonplussed, and in this sense Rosemary is the perfect caller. But she has a quick mind, and as she warms up we can hear how she begins to handle the situation better. This is especially noticeable when Ferrari asks her to jump (naked) up and down with glee if England win, and she says ‘I wouldn’t like the sight myself’. If she calls the show again, she will be better prepared.

Phone radio was a new genre in the 1990s. There is still a learning process in the general public about how to act in the public sphere when you are in the safety of your own home. Really inexperienced speakers who are suddenly cast on air will have a challenging job aligning their manner of being to the manner of the programme. More skilled and streetwise speakers will project a personality that they have used successfully before and that they can also use in media settings. The cult of listeners who regularly participate in phone-in programmes will have a more coherent stock of behavioural characteristics in reserve. They are familiar with the host’s typical manner of being and can readily adopt his favoured caller attitude. They have learnt the right technique for talking on the radio in the capacity of just another ordinary person.

Rosemary is an inexperienced phone personality. It is tempting to argue that this makes her more natural-sounding than Ferrari; at least she displays fewer mannerisms of role play and more spontaneous reactions than he does. Along with other radio amateurs she might spontaneously laugh at something, regardless of whether it is socially acceptable to do so, and she may not even realize that this laughter has a public impact.

In everyday life the characteristic verbal behaviour is more clear-cut. Basically, verbal behaviour is not performed to a microphone or camera, although it is of course experienced by other humans, and can be full of tactics and hidden agendas. But still there is no intentional role play for the public sphere. There is something special about recorded and transmitted behaviour, since it reaches the ears of thousands, perhaps millions, and can be repeated and analysed as long as anybody cares to do so.

Professionals sound natural on radio because audiences are used to the way in which they project themselves; they are just as familiar with the presentation as they are with news and factual information. There has been a historical development here. Anders Johansen (1999: 167) points out that during the history of broadcasting there has been a blurring of distinctions between the speaker and the topic of the speech, between the personality and the social role. Radio communicates more by psychological symptoms than by clear messages (like news bulletins), and these psychological symptoms create new conditions for the communication of credibility. The subtly suggestive is favoured at the expense of what is demonstratively told. John Langer (1981: 361) points out another aspect of this: ‘Simply put, what is important is not so much what you actually say, but the fact that you can be seen saying for yourself. The very act of speaking for oneself is a type of disclosure. Talking for yourself and being “caught” and recorded doing so individuates you, makes you a personality whether you are the Prime Minister or last week’s national lottery winner.’ Erving Goffman’s (1981: 296) notion of self-reporting also touches on the psychological dimension of radio speakers. Many people make frequent reference to their own passing thoughts and feelings while talking about any given topic. The speaker is meant to be heard as an expressive field on which the symptoms of attitudes, emotions and fancies play out spontaneously.

Shouting and having fun
Personality journalism relies quite heavily on the sounds of raw human energy. The vocal cords of hosts and guests alike are often red-hot, and in addition there is a constant trickle of sound effects (ticking clocks, gongs, fanfares, etc.) and bits of pop and rock music. Personality journalism thrives on semi-chaotic social situations in the studio and between interlocutors on the phone. There are studio shows where the host and his guests quarrel, shout and scream, and regularly slam the studio door. If the host makes a technical mistake he is likely to talk about it in order to make it natural. What used to be offstage has now become a great attraction on radio.

The goal for the editors is to create a sense of noisy, youthful energy, and the host is the main vehicle for the noise. In the USA the term ‘shock jock’ is used for hosts that are particularly rowdy and crazy, and who often talk more with their co-hosts and studio guests than with the callers. Andrew Tolson (2006: 120) also refers to the ‘zoo aesthetic’ of much talk radio.

The breakfast show on BBC’s Radio 1 from six to nine in the morning has a long tradition of shock jocks and zoo aesthetics (Tolson 2006: 113-16). The next case study is from the Sara Cox Breakfast Show in 2001. In contrast to Nick Ferrari’s show, this programme plays pop music all the time, and the audio is heavily compressed to give it a rough sound. Quizzes are a standard element of phone radio, and Sara Cox’s show is no exception. There is a daily segment called ‘Mate or break’, where Cox speaks to two people on the phone. Notice that I have transcribed only parts of the dialogue.

Track 18: BBC Radio One: Sara Cox with Nicola and Rachel, 2001 (3:50).

– Nicola, have you any idea of the sort of forfeit that you might make Rachel do if she messes up this morning?
– Yes, I do. Quite a while ago Rachel went to a pyjama party at a local nightclub, and she got pulled over by the police on the way home, wearing her pyjamas, and they asked her to step out. So I think that I should make her wear her pyjamas all day, around the town and on a night out with me.
– Yeah, we like that one. Send us the pictures! I’ll try to get a dozen polaroids out of this. Right then, ladies, are you ready?
– Okey dokey. I’ll just remind you, Nicola, just to really make Rachel’s nerves worse. Basically Rachel has got to answer five questions correctly. She’s got forty seconds in which to do it. And if she gets one wrong then you don’t win the back catalogue, and you don’t win the gig tickets to the Charlatans.
– Right!
– You know what I mean. So it’s time to get pretty angry just in case Rachel messes up.
– In case, yes.
– Are you all right, Rachel?
– No, I’m nervous.
– No pressure at all, but your friendship is on the line here.
– Alright. OK.


And then the quiz begins: Name one of the three number one albums by the Charlatans. What was the result in last night’s game between Manchester United and Olympiakos? Whose album out this week is called Wheel of Life? What’s the name of the Charlatans’ guitarist? Rachel answers only the last question correctly, and she will have to wear her pyjamas on the forthcoming weekend.

How do the speakers come across on air? The two girls Nicola and Rachel giggle and chuckle like most teenage girls. They both make sounds typical of insecure speakers, such as hesitation, moaning and stuttering. Their behaviour on air resonates quite realistically with the everyday life of most listeners, who presumably would be just as nervous if they were to participate. Sara Cox is folksy and enthusiastic and tries not to be condescending towards Rachel. She talks in a carefree manner and has a hoarse, deliberately inarticulate style, sounding as if she was out partying hard the previous night. She is a deft master of ceremonies, and balances the role of confidante with that of the authoritative quiz host.

As suggested, personality journalism feeds on a rough and rowdy style of speaking. This also means that callers are sometimes put in a tough spot by the host. Ian Hutchby (1991: 74-5) argues that hosts can pursue controversy, find something to argue about in what the caller is saying, and adopt a stance of professional scepticism. Stations can attract big audiences based on confrontational talk, and many programme hosts quite consciously offend their listeners. But civil society sets limits to acceptable behaviour, and a radio programme can overstep them. In 2007 the American talk show personality Don Imus referred to a team of female basketball players as ‘nappy-headed ho’s’, which is a highly derogatory term for African-American women. His mother station CBS Radio sacked him within a week of this statement, but later the same year he was hired by WABC (Radio World 2007). This story goes to show that, at the same time as the zoo aesthetic and its breaching of social norms is sanctioned, it can also increase the market value to radio stations of the personality in question.

Confessing the blues
Confessional programmes can convey great intimacy because radio communicates in sound alone. There is often a fear of embarrassment in direct visual confrontation, for example if people are challenged to take part in a theatre play or invited to appear on television. Since radio is completely non-visual this embarrassment is avoided, and there is greater room for projecting your usual personality.

Phone radio has long invited the listener to ‘work through’ big and small private problems that they are facing (Ellis 1999). Their private lifeworld will in a sense be thematized in public. The process of working through is typically considered a positive activity, and stations can address almost any private psychological, sexual or political question without getting into trouble with their listeners.

The next case study is from the morning psychiatry show ‘Dr Joy’ at the AM station 710WOR, which is located in New York but syndicated to stations all over the USA. There is only talk and no music in the format, but of course there are jingles, commercials and top of the hour news. But first a welcoming promo: ‘Thinking about fooling around with that girl at the office? Your husband drives you nuts? Dr Joy Browne is here to help solve all your personal problems at 800-544-7070.’

Barney on Saturday night. Illustration: Atle Skorstad.

Track 19: WOR New York: Dr Joy Browne with Barney, 2002 (3:41).

– Barney, you’re on the air on Dr Joy Browne. Hi!
– Hi! Can you hear me OK? I’m on a cell phone.
– So far, so good.
– I’m sitting in my house, so. OK. I’ll just start with a question. I don’t know if I should leave my wife or not.
– Dear, how long have you guys been married?
– Almost ten years.
– What’s making you so unhappy?
– Well, um, a little over a month ago she started having this affair with this woman, who, which she’s worked with.
– When you say affair, what do you mean?
– Well, it’s kind of complicated, um … my wife has anxiety attacks.
– OK. That’s interesting but irrelevant at this moment. Can we go back to … I said what do you mean by affair?
– OK. She started going out with this woman friend…
– No, no. Just tell me. Do you think she is having sex with another person?
– Well she, what they told me when I confronted them was. I don’t know how to put it, but, my wife would, er, finger this other woman. OK? And they told me that’s all that happened, and it only happened a couple of times.


The conversation continues on the CD, but the transcription ends here. Barney uses the opportunity to link up to the wider world, and discloses a problematic feature of his life to Dr Joy and the listeners. It so happens that his wife is involved in a lesbian relationship, and he now wants a divorce. Barney does not present himself in the same way as Dr Joy. He is not an entertainer; he exposes a problem in his life as best he can.

Although Barney sounds sad and miserable, he does not sound ashamed about being so candid about his problems in public. He does not appear to be a cynic or a victim; rather, he seems to have a pragmatic attitude to his life crisis. Barney does not really talk to the public; he talks sincerely to Dr Joy in order to formulate and hopefully understand his own problem better. And Dr Joy is even more pragmatic than he is. She is problem-oriented, down to earth and businesslike, and tries to focus on the things that would make a difference in Barney’s life. Three minutes is all he gets. This is an efficient piece of emotional counseling that would have taken much longer at the doctor’s office.

As a personality journalist, Dr Joy Browne makes it her task to lure the caller to be open and confessional. John Langer describes the regime of acting naturally in the context of television shows. The talk show creates a carefully orchestrated illusion of casualness, and there is a leisurely pace about everything: ‘The host and guest engage in “chat”. During the course of this chat, with suitable questions and tactful encouragement from the host, the guest is predictably “drawn in” to making certain “personal” disclosures, revealing aspects of what may be generally regarded as the private self’ (Langer 1981: 360). This strategy is as old as television, if not older.

Margaret Bruzelius (2001: 192) calls the USA ‘a nation ready to talk’. Psychiatry shows attract great audiences with their unsentimental approach to personal relationships. In addition to Dr Joy there is, for example, ‘Dr Laura’ on the US satellite station XM Radio. Such shows make the most of listeners’ readiness to ventilate their personal relationships in public. Along with the increase in phone radio, reality shows and other user-generated content, it seems that people do not consider the public arena to be distant and inaccessible; on the contrary, it is almost too easy to hook up with it, especially since people bring their mobile phone with them on their bodies. Andrew Crisell (1994: 194) suggests that the therapeutic aspect is fundamental to the genre of phone-ins: ‘It is possible to regard not only confessional but all types of phone-in as therapeutic in their effects.’

There is potential for serious division in the psychiatry type of programme. The host is the expert and authority on interpersonal behaviour, regardless of whether she is a trained psychiatrist; and the caller is a subordinate, troubled person regardless of his education and wisdom. But if the listener suspects that the host is insincere — a cynical businesswoman making easy money off of people’s misery and lack of self-confidence — then the mood can go sour. To the unwilling listener Joy Browne sounds condescending, and it appears that she uses a routine voice for friendliness that she uses on the frustrated callers. It sounds like a tone of voice she would take on if she were to talk to a small child. In general, the personality journalist risks being heard as only apparently friendly, and this may become so sickeningly obvious to the unwilling listeners that they change station and never return.

The phone radio medium, 2008
Phone radio obviously relies on two separate public platforms, those of broadcasting and telephony. This mix of a mass medium and a personal medium makes programming more symmetrical in strictly technical terms. Here I will attend to the way that the mobile phone is technologically coupled with radio. Radio’s technological basis is also analysed in chapter 2 (web radio and podcasting) and chapter 4 (terrestrial and satellite transmission).

Figure 5.2: Model of the phone radio medium.

Figure 5.2 shows that radio production is completely computerized. The studio microphone and mixing boards are hooked up to a computer system with software for launching jingles and commercials, and for manipulating the live feed before it goes on the air. From the studio computer the signal is sent to the transmission tower (or streamed on web radio or sent to a satellite). Listeners receive the signals on small portable receivers as described before. In addition to this transmission platform the station can feed mobile or landline calls into the studio, and this feature in principle allows any listener to become a caller. The wireless phone network allows the listener to call the station from any location and any social situation.

As figure 5.2 suggests, now that the mobile network is hooked up to the radio medium, there is an enormous hinterland of potential speakers on radio. In practice all the world’s phones could be connected through the telecom companies’ infrastructures and subscription services. In principle any caller can reach any person in the world who has a phone number. The terminal will be connected to the central switchboard through a line (landline or wireless), where it will be hooked up automatically with the number requested. The phone company can monitor and record all conversations that take place in the network.

Like ground broadcasting, which I discussed in chapter 4, mobile telephony signals go through the air in a line-of-sight manner. Signaling takes place in a grid of transmission stations placed closely together, with reflectors that support the signal and make it reach nooks and crannies in the landscape. The stations are positioned so that they create an overlapping pattern of access zones, so that in practice the user can roam about freely and without caring about where they are (Ling 2004: 9).

In contrast to broadcasting, the telephone networks are common carriers for private messages; that is, they are constructed for one-to-one personal contact, and not for mass distribution of public messages (Mercer 2006). This implies that telecom networks do not necessarily have the editorial institutions and journalistic competence of radio and television stations. Although people can access broadcast signals on their phone, this is typically a completely separate functionality from that which the telecom operators cater to. There is no radio programming through the telephone networks, but there could have been.

The coupling of phone and radio seems to take place the other way around, by radio embedding phone conversations in its programming. This venture is based on commercial collaboration, where the phone networks supply the infrastructure to route messages to the broadcasting stations’ internal systems, and the station takes care of all editorial procedures involved in the communication with callers. Callers can contact the radio station in three ways: by calling on speakerphone, by sending SMS messages, or by sending pictures. Along with these limited opportunities, where the speakerphone is by far the most versatile, new journalistic genres have arisen.

This brings us into the studio environment. In most stations there are only two or three people working on a phone programme. The producer has overall administrative responsibility and must make sure both that all elements of the programme are ready in due time and that callers and studio guests are ready to speak at the right moment. Again, depending on the scale of the operation, there may be one or several moderators, who take calls from the listeners and screen them all to decide who is suitable to be put on the air. The moderator may take note of vital information about the participant, so that the host has something to start from (Rosemary lives in Kew, and Ferrari has learnt this before she comes on air).

These shows must by definition be live, since they rely on callers to respond to the invitations made at the beginning of the show. However, American stations in particular have emergency plans because of the social risk-taking involved in talk radio, especially regarding the uttering of swear words and sexual profanity. Most shows have a time delay which allows the producer to switch to a recorded element if the station’s moral regime is breached. Typically, a jingle or commercial element is launched, and during those ten to thirty seconds a new caller has been prepped and is ready to go on air (Boyd 1988: 240–1). If this social control by recording were impossible, American talk stations would probably allow far less participation from laypersons on the air.

The public switchboard
The purpose of phone-in programmes is to create a conversational flow, but not to create a dialogue. Editors and journalists are the bosses, and the callers are at the editors’ mercy. Talk shows rarely allow callers freedom to tell the whole story or to follow their argument to a natural conclusion. In the radio industry the caller is basically one item among all the others needed to create an attractive and enjoyable programme. The host relies on a producer to monitor and select callers for the show, but the host is nevertheless the public switchboard who maintains the connection between callers and listeners live on the air.

The host must speak on two communicative levels simultaneously, but must primarily address the public of listeners. For decades it has been standard journalistic behaviour to talk with and not to the listener. This presumably draws the listener into the mood of the programme in a better way than a more formal address would do (McLeish 1999: 65). Secondly, the host must talk with the caller about the topic at hand, and also guide them through the procedures of a quiz or calm them down during a heated debate. If there is a problem, the relationship with the listeners is always the number one priority, and the conversation with the caller will be unsentimentally terminated.

The journalist’s job is essentially to convince the callers and listeners that they are companions, and maintain this conviction indefinitely. A returning listener has discovered something attractive in the host’s personality that makes it worthwhile to keep on listening. For decades the talk station’s strategy has been to create trademark personalities for the different shows. The longer a certain personality has been kept in charge, the more credible it becomes for the regular listeners, at least as long as there are no scandals to corrupt the situation. The personality can be built over the whole lifetime of a radio performer, as in the case of Howard Stern and Don Imus, and there can be great bargaining value in having a strong public identity as a radio personality.

In regard to personality building, it should be noted that electronic media rely fundamentally on the repetition of the same type of performance. The repetition soon becomes a familiar, often comforting ritual (Carey 1989). Repetitive journalistic procedures have always been a prerequisite for loose and ‘natural’ talk on the part of radio personalities. The host will always perform in the same well-known acoustic space, in the same tone of voice, talking about more or less the same topics year after year, speaking the same kinds of formulaic sentences over and over again. Within the safety of a given set of procedures it is quite easy to create improvised and spontaneous behaviour.

Let me connect to history on this score. The rhetorical construction of companionship is as old as broadcasting itself. All radio and television programmes project some kind of social or personal mood, even news bulletins. This is because broadcasters have long since learnt that listeners easily develop a personal identification with the performers in radio and television. ‘A host of notions of “being genuine” and “being yourself” dominate the legitimating rhetoric of broadcasters’, Espen Ytreberg claims (2002: 492–3). Not surprisingly, textbooks can routinely point out that ‘radio is, for many of its audiences, a life-long friend and constant companion’ (Shingler and Wieringa 1998: 110). In the context of television Joshua Meyrowitz argues that personalities can become friends. ‘Viewers come to feel they “know” the people they “meet” on television in the same way that they know their friends and associates. In fact, many viewers begin to believe that they know and understand a performer better than all the other viewers do. Paradoxically, the para-social performer is able to establish “intimacy with millions”’ (1985: 119). Paddy Scannell (2000) argues that broadcasting addresses ‘anyone as someone’. His point is that, even though everybody in the audience receives the same message, it is formulated in quite intimate terms, so that each listener will feel that it is addressed to them in particular.

Caller initiative
‘Do you know the answer? Dial this number’, the host pleads. ‘Do you have a problem you want to discuss on air? Call us right now.’ This insistent appeal suggests that in the 2000s there is a big demand for callers. Any listener may become a caller; it depends only on his initiative. Since transistor radios can be taken anywhere the opportunity to listen is ever-present, and since the mobile phone is of course also a constant companion, listeners can turn into callers at any time.

The caller has shown initiative in calling the station, and this is important. There is a big difference between making and receiving a call (Hutchby 2001: 111; Ling 2004: 132). If you make a call you know who you are calling, you know what you want to talk about, and you have chosen a time that suits you. If you receive a call you may not know who the caller is (although mobile phones can display both name and number), you may not like the reason for the call, and it may not be at a good time for you to talk.

Despite showing strong initiative by calling in, the caller has to go along with the programme’s procedures, such as the rules of ‘Mate or Break’ on BBC Radio 1. As long as the caller follows the rules he is in control of his appearance in relation to the host and other participants. This allows the caller not to get too hung-up about his public performance; and he can focus better on the task at hand. My point is that procedures increase the likelihood that the caller will come across as a relaxed and credible person.

Radio is more welcoming than television when it comes to letting one’s guard down and acting without too much caution. As suggested, the absence of visual mediation is important for relaxed behaviour, but another important reason is that the phone mouthpiece is so familiar to the caller that it further lowers the threshold for participation. The caller’s contribution is to behave exactly how they always do. Buck Owens comments ironically on the communication skills required by mass media in the song ‘Act Naturally’ (1963), which was also covered by the Beatles. ‘We’ll make a film about a man that’s sad and lonely, and all I have to do is act naturally’, he sings. This rule of behaviour works well not just in the medium of film, but also in radio and television. The best way to take part in the media is to seek out the situations in which your personality is suitable, and proceed to act naturally. If you are sad, make a call to a sad radio programme. Clearly the threshold for being accepted in phone radio is not very high, as Owens with feigned self-irony points out: ‘I hope you come and see me in the movie. Then I know that you will plainly see the biggest fool that ever hit the big time. I’ll play the part but I won’t need rehearsing, since all I have to do is act naturally.’ However, the rule of ‘act naturally’ is symptomatic of very intelligent public communication which should not be underestimated.

The listening zone
Finally I will turn to the listener’s perspective. Compared to a face-to-face relationship, that between the listener and the radio speaker is rather superficial. ‘The person on the radio is a person with whom one can be close without having to tolerate all of the disadvantages of closeness’ (Fornatale and Mills 1980: xix). The social relationship is amputated, and this is often the main attraction. If the radio is on for many hours, the voices may provide more of a humanoid mood than a direct encounter with other people.

To the extent that the listeners are attentive, they will relate differently to the two different types of speakers – the host and the callers. The host is the centrepiece of the programme, and in most cases listeners already know the personality well. The callers come and go through the programme, and since they are a dime a dozen they can be treated with less patience by the listener. However, listeners may still identify more passionately with callers than with the radio personality, since callers are ordinary people just like the listeners, and may talk about life experiences that are quite recognizable.

There is another reason why listeners may bond quite strongly with callers, which I have already touched upon from the caller’s perspective. The mobile phone has a characteristic sound, with a narrow bandwidth, and it gives off bursts of noise when somebody speaks too loudly or the connection breaks up. The bad sound of mobile phones on air will be recognized by listeners from their own mobile phone conversations, and it invites them to recognize their own communication environments and their everyday phone conversations. Because of this, the listeners can feel closer to the callers on radio than they might otherwise do.

What about the listening zone? People often listen to talk radio in the deepest recesses of the home, for example in the bedroom late at night. This is a zone completely outside the public realm, and it is the scene of uncensored emotions, including loneliness and desperation. The listener knows that the programme is for ‘anyone as someone’ (Scannell 2000), and that it is only a para-social relationship (Horton and Wohl [1956] 1979), but he listens all the same. My point here is that the listeners need not take up a citizen’s attitude. There is no need to relate the programme’s output to the larger public sphere in order to have a valuable sociable experience. On the contrary, the listener is likely to become more entangled in his own life involvements and his own current mood.

Here is the core of personality journalism on radio. The purpose is to accompany the listener’s life with the right moods at the right time, and in this way always to charge or energize listeners and help them along in their lives. In this sense radio has a positive influence on its listeners. Andrew Crisell (1994: 212) points out that, ‘because we continue with our lives while we are listening, its content is, as it were, transplanted into our own existence and adapted to our own purposes.’

But the listeners do not only bond with the programme in a positive way through enthusiasm and endearment with the host. Identification feeds just as well on antagonistic emotions. Erving Goffman (1981: 247) points out that there are always angry ears out there. He suggests that audiences are not only easily offended by faults or remarks, but that they actively seek out faults that might be offensive. People’s annoyance may blind them to the fact that the host plays out a contrived personality, and what annoys them may be exactly those personality traits that are put on as role play. The hostile engagement may go on for years without the listener switching to another station. On the contrary, the unwilling listener may build up a sophisticated private dislike of the personality. This is the freedom of the listener in a nutshell; you don’t have to change station, you can just dislike radio personalities over time. For radio stations it is a win-win situation: if people really like the show they will keep listening, and if they really dislike the show they will keep listening.


6. Loudspeaker living – Pop music is everywhere

People enliven their day with loudspeaker music, and they share it enthusiastically with friends, colleagues and family – inside their homes, in the car and at all kinds of public events from discos to sports venues. And there is a whole industry that caters to this penchant for musical timbres.

This chapter starts from the presumption that, whether people like it or not, Western media cultures are now quite saturated with recorded pop music. Music is disseminated in social situations in an industrial way by commercial companies and individuals alike. I have chosen four media configurations for closer scrutiny: 1) muzak in commercial locations such as shopping malls; 2) music radio in semi-public and private locations, such as hairdressers, taxis and commercial vehicles; 3) internet jukeboxes involving a high level of personal choice in music, for instance Pandora and Last FM; 4) disc jockeying in clubs, pubs, cafes, etc., with limited attendance. This chapter has no sound examples. If you like, it is an oasis of silence in an otherwise noisy book.

The music media landscape
I am tapping into a rich cultural field when focusing on music in public settings. It has always been important to socialize and have fun with music, for example at the fairground, where the lively atmospherics of violins, accordions, singers and music boxes was integral to the experience (Lanza 2004). When it comes to music, relaxing moods have always been sought by ordinary people and serviced by professionals. Music is often peaceful, and music often allows you to take a break from the real world. Another way of saying this is that music can be pacifying in both good and bad senses of the word; it can be used for voluntary relaxation and for psychological manipulation (Riesman [1950] 1990; Mathieu 1991).

The soundscape of the twenty-first century is different from that of the nineteenth century. Now there are aeroplanes thundering overhead and the constant drone of traffic, which doesn’t even cease at night. In the nineteenth century there was the clattering of horse-drawn wagons in the cities and shouting and screaming in the streets. Both societies are what Schafer calls low-fi soundscapes; there is a constant hum of different noises that drown out the characteristics of individual events. If somebody cries for help they may not be heard. In a hi-fi soundscape, on the other hand, every single event comes across more clearly. Schafer argues that ‘the quiet ambience of the hi-fi soundscape allows the listener to hear farther into the distance’ ([1977] 1994: 43). There are still hi-fi soundscapes, for example natural surroundings such as high up in the mountains or out at sea in a sailing boat. Everything stands out, and the cry of a pelican a mile away is heard clearly against the background of relative silence.

Figure 6.1: Timeline of music media.

Figure 6.1 shows the most common platforms for the dissemination of recorded music in the social and semi-public sphere in Western countries. Internet jukeboxes are a relatively new phenomenon, and were not widespread until streaming audio had been perfected in the late 1990s. While they appeared much later in many European countries, 24-hour radio services started to penetrate everyday life in the early 1960s in the USA. Notice that, in contrast to the previous timelines, figure 6.1 covers not just the contemporary period from 2010 back to 1975 but the entire historical span back to 1870. Recorded music played in clubs, bars and restaurants has been a normal thing since the 1930s, when sound could first be amplified in a commercial way. Analogous to these systems for public address are the speakers relaying propaganda in a country such as North Korea, which have the effect of a kind of acoustic brainwashing.

Below the arrow I have listed stand-alone jukeboxes, MTV and karaoke because they are related phenomena. Jukeboxes in cafes have spun hit records for customers for seventy years. MTV and other music video television stations are interesting because they are used socially by teenagers who demarcate their taste in artists and music. Since the rise of music video television in the early 1980s the concept of the video jockey has been popularized, and these figures have great influence on musical trends and charts. The idea of a dedicated video-based outlet for music meant that both artists and fans found a central location for music events, news and promotion (Millard 2005: 339).

Karaoke is interesting because of its intensely social character, where colleagues and friends drink and sing along at their favourite bar. To be more specific, karaoke, which spread from Japan to the United States and Europe in the 1990s, is a form of entertainment where amateur singers sing along to a well-known pop song using a microphone and a PA system. The voice of the original singer is removed or reduced in volume, and lyrics are usually displayed on a video screen along with a moving symbol that guides the singer (Wikipedia 2007, ‘Karaoke’).

Although MTV and other providers of music videos tend to dominate in the public eye, the sound media have an important independent function that goes further than that of audiovisual media. It is because they have no visual imagery, and no need for screens and monitors to be set up, that they can be even more deeply embedded in daily life, and supply a pure sensory experience of which listeners are often not consciously aware. Because of their versatility they contribute greatly to the creation of low-fi soundscapes in modern societies. And the tendency is clear: the Western media compete to saturate ever more aspects of people’s social environments with music (and images). The forms of distribution and their musical timbres are so commonplace that the entire medium is often perceived more as a piece of furniture than real music (Lanza 2004; Barnett 2006).

Traditionally, listeners have been projected as active consumers, with a distinct taste in music that they enthusiastically pursue. An example of such a resourceful cultural figure would be Nick Hornby and the culture of record listening he describes in the novel High Fidelity (1995). His characters play records on the stereo to their liking, and project the moods and timbres that feel right for them. They are in command of their media environment, and proud of it too. I suspect that this attitude is not necessarily so widespread. On the contrary, I think that the industry (consisting of record companies, radio editors, club owners) projects this type of listener in a rhetorical way, because it is a flattering position for the consumer to be in. If you don’t like the music you can always turn it off, the industry rhetoric implies. In this chapter I intend to show that things are not quite that simple.

Muzak at the mall
The word ‘muzak’ has broken free from its corporate parent and become a term for easy listening, middle of the road, or elevator music. Any type of repetitive music can be called muzak.

The Mall of America in Minneapolis/St Paul is the second largest shopping centre in the world and attracts millions of visitors every year. It has a highly sophisticated loudspeaker system, with closed-circuit systems for each individual store and a public address system for the entire mall (Sterne 1997; Connell and Gibson 2003). This is exactly the kind of public space to which companies such as Muzak and the 3M corporation sell their pre-recorded selections.

There can be many intentions behind music formatting, but it will always be the case that a company tries to inspire a certain shared mood among the shoppers. This mood can be organized so that it has a rising intensity of rhythm (and volume) over a certain period of time, followed by softer music and then a new rise of a slightly different intensity and duration (see Lanza 2004: 48ff). Music can serve both as a welcoming mat and as a keep-out notice, Connell and Gibson (2003: 197) argue: ‘In 1999 a suburban shopping centre in the Australian city of Wollongong began playing Bing Crosby music to stop teenagers hanging out there.’ At the same time as Bing Crosby keeps teenagers out, he attracts the right kind of customers to the store, for example grandmothers buying Christmas gifts. Muzak corporations can play quite cleverly on various such types of simultaneous inclusion and exclusion based on recorded pop music.

Christmas at the shopping mall. Illustration: Atle Skorstad.

Paddy Scannell has coined the term ‘dailiness’ to describe radio’s formatting scheme, but it also fits muzak quite well. He refers to a service ‘that fills each day, that runs right through the day, that appears as a continuous, uninterrupted, never-ending flow – through all the hours of the day, today, tomorrow and tomorrow and tomorrow’ (1996: 149). Producers conform to the real-time progression of the larger society in which they operate, and exploit it to put the right kind of music into people’s everyday settings at the right time. The public moods of Western societies vary with summer and winter; with workday and weekend; with morning, afternoon and evening. John Ellis argues that it is strategically important for media companies to understand the rhythms of the private sphere. Early on it ‘became important to know when the various sections of the population awoke in the morning, took their meals, returned from work, went to bed’ (Ellis 2000: 43).

I will go into the seasonal signature of muzak in some detail. The Muzak corporation has a service called Holiday, and customers can select from a long list of different religious and carnivalesque holidays – basically Western holidays respected in Europe, the USA and South America. The list goes like this: Christmas, Cinco de Mayo, Halloween, Independence Day, Mardi Gras, Oktoberfest, St Patrick’s Day, Summer Fun and Valentine’s Day (Wikipedia 2007, ‘Muzak’).

Let us select Christmas. At Christmas in the Mall of America, ‘White Christmas’ is a much played song, and it flows through the shopping mall in Bing Crosby’s smooth crooning tones (listen to track 25 to hear his voice). The music has a set volume. It is part of the interior of the store, and there is no way that the listener can turn up the volume if he likes a particular song, or turn it down for that matter. The customer leaves the parking lot and goes into the Waldbaum supermarket. The management has chosen a solemn style of Christmas song: ‘Silent Night’, ‘The Little Drummer Boy’, ‘Mary’s Boy Child’, ‘Hark! The Herald Angels Sing’, ‘The Twelve Days of Christmas’. For a while the songs are quite comfortable, but if you start to get annoyed your only option is to leave. The Gap store next door plays a selection of funny Christmas songs: ‘Rudolph the Red-Nosed Reindeer’, ‘Jingle Bells’, ‘I Saw Mommy Kissing Santa Claus’, ‘All I Want for Christmas is my Two Front Teeth’. In the sports store they want you to buy skis, and you hear ‘Frosty the Snowman’, ‘Winter Wonderland’ and ‘Let it Snow’. Christmas songs are now part of the West’s low-fi background for two months a year, every year throughout our lives.

In the shopping centre listening is very often a secondary activity: the presentation is a background to other projects that receive the listener’s full attention. It seems that the presentation is ‘heard’ rather than listened to. The sounds become part of the surroundings, like a piece of furniture or the wallpaper. According to Carin Åberg, the secondary listening process is mainly aesthetic and emotional. There is little knowledge gain and little interpersonal identification (Åberg 1999: 77-8). Many customers in the Mall of America walk around looking for bargains, and some are chatting with a friend on their cell phone.

But notice that this is by no means an unsophisticated or foolish way of attending to music. On the contrary, secondary listening in public and semi-public settings is actually a highly responsive way of relating to sound. In this type of listening the perceptual surface is more interesting than the symbolic or linguistic depths, and this surface can be called timbre. Timbre is not measurable by acoustic instruments in the way that pitch and volume are; it is an expressive quality that is experienced subjectively and according to the individual’s skills of cultural perception. The repetitious relationship that people typically have with recorded music appeals more to a bodily engagement with rhythms and timbres than to any form of intellectual search for information or knowledge. You are more likely to hum along than reflect intellectually on the deeper meaning of the lyrics.

Music radio
In this section I will analyse the traditional FM music station and its way of saturating our lives with recorded music. Top 40 stations are both a barometer and an arbiter of musical taste, and radio airplay is one of the defining measures of success in the mainstream musical world (Hendy 2000: 141). First try to imagine the scale of these operations. A hit station that started a 24-hour service in 1975 will by 2010 have produced more than 306,000 hours of music, and in New York City alone there are a dozen stations that have aired since 1975. The maths soon becomes overwhelming.
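The scale of a single station's output can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a station on air around the clock from 1975 to 2010 and ignores leap days, and the function name is hypothetical.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: leap days are ignored for simplicity


def broadcast_hours(start_year: int, end_year: int) -> int:
    """Hours of output for a station that is on air 24 hours a day."""
    years = end_year - start_year
    return years * DAYS_PER_YEAR * HOURS_PER_DAY


one_station = broadcast_hours(1975, 2010)
print(one_station)       # 306600
print(one_station * 12)  # 3679200, for a dozen such stations
```

Under these assumptions a dozen stations would together account for well over three million hours of programming.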

For decades music radio stations have branded themselves differently from each other to attract niche audiences. The main strategy is to produce a signature sound, a kind of aural trademark that will be easy for the listeners to recognize. Furthermore, stations promote themselves through a constant trickle of jingles and promos that serve to establish their trademark. The running order of songs through the day is organized in playlists, and voices are compressed and filtered to fit into the timbres represented on the playlist. Everything is produced to fit into the overall sound signature of the station. Because of this very rigid structure, DJ’s programmes have been called ‘automated reality’ (Hendy 2000: 112).

In the radio industry there is a large menu of formats to be taken up, and a company will analyse the market potential of the available niches at length before a station format is launched. The main formats are pop, classical, middle of the road, country, oldies, classic rock, easy listening, dance, urban (hip-hop and rhythm and blues), jazz and progressive rock (for a more detailed list, see Hendy 2000: 100).

Andrew Crisell (1994: 72) stresses that, although a radio format is full of bridges, changes, and indeed even pauses, there is no narrative development. The format logic conforms largely to the notion of flow, which Raymond Williams ([1975] 1990: 90) pointed out was about to overtake the notion of programming. Discrete programme units are no longer the main product of radio; rather, a complex web of sequences comes about. First, there is the flow or sequence of listed programmes within a particular day. Second, there is the flow or succession of items within and between the listed programmes. Third, there is the really detailed flow within this general movement, the succession and/or overdubbing of sound arrays in the individual jingle, reportage or commercial (ibid.: 96; Jensen 1995: 108ff.). David Hendy quotes a station manager who says: ‘our programming is clinical and disciplined, and the way you do things in radio is actually more important than what you do.’ Radio is a ‘how medium’ where ‘style must come before content’ (Hendy 2000: 98).

Music radio is more accessible and controllable for the listener than muzak. First, you can have a transistor radio in any imaginable private situation; second, there is a wide selection of music stations on offer in most cities in the West, so that there is a good chance you can listen to your preferred style of music too. Many people have the habit of listening to radio on their Walkman, and music has permeated the lives of such people to a far greater extent (Bull 2000). Sarah Vowell, in an experiment of listening to the radio all the time for one year, writes:

The radio has become such a compulsive, constant companion that I’ve been home for around two hours, made three phone calls, cooked dinner, read a magazine cover to cover, and now, there’s a grating guitar instrumental coming out of the speaker and I don’t even remember turning the thing on.

(Vowell 1997: 199)

This type of background listening is comfortable for everybody involved. Music radio is played in semi-public venues such as the hair salon, the taxi or bus, and the dentist’s office. It masks less comfortable sounds, such as the noises of electrical appliances, street traffic and the chatter of people in the room. In many semi-public places the radio is always on, and the station has not been changed since 1985.

In contrast to muzak, radio (and internet) music allows the listener to adjust the volume, and there are many variations on who is in charge of the knobs. If you are wearing your iPod, you decide, whereas at home there can be disputes about both station choice and volume, and at the hairdresser’s you can only ask politely. Notice also that music radio is used in many outdoor settings that fall between the various categories. For example, there can be teenagers playing basketball in a back alley with a boom box blasting out hip-hop music.

Radio listening is almost always a parallel activity, and I will explain this important concept in detail. When listening is a parallel activity the listener’s attention shifts fluidly between the music presentation and some other ongoing project. Carin Åberg (1999: 77-8) says that parallel attention is likely to be both emotional and pragmatic. It is marked by short periods of sustained listening, sudden regaining of concentration after having been immersed in the project at hand, and a less premeditated situation of listening which could be considered more emotional, or mood-oriented as I would call it. Parallel listening is also widespread in the context of news and talk radio. Typically, you might arrange to do the dishes when a favourite programme is aired, but you may not be particularly attentive to the subjects that are discussed. Rather, you feel a social bond to the host and his voice, or the style of music that the station plays. You are taking part in a public mood that includes many more people than yourself, and this is a large part of the attraction.

I am stressing the function of music radio in social settings. One typical activity is to discuss the music, and argue about what is good and what is bad. At the hairdresser’s, if the station is changed, a discussion may erupt because someone gets annoyed: ‘Hey, that’s my favourite song you just interrupted!’ Listeners discuss the difference between good and bad cover versions of a familiar song. They discuss and recommend grooves, beats, melodies, harmonies and, in a genre context, psychedelic sound, funk sound, bossa-nova sound. They are familiar with the West Coast sound, the Philadelphia sound, the Liverpool sound, and with record label references such as the Motown sound, the Atlantic sound and the Phil Spector sound, with Bob Dylan’s sound, Stevie Wonder’s sound and Joni Mitchell’s sound. The list could go on forever. The main point is that the concept of ‘sound’ is a vague and inclusive way of referring to music which modern music lovers know intimately.

The sentiments of parallel and secondary listening mean that artists cannot release their music without having reflected carefully on what kind of mood they will provide and what kinds of energy and attachments they want the listeners to feel. Although artists and companies cannot control the listeners’ sentiments, they can safely assume that the ‘sound’ of their music will be influential in its own right. Paul Théberge says that individual ‘sounds’ have come to carry the same commercial and aesthetic weight as the melody, and argues that ‘the “sound hook” begins to exert a force of its own, virtually demanding that any “authentic” rendition of the song be performed with the same or an equivalent sound’ (Théberge 1997: 195).

Internet jukeboxes
The newest form of loudspeaker living takes place at the computer, and relates to music websites on the internet such as Pandora and Last FM. They rely on streaming audio, and really require a broadband connection for the listener to enjoy them fully. The use of internet jukeboxes is still limited mostly to the stationary computer in the office or the bedroom, and the laptop if there is a wireless connection. Very often this kind of music is heard via headphones, and it therefore has a more private feel to it than muzak in shopping centres or music radio at the hairdresser’s.

There has been a rapid development in the internet jukebox industry. In the early 2000s companies such as Sonicbox and Spinner presented services that resembled music radio more than anything. In the graphical interface the listener could choose between roughly the same music formats as on radio stations (rock, country, rhythm and blues, etc.). The music is played back automatically from a central database, and there are randomizing algorithms of various kinds to reduce the risk of unwanted repetition of songs. Two things in particular make these services different from radio. When using internet jukeboxes the listeners can switch between formats without leaving the site, whereas on the radio this would imply that the listener was changing station. Furthermore, advanced jukeboxes such as Sonicbox have a huge database of music and provide a nuanced system of sub-genres within each format, so that one can choose between, for example, twelve types of rhythm and blues. Radio stations on offer in a given location, on the other hand, might not play R & B at all. Websites can exploit central databases to create highly interactive choices for the consumer, and traditional FM stations cannot do this.
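One simple way in which a randomizing algorithm can reduce unwanted repetition is to pick songs at random while excluding the last few that were played. The sketch below illustrates that general idea only; it is not taken from Sonicbox, Spinner or any other service, and the function name, the window size and the track labels are all hypothetical.

```python
import random
from collections import deque


def play_sequence(tracks, count, no_repeat_window=3, seed=None):
    """Pick `count` tracks at random, never repeating one of the
    most recent `no_repeat_window` selections."""
    if not tracks:
        raise ValueError("the track list must not be empty")
    rng = random.Random(seed)
    # The window cannot cover the whole list, or nothing would be playable.
    window = max(0, min(no_repeat_window, len(tracks) - 1))
    recent = deque(maxlen=window)  # oldest entries drop out automatically
    played = []
    for _ in range(count):
        candidates = [t for t in tracks if t not in recent]
        choice = rng.choice(candidates)
        played.append(choice)
        recent.append(choice)
    return played


# Hypothetical five-song format; the labels stand in for real titles.
sequence = play_sequence(["a", "b", "c", "d", "e"], count=50, seed=1)
print(sequence[:10])
```

A larger window gives a more varied stream, at the cost of making the order more predictable when the database is small.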

People are using the internet mostly in private settings, and this type of listening is quite different from that of public venues such as the hairdresser’s. Private forms of listening can be more surface oriented than public music, since the private setting allows the listener to indulge himself more freely in his favourite timbres. Notice also that the listener here chooses the setting and the musical content entirely of his own accord, and also controls the volume. Compared to muzak at the mall, this is a good starting point for indulging in your favourite timbres.

Music jukeboxes such as Pandora have introduced a remarkable interface for catering to the individual’s tastes in music. Basically, this site allows music lovers to create many different stations where they can select their favourite musical genres. Again, this software function is based on access to a tremendous backlog of popular music contained in centralized databases. Interestingly, in 2007 Pandora had to restrict its service to US audiences because of unresolved copyright issues.

Pandora is a kind of musical timbre analyser, and technically it is based on a categorization scheme developed by a group of music lovers in the USA. The company quite ambitiously calls the categorization work the ‘Music Genome Project’, playing on the Human Genome Project which analyses the entire human DNA structure. The implication is that Pandora attempts to capture all the different essential sounds of popular music so as to be able always to call up a song that suits the stated criteria. There are variables in the software that allow it to distinguish between different types of electric guitar in Frank Zappa’s catalogue or the different ways that Frank Sinatra, Dean Martin and Sammy Davis Jr sing standard love songs.

We ended up assembling literally hundreds of musical attributes or ‘genes’ into a very large Music Genome. Taken together these genes capture the unique and magical musical identity of a song — everything from melody, harmony and rhythm, to instrumentation, orchestration, arrangement, lyrics, and of course the rich world of singing and vocal harmony. It’s not about what a band looks like, or what genre they supposedly belong to, or about who buys their records — it’s about what each individual song sounds like. This work continues each and every day as we endeavor to include all the great new stuff coming out of studios, clubs and garages around the world.

(Westergren 2007)

Thanks to innovations such as Pandora’s music genome software, the listeners have the world of music at their fingertips. They can create a station for disco, one for rock and another for hip-hop. The novelty is that the station presents songs according to the preferences that are continually put in, shown by accepting or rejecting the songs presented by the software. Let us say that you create a station called ‘soul singing’, and you start with Ray Charles’s ‘Hit the Road, Jack’ to give the machine an idea of what you mean by soul singing. Every time Pandora puts on a song that you feel is outside the definition of ‘soul singing’ you reject it by clicking a ‘thumbs down’ button, and the software learns to reject similar-sounding songs in the future. You may not approve of contemporary singing like that of Beyoncé, and consequently you reject all songs with a contemporary, synthetic sound. According to the same logic, when Pandora puts on a song that you feel is right you give it the ‘thumbs up’. I would, for example, have scored most of Otis Redding’s songs very high on my soul singing station. This type of scoring can go on for as long as you like. If you get bored, and let the station continue to play songs without adjusting your preferences, it will go off on its own randomized trek through music history, and the soul may end up becoming folk.
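The logic of thumbs up and thumbs down can be made concrete with a toy model. The sketch below is not Pandora's software, whose workings are not described here; it merely assumes that each song is described by a few numerical attributes and that a station keeps a running weight for each attribute. The class, the attribute names and the song labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Station:
    # Learned preference per attribute; starts empty (assumed model).
    weights: dict = field(default_factory=dict)

    def rate(self, attributes, thumbs_up):
        """Nudge the weights towards or away from a song's attributes."""
        step = 1.0 if thumbs_up else -1.0
        for name, value in attributes.items():
            self.weights[name] = self.weights.get(name, 0.0) + step * value

    def score(self, attributes):
        """Higher scores mean a closer match to the listener's ratings."""
        return sum(self.weights.get(n, 0.0) * v for n, v in attributes.items())

    def pick(self, catalogue):
        """Return the title in the catalogue with the highest score."""
        return max(catalogue, key=lambda title: self.score(catalogue[title]))


# Invented catalogue with invented attribute values between 0 and 1.
catalogue = {
    "gritty soul ballad": {"raw_vocal": 0.9, "synthetic_sound": 0.1},
    "polished dance track": {"raw_vocal": 0.2, "synthetic_sound": 0.9},
}

station = Station()
station.rate({"raw_vocal": 0.8, "synthetic_sound": 0.0}, thumbs_up=True)
station.rate({"raw_vocal": 0.3, "synthetic_sound": 0.8}, thumbs_up=False)
print(station.pick(catalogue))  # gritty soul ballad
```

In this toy model one approval and one rejection are already enough to steer the station away from songs with a synthetic sound.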

The fourth type of music consumption that I will discuss is listening to loud music in an enclosed public space, where the main occupations are often dancing, drinking, taking drugs and getting a break from everyday life. This is an intense way of sharing music with others, and it is in particularly strong contrast to personalized music jukeboxes such as Pandora and listening to a Walkman or an iPod.

The acoustic architecture of clubs is carefully made to project recorded music only. For example, many clubs do not have a big stage for musicians, only a DJ booth built into the wall next to the dance floor. This set-up, much like muzak technology, also imposes the sound on the listeners, but there are several important differences. A club presents a predictable musical scenario for the regular guest; they know in advance what kind of music they will get, be it hip-hop, techno, disco, rock, tango or country. People often have to pay admission to get into clubs, which attests to their attraction. Club-goers often act in the capacity of music lovers, and the timbres of the music are high on their agenda, whereas muzak is certainly not your main interest when you go shopping.

The setting of club music can be highly charged with media effects. Rave parties have been organized since the 1980s, and DJs play electronic dance music to the accompaniment of laser light shows, images projected on the walls, and artificial fog.

The DJ is crucial to this type of music consumption. The DJ’s techniques in one sense resemble those of a radio host, but he can relate directly to the tastes of the crowd and is therefore far more flexible, and a good DJ will play along with a musical mood that is partly of his own making and partly based on the responses from the crowd. Since DJs are in such close contact with music lovers, they often pick up the new trends and promote them in the public sphere (for example, if they also have a radio DJ show), and they can also test out songs on the dance floor and see what kind of potential they have as dance hits (Connell and Gibson 2003: 182ff.). Club DJs rely on a smooth transition between songs and use a range of techniques to accomplish this. Typically, the DJ shifts between two turntables (or other platforms) and therefore always has the next song ready before the previous one is finished. He listens on headphones to prepare the next segue between songs, and often also speaks into a microphone to introduce songs.

When it comes to the club-goer’s experience it is important to note that the act of listening is often a primary activity, which means that the music and its content – rhythm, melody and lyrics – is the main focus. It may also take place in parallel with other things, such as ordering drinks and chatting, but the music remains a crucial reference throughout. In primary listening the person will be concentrating through the duration of the sound; distractions will not be accepted, and there will be a substantial gain in pleasure and satisfaction (Åberg 1999: 77-8).

This is most evident on the dance floor, where club-goers are likely to give maximum attention to the music. This provides a more visceral and bodily experience than muzak and hit radio: imagine, for example, the bass drones of the subwoofer at a modern rave club. It is a very loud experience, and it hits you in the belly as much as in the ears. The loud volume is intended to shut out any type of speaking or communication except glances and touches between the people on the floor. ‘Motion and escape is essential to rave and club culture’, argue Connell and Gibson (2003: 205). Another setting where the combination of movement and music is particularly strong is the aerobics class at a gym, where the music can be almost as loud as at a nightclub (DeNora 2000: 90).

The club-goers clearly engage in what I call timbre enjoyment. Theodore Gracyk supplies a productive way of thinking about timbre enjoyment. He argues that what attracts listeners to recorded music is the familiarity bred by hundreds of repetitions, the possibility of getting to know the material surface of the music in a sensual way. ‘We are free to savor and anticipate qualities and details that are simply too ephemeral to be relevant in live performance. When records are the medium, every aspect is available for our discrimination and thus for its interpretative potential’ (Gracyk 1996: 55). Here we touch upon a fundamental feature of sound media to which I will return in later chapters. There is a repetition of sounds going on which is very influential. Listeners know in advance that they will be hearing well-known hit songs over and over again, and the expected repetition is the main attraction. They expect to be musically satisfied only while they are listening, not after having contemplated the musical structures in some kind of intellectual way. It is in any event quite difficult to remember the subtle timbres of modern pop music. No matter how well we know the structure of a recording, the immediacy of listening is the main thrill. As Gracyk (ibid.: 61) points out, pop music’s precise quality is only known perceptually, while perceiving it.

The loudspeaker medium, 2008
One thing in particular unites the otherwise disparate practices that I am describing in this chapter: the fact that the producers basically distribute music that was recorded by other producers. It is the art of recirculation, and it is equally true for muzak corporations, radio DJs, club DJs and internet jukebox programmers. Indeed, it can also be claimed to include the iPod user, who creates a private playlist and puts it on his iPod. The basic technique consists of creating a pleasurable and/or interesting loudspeaker experience for the listeners.

Figure 6.2: Model of the loudspeaker medium.

Figure 6.2 shows the structure of a closed-circuit system for sound amplification, and it essentially contains music recordings, a microphone for the DJ to speak and sing into and a computer through which the music is organized in playlists. Often there is a professional mixing board (or its computer equivalent) in order for the producer to play back music from several different sources in an efficient way, but this can also be done with the selector on a domestic amplifier. In really large loudspeaker systems, at a big nightclub in Ibiza or in the Mall of America, the music can be delivered by satellite or by dedicated broadband wires (Wikipedia 2007, ‘Muzak’). There are also many small loudspeaker operations, of which the old-fashioned jukebox in the cafe is the most typical. The producers can project the sound according to two basic techniques that I introduced in chapter 1: loudspeaker placement and volume control. They decide which directions the sounds will go in, and can organize a sweet spot for a stereo system or 5.1 surround sound. They obviously also decide the volume at which the music is played. There are a variety of qualities in the reproduction of sound. Any ordinary loudspeaker contains bass and several treble elements, and reproduces the full range of musical sounds. It is also common to have subwoofers that reproduce the lowest part of the audio spectrum, which is often hard to hear but easier to feel. At the other end of the spectrum are megaphones, which have a muffled and narrow sound.

Loudspeaker systems can also be used to reinforce live public events in a specific locale or outdoor area, for example a political speech, an emergency message or a live concert. Public address systems have existed as long as there has been electrical amplification of sound (see Wurtzler 2007: 10–11). My interest in loudspeaker systems does not extend to these situations, not even such public events as a political rally or a live concert. These are often spectacular live events, with huge masses of people crowding together and reacting en masse. But they are localized in one unique place, and therefore are not really media phenomena in the way I conceive of them in this book. They do not rely on transporting the sound into other places (radio) or recording it for later publication (music recording).

Earphone escape
The loudspeaker music media often have no documentary authority or transparency. Throughout this chapter I have stressed that the main experience for music listeners is to savour the surfaces and timbres of the sound, perhaps to dance and sing along as well — which is to say that the attention of the listener stops at the loudspeaker. The listener has no expectation that there is more than this, for example, that there are journalists, hosts or other responsible persons involved.

Earphones are the ultimate equipment for loudspeaker living. Notice that many interfaces for sound are designed to be handled by humans with their hands and eyes and ears in conjunction, but this is most emphatically not the case for earphones (or for loudspeakers in general). Loudspeakers are often almost invisible, since they may be embedded in restaurant walls, in car doors or in your ears if you wear earphones. Furthermore, loudspeakers typically stay in the same place for a long time, and become fixed objects in people’s everyday surroundings.

This point is most clear in the case of earphones (or headphones, which are larger and worn outside the ears), which provide the same stable sound projection regardless of the movement of the wearer. If you walk away from a loudspeaker on the kitchen table, the sound appears weaker and its resonance changes, but this is obviously not the case with earphones.

Instead of just being channels for sounds, earphones can take on the character of presenting keynote sounds in the listener’s everyday environment (Schafer [1977] 1994: 10). Consider a natural phenomenon such as a river running through a city. The water sounds are always there, and ‘have imprinted themselves so deeply on the people hearing them that life without them would be sensed as a distinct impoverishment’ (ibid.). In a completely mobile setting, the wearable music player creates a similar keynote sound. Recorded music in these cases becomes a fixture in a person’s life.

From the beginning of the twentieth century, when people could play records as often as they liked, there developed a new and more inclusive way of engaging with music. The listeners would become more and more interested in comparing one recording with another (and live programmes with other live programmes), rather than comparing them with performances in a concert hall or other non-mediated settings. After more than a hundred years of this process of internal reference, intricate ways of listening have emerged among people in general. People have learnt not to listen to sound to identify the causes of the sound events. They listen without considering that the experience lacks shape, colour or smell because they never expected it to have shape, colour or smell in the first place. Michel Chion (1994: 29) calls this kind of listening ‘reduced listening’. Here it is the timbres of the sounds that are listened for, and not their reference back to some historical event. Arnheim ([1936] 1986: 35) says that minute discriminations in sound alone are desirable because they enrich ‘the aural vocabulary by whose help the loudspeaker describes the world’.
