CHI Nederland

14 January 2003

Sound design for virtual environments

SIGCHI.NL presentation by Dik Hermes on 14 January 2003

report by Geert de Haan

The presentation by Dik Hermes brought past times to life: it was as if we were back at school, at the introduction to hearing in a psychophysiology class. But not for long, not only because so much has been discovered about sound in recent years, but also because of Dik's highly enthusiastic style, which featured literally dozens of small and large demonstrations. Here the magician presented us with a wooden ball, and there he pulled an aluminium one out of his pocket. What follows is a summary of his presentation, with some comments added.

Dik started his presentation by arguing that if virtual environments are to be enriched with natural sounds, these sounds must be synthesized. Apart from the difficulty that you cannot simply record all the sounds you need, it is virtually (so to speak) impossible to edit such recorded sounds to fit your purpose.

However, even for relatively simple systems such as a ball rolling over a table, we do not know enough to calculate a natural-sounding signal that contains the perceptually relevant information about the size, speed, shape, material and roughness of the ball, and about similar physical properties of the table.

The presentation addressed some methodological issues that arise when we investigate the perceptual process that takes place when we listen to such everyday sounds.

In sound-research land, there are three different approaches: the physical school, the psychological or psycho-acoustic school, and the ecological one. Experimental psychologists will recognise these "schools" from vision research. Although there are major differences and sometimes difficulties between them, according to Dik it is of no use to go to war over this, simply because each school is very good at explaining particular features of sound. As an illustration, it seems obvious that increasing the energy of a sound source increases the loudness that we perceive. However, not all the time: remember that you have to adjust the volume of your walkman when the train stops at a station. Here we have a psycho-acoustic effect: there is less sound "in the air" and suddenly we experience the walkman as louder. Strange, isn't it? Dik also showed examples of missing sounds: replace some stretches of sound on a tape by nothing and the result, for both speech and classical music, sounds awful. Now fill the gaps with noise and, even though the speech still sounds funny, the music sounds rather natural… only it seems that there is some additional source of noise present.
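The missing-sound demonstration can be approximated in a few lines of code. The following is only a minimal sketch and not the material Dik used: it assumes a plain sine tone as the "tape", and the sample rate, gap length and the helper names `tone` and `interrupt` are all made up for illustration.

```python
import math
import random

SAMPLE_RATE = 8000  # samples per second (assumed value for this sketch)


def tone(freq_hz, duration_s, rate=SAMPLE_RATE):
    """Return a sine tone as a list of floats in [-1, 1]."""
    n = round(duration_s * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def interrupt(signal, period_s, gap_s, filler="silence",
              rate=SAMPLE_RATE, seed=0):
    """Overwrite the last gap_s seconds of every period_s seconds.

    filler="silence" replaces the gap by zeros,
    filler="noise" replaces it by uniform white noise.
    """
    rng = random.Random(seed)
    period = round(period_s * rate)
    gap = round(gap_s * rate)
    out = list(signal)
    for i in range(len(out)):
        if i % period >= period - gap:
            out[i] = rng.uniform(-1.0, 1.0) if filler == "noise" else 0.0
    return out


original = tone(440.0, 1.0)
with_silence = interrupt(original, period_s=0.25, gap_s=0.05)
with_noise = interrupt(original, period_s=0.25, gap_s=0.05, filler="noise")
```

To listen to the two versions, the sample lists could be scaled to 16-bit integers and written to a file with Python's standard `wave` module. Whether a simple tone gives the same impression as the speech and music fragments from the presentation is not something this sketch can tell.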

Three subsystems were discussed: first, the mechanical events causing vibrations; second, the radiation of these vibrations into the air, where they become sound and are picked up by the ears; and third, the perceptual process which interprets the resulting vibrations of the two tympanic membranes as events occurring around us.

Dik argued not only that we need to understand this process in order to design a natural-sounding virtual auditory environment, but also that designing virtual auditory environments gives us the opportunity to investigate which structured information in the sound signal we use when interpreting it as acoustic events. This and many other phenomena were illustrated by results from physical and perceptual experiments on the sources of structured information in the sound signal, in which spatial, temporal and spectral characteristics were varied as much as possible. The main set of illustrations consisted simply of rolling balls on the table: slow, fast, light balls, heavy balls, wooden balls, aluminium balls (…balls everywhere). The main point was that, with your eyes closed, it is very clear that one sound differs from another, although it is much harder to identify exactly what made the difference: the size of the ball, its velocity… Other examples were played from recorded sounds on a PC (in between Windows crashes, that is).

Depending on context, we as listeners are able to attribute more or less weight to each of these sources of information.

Halfway through the presentation, Dik changed to a new subject -at least to my ears- when he discussed research from psychoacoustics and auditory scene analysis. Here the former is best understood simply as sound-as-we-experience-it, whereas the latter refers to how our perceptual system seems to analyse the 3D sound space that we are in, in order to understand what we hear. Basically, we are able to perceive stereo sound. However, much like the bat, though less sophisticated, we are able to use additional information, e.g. from the reflection of sounds against the walls (relative to the original sound) and from things like the time delays within these reflections, to form a more-or-less complete mental picture or internal model of the environment. In a sense, these abilities are exploited by 3D surround-sound systems, which give us the experience of being in a real concert hall. Note, however, that that is exactly what these systems do.
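To get a feeling for the size of the time delays involved, the delay between the direct sound and a single wall reflection can be estimated with simple geometry. The sketch below is an illustration only, not something shown in the presentation: it assumes a flat 2D room with one perfectly reflecting wall along the line x = wall_x, a speed of sound of about 343 m/s, and a made-up function name `reflection_delay`.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees Celsius


def reflection_delay(source, listener, wall_x):
    """Seconds by which the first wall reflection lags the direct sound.

    source and listener are (x, y) positions in metres, both on the
    same side of a wall that runs along the line x = wall_x.
    """
    sx, sy = source
    lx, ly = listener
    direct = math.hypot(lx - sx, ly - sy)
    # Mirror the source in the wall; the reflected path is as long as
    # the straight line from this mirror image to the listener.
    image_x = 2 * wall_x - sx
    reflected = math.hypot(lx - image_x, ly - sy)
    return (reflected - direct) / SPEED_OF_SOUND


# Source 3 m in front of the wall, listener 4 m further along the same line.
delay = reflection_delay(source=(0.0, 0.0), listener=(4.0, 0.0), wall_x=-3.0)
```

In this example the reflected path is 10 m against 4 m for the direct path, so the reflection arrives about 17.5 ms later. A real room has many walls and many overlapping reflections, which this single-wall sketch ignores.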

What they don't do, however -and here we again learn how SIGCHI.NL presentations can save you a bundle- is improve the quality of the sound. When you want to listen to quality music, you had better use a decent pair of headphones (and keep the volume well below 10!).

Curriculum vitae

Dik Hermes has worked in Eindhoven since 1985, first at IPO (Institute for Perception Research). Since 2001 he has been an assistant professor in the capacity group Human-Technology Interaction at the Dept. of Technology Management. His expertise covers speech research, intonation, sound design, and sound for virtual environments.

Dik Hermes can be reached at the following address (note the extra spaces to avoid spam):

Dik J. Hermes

D. J.

Dept of Technology Management, Capacity Group Human-Technology Interaction

Eindhoven University of Technology

P.O. Box 513

5600 MB Eindhoven