Research

Determinants of voice learning

R. Zäske, J.M. Kaufmann & S.R. Schweinberger

Recognizing people from their voices is a routine feat in social interactions that critically depends on the degree of familiarity with a speaker (Yarmey et al., 2001). It has been suggested that the processing of unfamiliar and familiar voices involves partially distinct cortical areas (von Kriegstein & Giraud, 2004) and differs qualitatively (Kreiman & van Lancker Sidtis, 2011). However, the neural processes mediating the transition from unfamiliar to familiar voices, and the conditions under which voices are learned, remain largely unexplored. Forensic research began studying voice learning and recognition in the 1930s to improve the reliability of earwitness testimony for once-heard “unfamiliar” voices, but this branch of research continues to rely almost exclusively on behavioural measures. By contrast, more recent neuroscientific research is strongly inspired by cognitive models of face perception. With respect to learning, these studies tend to look at short-term implicit effects of priming and adaptation. Accordingly, current models of person perception are devoid of learning mechanisms associated with explicit speaker recognition (Belin et al., 2004; Campanella & Belin, 2007). Thus, the applicability of these models to everyday face and voice recognition is limited.

Based on the notion that voice learning may be affected by characteristics of (1) the stimulus material, (2) speaker and listener attributes, and (3) specific task demands, we will study how voice learning is affected by dynamic information in faces, distinctiveness and accents, speaker and listener age, and selective attention. To this end, we will relate behavioural measures of learning and recognition to electrophysiological and functional magnetic resonance imaging data, which provide high temporal and high spatial resolution, respectively. Taken together, we expect the present studies to contribute significantly to our understanding of how voice representations are formed in person memory.