I See Me, You See Me

I See Me, You See Me: Inferring Cognitive and Emotional Processes from Gazing Behaviour

Edited by

Pedro Santos Pinto Gamito and Pedro Joel Rosa

I See Me, You See Me: Inferring Cognitive and Emotional Processes from Gazing Behaviour

Edited by Pedro Santos Pinto Gamito and Pedro Joel Rosa

This book first published 2014

Cambridge Scholars Publishing
12 Back Chapman Street, Newcastle upon Tyne, NE6 2XX, UK

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Copyright © 2014 by Pedro Santos Pinto Gamito, Pedro Joel Rosa and contributors

All rights for this book reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.

ISBN (10): 1-4438-5460-3, ISBN (13): 978-1-4438-5460-3

This book is for my sweet daughters, Maria and Matilde, and to a glorious future ahead. And to my dear wife, Fátima, as always. Pedro Gamito

I dedicate this book to my family for their unconditional support and to Petra, my partner, for being a constant source of motivation. Pedro J. Rosa

CONTENTS

Table Index
Figure Index
Acknowledgments

Chapter One
Voting on a Face: The Importance of Appearance-Based Trait Inferences in a Political Candidate Evaluation: An Eye Tracking Approach
Diogo Morais, Pedro J. Rosa, Rodrigo Brito, Inês Martins, Filipa Barata, Jorge Oliveira, Pedro Gamito, Fábio Soares and Catarina Sotto Mayor

Chapter Two
Modelling Human Visual Detection of Anti-Social Behaviour
John R. Elliott and James A. Renshaw

Chapter Three
Eye of the Beholder: Visual Search, Attention and Product Choice
Marija Banović, Pedro J. Rosa and Pedro Gamito

Chapter Four
Cognition and Control of Saccadic System
Anshul Srivastava, Vinay Goyal, Sanjay Kumar Sood and Ratna Sharma

Chapter Five
Gaze Fixation Patterns in a Route with Obstacles: Comparison Between Young and Elderly
Inês P. Santos and Leonor Moniz-Pereira

Chapter Six
The Use of a Benchmark Fixation Deviation Index to Automate Usability Testing
Jhani A. de Bruin, Katherine M. Malan, Jan H. P. Eloff and Marek P. Zielinski

Chapter Seven
A Multimodal Web Usability Assessment Based on Traditional Metrics, Physiological Response and Eye-Tracking Analysis
José Laparra-Hernández, Juan-Manuel Belda-Lois, Jaime Díaz-Pineda and Álvaro Page

Chapter Eight
Seeing is Believing: Perception of Disability and its Impact on Sport Marketing
Carsten Möller, Hagen Müller, Peter R. Reuter and Roland Zumsande

Chapter Nine
Affective and Psychophysiological Responses to Erotic Stimuli: Does Colour Matter?
Pedro J. Rosa, Cláudia Caires, Liliana Costa, Luís Rodelo and Ludmila Pinto

Chapter Ten
A Pupillometric Approach to the Study of the Strength of Memory Signal Following Intra- and Interhemispheric Word Recognition
Jorge Oliveira, Rodrigo Brito, Diogo Morais, Rita Lourenço, Filipa Barata and Pedro Gamito

Chapter Eleven
Intentional Dynamics in Deviant and Non-Deviant Sexual Self-Regulation from the First Person Stance
Patrice Renaud, Kévin Nolet, Sylvain Chartier, Dominique Trottier, Mathieu Goyette, Jean Rouleau, Joanne Proulx and Stéphane Bouchard

Chapter Twelve
Identifying Gaze Gestures from Noisy Image-Based Eye Movement Data
Stefania Cristina and Kenneth P. Camilleri

Index

TABLE INDEX

Table 1-1: Mean ratings on inferred traits per face category
Table 2-1: Exported data sheet
Table 2-2: Mean Fixation Duration and Standard Deviations for Gaze Points for both groups
Table 2-3: Values of the product of the standard deviations for Gaze points for both groups
Table 2-4: Mean Fixation Durations and Standard Deviations for Gaze Points for video clips with incidents
Table 2-5: Values of the product of the standard deviations for Gaze points before and after the incidents for both groups
Table 5-1: Percentage of duration of fixations for young adults and elderly, before task initiation
Table 5-2: Percentage of number of fixations for young adults and elderly, before task initiation
Table 5-3: Percentage of duration of fixations for young adults and elderly, during task performance
Table 5-4: Percentage of number of fixations for young adults and elderly, during task performance
Table 6-1: Profile of participants in the eye tracking usability study (Gelderblom, De Bruin, and Singh 2012)
Table 6-2: Benchmark user classification criteria per task for each participant (# – Number of fixations, BU – Benchmark user)
Table 6-3: Average task FDI and time per task for each participant (# – Number of fixations)
Table 7-1: Menu Types
Table 7-2: Web style distribution. Presence (“X”) or absence (-) of usability parameters: “Go Home” (P1), “Go Up” (P2), “Web site Map” (P3), “Hover and Click” (P4), “Background Image” (P5), “Breadcrumbs” (P6) and “Menu Type” (P7)
Table 7-3: Correlation matrix of rotated components and questions
Table 7-4: Significance of traditional variables
Table 7-5: Significance of physiological and eye-tracking variables
Table 8-1: FD in seconds by condition for AOIs related to the endorser
Table 11-1: Pairwise comparison based on LSD for GRAD and category of stimuli
Table 11-2: Pairwise comparison based on LSD for GRADV and category of stimuli
Table 11-3: Pairwise comparison based on LSD for GRADCV and category of stimuli

FIGURE INDEX

Fig. 1-1: Task 1 Stimuli
Fig. 1-2: Sequence and competing visual stimuli (latin square; example) used in task 2
Fig. 1-3: Average number of selections for “favourite face” across categories
Fig. 2-1: Eye tracking setup
Fig. 2-2: Video 2 – Attempted theft from a parked vehicle (using participant 10 as the key)
Fig. 2-3: Video 6 – A mugging in broad daylight (using participant 10 as the key)
Fig. 2-4: Video 11 – Street violence (using participant 13 as the key)
Fig. 3-1: Duration of fixations on chosen product and non-chosen products
Fig. 3-2: Number of fixations on chosen product and non-chosen products
Fig. 3-3: Response time during product choice
Fig. 3-4: Time to first fixation on chosen product and non-chosen products
Fig. 3-5: First fixation duration on chosen product and non-chosen products
Fig. 4-1: Schematic representation of a reflexive saccade towards a peripheral target (black dot)
Fig. 4-2: Schematic representation showing saccadic eye movements controlled by bottom-up and top-down factors
Fig. 4-3: Schematic representation of reflexive saccades with fixation offset condition and introduction of gap between fixation offset and target appearance
Fig. 5-1: Floor plan of the obstacle course with the 12 randomly arranged pylons
Fig. 5-2: Percentage of duration of fixations for young adults and elderly, before task initiation
Fig. 5-3: Percentage of number of fixations for young adults and elderly, before task initiation
Fig. 5-4: Percentage of duration of fixations for young adults and elderly, during task performance
Fig. 5-5: Percentage of number of fixations for young adults and elderly, during task performance
Fig. 6-1a–c: Screen shots of the user interfaces of tasks 1–3
Fig. 6-2a–c: Heat maps of participants performing task 2
Fig. 6-3: Process diagram for calculating the FDI and mapping Benchmark Deviation Areas
Fig. 6-4a–c: Gaze plot images from task 2
Fig. 6-5a–d: Euclidean distance clustering of fixations
Fig. 6-6a–d: Cluster polygons on the UIs
Fig. 6-7a–d: Cluster with a high FDI count mapped back onto the UIs
Fig. 7-1: Attachment of GSR electrodes on the palm, EMG electrodes on the corrugator supercilii and on the zygomaticus major
Fig. 8-1: Target stimulus for each group of participants
Fig. 8-2: Target stimulus for the two conditions
Fig. 8-3: Target stimulus for each group of participants
Fig. 8-4: Means with corresponding SEM of TFF data for the defined AOIs and both stimulus conditions
Fig. 8-5: Visualization of the amount of fixation counts for both participant groups
Fig. 9-1: The Self-Assessment Manikin (SAM) measures of pleasure and arousal
Fig. 9-2: Examples of experimental stimuli
Fig. 9-3: Perceptual complexity mean ratings for each condition
Fig. 9-4: Valence and arousal mean ratings as a function of colour
Fig. 9-5: Mean picture ratings in a 2-dimensional space
Fig. 9-6: Correlation between subjective arousal and SCR (log μS)
Fig. 9-7: SCR amplitude (log transformed) as a function of colour
Fig. 9-8: Peak dilation of pupil across presentation conditions
Fig. 9-9: Correlation between SCR and pupil dilation amplitude
Fig. 9-10: Correlation between relative luminance and peak dilation amplitude
Fig. 10-1: Mean discrimination scores as a function of the retention intervals and visual field
Fig. 10-2: Mean reaction times as a function of the retention intervals and visual field
Fig. 10-3: Event-related pupil responses as a function of retention and visual field
Fig. 10-4: Pupil old/new effects according to visual field of encoding and retrieval
Fig. 11-1: Virtual characters used as sexual stimuli
Fig. 11-2: Gaze behaviour recording in virtual immersion
Fig. 11-3: GRAD and PPG for a typical sexually non-deviant subject and a typical child molester, with the female child stimulus
Fig. 11-4: Erectile response in Z scores for non-deviant control subjects and child molesters on the different stimuli categories
Fig. 11-5: Gaze behaviour as expressed in radial angular deviation (GRAD)
Fig. 11-6: The velocity of gaze radial angular deviation (GRADV) as expressed in degrees per second
Fig. 11-7: The gaze radial angular deviation coefficient of variation (GRADCV)
Fig. 12-1: Iris-to-screen coordinates mapping, using a simple geometrical method
Fig. 12-2: Eye movements in eight different directions are translated into characters
Fig. 12-3: On-screen gaze markers are widely spaced in order to distinguish between separate fixation periods reliably
Fig. 12-4: Gaze gestures of increasing complexity were performed using four on-screen markers as guidance
Fig. 12-5: Incorrect iris segmentation
Fig. 12-6: In case the onset of a fixation period is missed, gestures following similar paths cannot be distinguished from one another
Fig. 12-7: A fixation period onset at frame number 420 was left unidentified

ACKNOWLEDGMENTS

Although only two names are on the book cover, two others belong to colleagues who put a great deal of effort into editing this book: Rodrigo Brito and Filipa Barata. Many thanks for your valuable support. Our colleagues Jorge Oliveira and Diogo Morais were, as usual, always within reach. Thanks for being our wingmen. Our dear students Nuno Santos, Fábio Soares and Catarina Sotto-Mayor were enthusiastic supporters during the entire process of putting the Eye Tracking, Visual Cognition and Emotion conference together. Many thanks. Paulo Sargento, thanks for being the eternal facilitator; your friendship is much treasured. And, of course, a vigorous round of applause must go out to the authors and the reviewers: this book is, ultimately, yours. The editorial team at Cambridge Scholars Publishing were unexcelled throughout the editorial process. Many thanks.

CHAPTER ONE

VOTING ON A FACE: THE IMPORTANCE OF APPEARANCE-BASED TRAIT INFERENCES IN A POLITICAL CANDIDATE EVALUATION: AN EYE TRACKING APPROACH

DIOGO MORAIS†1,2,3, PEDRO J. ROSA1,2,4,5,6, RODRIGO BRITO1,2, INÊS MARTINS1, FILIPA BARATA1, JORGE OLIVEIRA1,2, PEDRO GAMITO1,2, FÁBIO SOARES1 AND CATARINA SOTTO MAYOR1

† [email protected]
1 School of Psychology and Life Sciences, Lusophone University of Humanities and Technologies (ULHT), Lisbon, Portugal.
2 COPELABS – Cognition and People-centric Computing Laboratories (ULHT).
3 CICANT – Centre for Research in Applied Communication, Culture, and New Technologies (ULHT).
4 Instituto Universitário de Lisboa (ISCTE-IUL), Cis-IUL, Portugal.
5 Centro de Investigação em Psicologia do ISMAT, Portimão, Portugal.
6 GIINCO – Grupo Internacional de Investigación Neuro-Conductual, Barranquilla, Colombia.

Abstract

A large number of studies have addressed the role of politicians’ appearance in the outcome of political elections in Western democracies. More recently, most of these studies have focused on the facial appearance-based trait inferences people make when considering an electoral choice between candidates. This chapter describes a pilot study exploring the use of eye tracking data to understand the effects of appearance on preference for faces presented as those of candidates. Results show that darker-skinned faces are judged as more attractive and, in the case of males, as more threatening, and that they are chosen more often as preferred political candidates. ET data show that these faces are also fixated faster than lighter-skinned faces, indicating heightened attention to faces that are, in effect, more salient positive and negative stimuli. Limitations of the study point to the need for an expanded use of eye-tracking with crossover designs.

Introduction

Despite the widespread dissemination of eye-tracking (ET) techniques in the social and psychological sciences, as well as in market research and human resources management, there are a number of scientific disciplines into which the use of ET has barely penetrated. One of these is political psychology, which mostly studies the psychological processes involved in the relation between mass publics and democratic politics. In the Western world, the importance of democratic politics is paramount, since it impacts every aspect of people’s lives. There are, however, specific moments when people are asked to intervene directly in the democratic process by electing their representatives. In elections, people’s choices are usually based on a mix of political party sympathies, policy preferences, and the available individual candidates. One of the factors that influence how people choose among candidates is a heuristic process of inferring candidates’ personality traits from their facial appearance (Lawson et al. 2010; Little et al. 2007; Berggren, Jordahl and Poutvaara 2010; Olivola and Todorov 2010). Indeed, people easily and consistently infer personality traits from facial features (Todorov, Said, Engell, and Oosterhof 2008). However, facial features are also used to infer social category membership (e.g. gender, age, race), thereby activating category stereotypes, which are mostly based on personality traits, and thus influencing electoral choice. Race and gender in particular are prone to perceptual biases favouring higher-status or dominant groups (e.g. Whites, males) over lower-status or subordinate groups (e.g. Blacks, females) (Jost 2001; Jost, Pelham and Carvallo 2002). In this chapter we report a pilot study testing the effect of skin tone and gender on the inference of personality traits and its consequences for electoral choice, while exploring the usefulness of ET data for this purpose.


Candidates’ appearance and voting choice

In common-sense discourse, politicians’ appearance is often believed to have an effect on their electoral success. Likewise, candidates’ appearance has long been a central concern for political campaign organizers (Schlesinger 1994). Campaigners view candidates as a marketable asset, just like any product, service, or brand, and they do whatever it takes to make the “package” as attractive as possible for the consumer (i.e. the voter). Thus, a good deal of work goes into making candidates more attractive, to the point of photoshopping pictures of candidates for campaign purposes. On the other hand, as Lawson and colleagues (2010) note, political scientists remain sceptical about the impact of candidates’ appearance on their electoral success, preferring to believe that electoral success depends almost exclusively on voting systems, ideological and political party identification, and the real or perceived performance of incumbents in key policy areas (Merrill and Grofman 1999; Miller and Shanks 1996). So, is this effect of appearance on voter choice simply a preconception, or does it correspond to reality? In fact, a growing body of research offers empirical support for the idea that politicians, just like people in other professions, are judged heuristically by their non-verbal behaviour, by their image in general, and by their facial features in particular, and that these judgements have an impact on voters’ choices (Mattes et al. 2010). Thus, contrary to what one might like to believe, voters are not purely rational thinkers, let alone decision makers, and can be expected to make decisions based on heuristics (Quattrone and Tversky 1988). The past ten to fifteen years have witnessed a surge of research in political psychology on this subject (Little et al. 2007; Ballew and Todorov 2007; Castelli et al. 2009), but seminal research going back to the 1980s (e.g. Sigelman, Sigelman, and Fowler 1987) shows that physically attractive candidates obtain better results than less attractive ones (Sigelman et al. 1986; Sigelman, Sigelman and Fowler 1987). More recent research found that participants’ inferences of candidates’ competence based on their facial appearance (participants had no prior knowledge of the candidates) predicted those candidates’ actual electoral success, whereas inferences of trust and likeability did not (Todorov et al. 2005).

Race, gender, and voting choice

Gender and race may play a role in voting choice in several different ways. People are usually biased to favour members of their own group (i.e. ingroup bias; Brewer 2007; Tajfel and Turner 1986), because of identity concerns or trust. But they may also be biased to prefer members of dominant or higher-status groups (e.g. males, Whites), irrespective of whether these are ingroup or outgroup members, because such groups are considered to be more experienced and competent in wielding power (Jost 2001; Jost, Pelham and Carvallo 2002). In both cases, stereotypes of the groups concerned play a role. Voters may use non-party social category stereotypes to decide whether candidates are honest, trustworthy, competent, or share the same political ideology and policy preferences as the voter; in the US, women are often viewed as more honest than men, and Blacks as more liberal (i.e. favouring social welfare policies) than Whites (McDermott 1998). In Portugal, subtle (but not blatant) prejudice against Blacks is relatively widespread (Vala et al. 1999), and unconscious perceptual biases and stereotyping also operate. In particular, Whites tend to spend less time processing Black faces in order to infer traits than they do processing White faces (intergroup time bias), and individual differences in this time bias correlate with implicit measures of prejudice (Vala et al. 2012), which indicates that people who spend less time processing a target will tend to apply negative stereotypes more easily. Finally, recent research shows that Whites use facial features and skin tone independently to infer how much Black stereotypes apply to a member of that category, influencing their affective reactions (Hagiwara, Kashy, and Cesario 2012). Interestingly, because afrocentric features are also present in Whites, differences in afrocentric features among Whites also influence judgements, with negative practical effects (i.e. on sentencing; Blair, Judd, and Chapleau 2004). What is not known, however, is whether skin tone also influences judgements independently of categorization.

Can eye tracking tell the tale?

Humans show a consistent pattern of eye movements which can best be described as a ‘saccade and fixate’ strategy (Land 1999). When we look at a scene or search for a specific area in a visual field, our eyes usually move every 250–350 ms. These movements serve to bring the fovea centralis to the part of the visual field that is going to be processed in high resolution (Rayner & Castelhano 2007). This process is at the basis of the selectivity of attention first noted by William James (1890), well exemplified in his Latin phrase ‘Pluribus intentus, minor est ad singula sensus’, which expresses the fact that the human sensory system is limited and unable to attend to many things at the same time. Considering this process, choosing a candidate is perhaps not that different from choosing a product on a shelf: in both cases we are overwhelmed with information and have to select and pay attention to what seems most relevant.

The visual field is inspected piecemeal, not in its entirety, and attention is the selective process by which the details to be inspected are chosen. It coordinates the perception-action cycle and preserves goals over time, despite its limited capacity. This imperative selection of information characterizes visual selective attention. For any visual stimulus (e.g. an image of a candidate), attention can be deployed in one of two ways: endogenously or exogenously (Posner 1980). In endogenous attention, attention is assumed to be under the overt control of the subject (e.g., “I am searching for a serious politician, and I will attend to a face that expresses this quality”). This is also known as “top-down” or goal-driven attention (Yantis 1998). Endogenous attention is voluntary but has a slow time course. In contrast, attention can also be reflexive or exogenous, when it is driven by an external stimulus that automatically captures attention to a specific visual area. This has been defined as “bottom-up” or stimulus-driven attention. For instance, the face of a Black candidate among faces of White candidates will capture attention exogenously due to a contrast effect. Exogenous attention attracts attention automatically and has a faster time course than endogenous attention (Cheal & Lyon 1991). When a specific visual area or object in a scene is selected by attention, it is processed at high resolution, while other visual areas or elements in the visual field are concurrently suppressed. In other words, when a visual area captures interest, the gaze moves to fixate it. This attentional process is achieved through both bottom-up and top-down mechanisms. The former is related to the visual elements themselves (e.g. contrast, luminance); the latter is initiated from higher cortical centres and driven by affective states, goals, memory or context (Rayner & Castelhano 2007). The combination of these mechanisms, along with other cognitive faculties, is at the basis of selective visual attention.
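Since all of the ET measures reported below (fixation counts, durations, time to first fixation) depend on segmenting the raw gaze stream into fixations and saccades, it may help to see how that segmentation is typically done. The following is a minimal Python sketch of the classic dispersion-based (I-DT) algorithm; the thresholds are illustrative assumptions, not the settings used by the tracking software in this study.

    def detect_fixations(samples, max_dispersion=30.0, min_duration=0.1):
        """Dispersion-threshold (I-DT) fixation detection.
        samples: chronologically ordered (time_s, x_px, y_px) tuples.
        Returns (t_start, t_end, centroid_x, centroid_y) tuples."""
        fixations = []
        i = 0
        while i < len(samples):
            j = i
            # Grow the window while its dispersion stays below threshold.
            while j + 1 < len(samples):
                xs = [p[1] for p in samples[i:j + 2]]
                ys = [p[2] for p in samples[i:j + 2]]
                if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                    break
                j += 1
            if samples[j][0] - samples[i][0] >= min_duration:
                window = samples[i:j + 1]
                cx = sum(p[1] for p in window) / len(window)
                cy = sum(p[2] for p in window) / len(window)
                fixations.append((samples[i][0], samples[j][0], cx, cy))
                i = j + 1  # restart after the detected fixation
            else:
                i += 1     # no fixation here; slide the window forward
        return fixations

Everything between two successive fixations is then treated as a saccade, and the time to first fixation on a face is simply the start time of the first fixation whose centroid falls inside that face’s area of interest.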

Study Overview

The aims of this study were to explore the role of race and gender cues in facial appearance-based inferences of the personality traits of targets presented as potential candidates, the effect of these inferences on voting preferences for those faces, and the role of eye-tracking data in understanding the relation between these effects. We decided not to use actual Portuguese candidates’ faces because a) Black candidates in Portuguese politics are scarce and b) candidates are often associated with a specific political history or party. We also chose not to use the faces of US politicians because a pre-test showed that participants thought they did not look like Portuguese politicians. Finally, we chose not to use White vs Black faces, because the latter vary greatly in both skin tone and afrocentric features, thereby confounding the independent effects of these two dimensions (Hagiwara et al. 2012). Instead, we manipulated the skin tone of male and female White computer-generated faces from existing databases, creating matched dark-skinned versions. We were interested in the effects of gender (male vs female) and skin tone (dark vs light) on trait inferences and voting choices, as well as in the effects of the inferred traits on the choices. We were also interested in the difference in attention elicited by these cues, as measured by speed of fixation.

Method

Participants

Thirty Portuguese undergraduate students (16 males; 25 Whites and 5 Blacks; mean age 21 ± 2 years) at Lusophone University were recruited by research assistants on campus. They were asked to participate solely to help advance scientific knowledge, and they participated voluntarily, receiving neither course credits nor other external incentives. Potential participants with prior experience of eye-tracking experiments, or who had previously worked with any of our research team members, were screened out. Only participants with normal or corrected-to-normal visual acuity were included. 62% of participants reported having voted in the last general elections, but 86.7% did not view themselves as represented in those elections.

Procedure and stimuli

Participants were told that they would be participating in an experiment to evaluate political candidates. Each participant was seated individually in a soundproof room, 60 cm from an eye-tracking screen, and asked to keep their eyes focused on the screen. The experimenter explained that they would carry out two tasks. In Task 1, they would have to rate 80 images of faces, each on four personality traits.


Before the actual task, participants carried out a training task (a simulation using four images, one from each category) to guarantee that the instructions were fully understood. Each stimulus (face) was presented for 2000 ms, preceded by a fixed inter-stimulus interval (ISI; 1000 ms) and a fixation point (500 ms), and followed by an instruction screen asking the participant to rate the face on each of the four traits, with no time limit (a schematic sketch of this trial sequence follows Figure 1-1). The faces were presented in random order at a screen resolution of 1280 x 1024 pixels. Task 1 lasted around 30 minutes. Faces varied in gender (male vs. female) and skin tone (light vs. dark). We randomly selected forty pictures (twenty male and twenty female) of bald Caucasian faces from the database generated with FaceGen Modeller 3.1 (Singular Inversions 2004), as described in Oosterhof and Todorov (2008).

Fig. 1-1: Task 1 Stimuli
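As a schematic illustration of the trial sequence just described (1000 ms ISI, 500 ms fixation point, 2000 ms face, untimed rating screen), the following Python sketch shows the Task 1 loop; show and collect_rating are hypothetical stand-ins for whatever presentation software drove the eye-tracking screen, not functions from this study.

    import random

    TRAITS = ["attractive", "competent", "trustworthy", "threatening"]

    def run_task1(faces, show, collect_rating):
        """Schematic Task 1 loop. `show(stimulus, seconds)` displays a
        stimulus for a fixed duration; `collect_rating(trait)` waits for
        an untimed 1-9 response. Both are assumed helper functions."""
        random.shuffle(faces)                # faces in random order
        ratings = {}
        for face in faces:
            show("blank", 1.0)               # inter-stimulus interval
            show("fixation_point", 0.5)      # central fixation point
            show(face, 2.0)                  # face stimulus
            ratings[face] = {t: collect_rating(t) for t in TRAITS}
        return ratings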

From these, forty additional dark-skinned faces were created with the image-editing software Adobe Photoshop CS5. In this darkening process, based on Hue, Saturation and Lightness adjustments (+5, +14 and -39 respectively, using the white faces as reference), only the skin tone was altered. Altogether, eighty faces within four categories served as experimental stimuli (see Figure 1-1). In Task 2, we created a competing visual stimuli paradigm in which participants had to vote (by mouse-clicking) for one out of four ‘candidates’ whose faces were presented simultaneously on the screen. The faces used were the same as those presented individually in Task 1, and the stimulus presentation followed the same steps used in Task 1: ISI, fixation point and stimuli. There was no time limit for this choice. Each set of images matched the lighter and darker-skinned versions of the images of two males and two females, and was presented four times in a latin square fashion, that is, each face was presented once in each of the four possible positions (top-right, down-right, top-left, down-left). The sets were presented randomly, as in Task 1. An example of the sequence is presented in Figure 1-2 below. The same software and equipment used in Task 1 was used to design and present this task. We were interested in understanding which faces would be selected more frequently, and which facial features, traits, and social categories (namely gender and skin tone) would influence participants’ choice. In both tasks, the eye-tracking apparatus recorded a number of different measures; in this chapter, we report only the time to first fixation of the facial stimuli.
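The darkening manipulation can be approximated in code. The sketch below applies the reported Hue/Saturation/Lightness offsets to a single RGB pixel using Python’s standard colorsys module; mapping Photoshop’s slider units onto colorsys’s 0-1 ranges in this way is an assumption, and Photoshop’s own algorithm is not guaranteed to match it exactly.

    import colorsys

    def darken_pixel(r, g, b, dh=5 / 360.0, ds=0.14, dl=-0.39):
        """Shift one RGB pixel (floats in 0-1) in HLS space, mirroring the
        chapter's +5 hue, +14 saturation, -39 lightness adjustments."""
        h, l, s = colorsys.rgb_to_hls(r, g, b)
        h = (h + dh) % 1.0                    # hue wraps around
        l = min(max(l + dl, 0.0), 1.0)        # clamp lightness to 0-1
        s = min(max(s + ds, 0.0), 1.0)        # clamp saturation to 0-1
        return colorsys.hls_to_rgb(h, l, s)

Applied to every skin pixel of a light-skinned face, such a transform lowers lightness while leaving the geometry of the face (and thus its afrocentric features) untouched, which is exactly what the manipulation requires.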

Fig. 1-2: Sequence and competing visual stimuli (latin square; example) used in task 2

In order to minimize contrast during picture presentation, the background was a grey colour (RGB: 150, 150, 150). All images were resized to a resolution of 320 x 256 pixels. Altogether, 80 trials were presented, with one image at each corner of the screen in the latin square design. The pictures subtended 7.69° x 6.26° of visual angle at a viewing distance of 60 cm. The instructions asked participants to choose, in each set, the face of the candidate whom they perceived to be the best possible candidate and to whom they would give their vote if this were a real election scenario. To register their choice, they had only to click on the selected candidate’s face, which triggered the next stimulus sequence. The duration of this task ranged from 7 to 15 minutes. At the end of the second task, participants were taken to another room where they filled in the demographic/political participation questionnaire, as well as the consent form. They were also debriefed about the experiment in general, asked to make suggestions, and requested not to comment to their colleagues on the experiment’s details and objectives. The whole experiment was carried out in the Experimental Laboratory at ULHT during September 2012, between 8 AM and 12 PM.
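The reported stimulus sizes in degrees follow from the standard visual-angle formula, theta = 2 * arctan(s / 2d), where s is the physical size of the image and d the viewing distance. A quick check in Python, assuming a pixel pitch of roughly 0.0254 cm per pixel for the monitor (an assumption; the chapter does not report the pitch), approximately reproduces the stated values:

    import math

    def visual_angle_deg(size_cm, distance_cm=60.0):
        """Visual angle subtended by a stimulus of physical size size_cm."""
        return math.degrees(2 * math.atan((size_cm / 2) / distance_cm))

    PITCH_CM = 0.0254                        # assumed pixel pitch, cm/px
    print(visual_angle_deg(320 * PITCH_CM))  # ~7.7 deg wide (reported 7.69)
    print(visual_angle_deg(256 * PITCH_CM))  # ~6.2 deg high (reported 6.26)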

Apparatus

Stimuli were presented and eye movements recorded on a Tobii T60 Eye Tracking System (Tobii Technology AB, Sweden), integrated into a 17" TFT monitor and connected to an Intel Core 2 Duo 6550 desktop computer. Gaze data from both eyes were recorded at 60 Hz with an average accuracy of 0.5° of visual angle.

Measures

Image rating scales. Each image was rated by every participant on a nine-point scale (1 – Not at all [trait], to 9 – Extremely [trait]) on four different traits: ‘attractive’, ‘competent’, ‘trustworthy’, and ‘threatening’, selected from the nine traits on which the original images had already been rated by a sample of US undergraduate students (the other five being ‘dominant’, ‘frightening’, ‘extroverted’, ‘likeable’, and ‘mean’). We chose these four traits because previous studies suggested that they would be the most promising in either predicting electoral results or showing significant differences across groups.

Results

All statistical analyses were performed using IBM SPSS 20.0.

Trait inferences (Task 1)

The aims of this first task were a) to test the effects of gender and skin tone on the inference of each of the four traits (attractiveness, competence, trustworthiness and threat) from each face, and b) to test the relation between these inferences and the voting choices of Task 2. For each participant, we aggregated the ratings on each trait across all the faces of each category (light-skinned males, light-skinned females, dark-skinned males, and dark-skinned females) (see Table 1-1).
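This aggregation step amounts to one mean per participant, face category, and trait. A minimal pandas sketch (with hypothetical column names, since the original data files are not described) would be:

    import pandas as pd

    def aggregate_ratings(df: pd.DataFrame) -> pd.DataFrame:
        """df: long-format ratings with columns 'participant', 'category',
        'trait', and 'rating' (assumed names). Returns one mean rating per
        participant x category x trait, the unit of analysis below."""
        return (df.groupby(["participant", "category", "trait"],
                           as_index=False)["rating"].mean())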

Table 1-1: Mean ratings on inferred traits per face category

              Light-skinned             Dark-skinned
              Males        Females      Males        Females
              M (SD)       M (SD)       M (SD)       M (SD)
Attractive    2.8 (1.18)   3.34 (1.36)  3.13 (1.28)  3.92 (1.85)
Competent     4.5 (1.20)   4.74 (1.34)  4.78 (1.28)  4.94 (1.48)
Trustworthy   4.3 (1.26)   4.77 (1.49)  4.82 (1.28)  5.05 (1.55)
Threatening   2.6 (1.02)   2.42 (1.16)  3.13 (1.27)  2.40 (1.34)

We then carried out a mixed repeated-measures ANOVA for each of the four traits, with two within-subjects factors (target-gender: male vs. female; target-colour: light vs. dark) and one between-subjects factor (participant gender). The Greenhouse-Geisser correction was applied when the sphericity assumption was violated (Field 2013), and the Bonferroni correction was used for all pairwise comparisons.

On attractiveness, we found a significant effect of target-gender, F (1, 26) = 11.70, p = .002, and a significant effect of target-colour, F (1, 26) = 9.14, p = .006, qualified by a marginal interaction effect of target-gender and target-colour, F (1, 26) = 3.06, p = .092, itself qualified by a triple interaction effect of target-gender, target-colour, and participant-gender, F (1, 26) = 5.56, p = .026. We therefore analysed male and female participants separately. For men, we found a marginal effect of target-gender, F (1, 11) = 4.71, p = .053, and a significant effect of target-colour, F (1, 11) = 4.96, p = .048, qualified by an interaction effect of target-colour and target-gender, F (1, 11) = 6.11, p = .031. T-tests indicated that men rated dark-skinned females as more attractive than dark-skinned males, t (11) = 2.59, p = .025, and more attractive than light-skinned females, t (11) = 2.77, p = .018, but did not rate light-skinned males and females, or light-skinned and dark-skinned males, differently (ns). In other words, males found dark-skinned female targets particularly attractive. For female participants, we found a significant effect of target-gender, F (1, 15) = 7.60, p = .015, and a marginal effect of target-colour, F (1, 15) = 3.98, p = .065: females tended to rate other females as more attractive than males, and dark-skinned targets as more attractive than light-skinned targets.

On trustworthiness, we found a marginal effect of target-gender, F (1, 26) = 3.88, p = .060, and a significant effect of target-colour, F (1, 26) = 7.86, p = .009, qualified by a marginally significant interaction effect of participant-gender and target-gender, F (1, 26) = 2.00, p = .072, itself qualified by a triple interaction effect of participant-gender, target-gender, and target-colour, F (1, 26) = 5.303, p = .030. We again analysed male and female participants separately. For males, there was only a simple effect of target-colour, F (1, 11) = 4.87, p = .049: dark-skinned targets were perceived as more trustworthy than light-skinned targets. For females, there was a target-gender effect, F (1, 15) = 4.73, p = .046, as well as a marginal target-colour effect, F (1, 15) = 3.15, p = .096, qualified by a two-way interaction effect, F (1, 15) = 5.13, p = .039. T-tests indicated that females rated light-skinned males as less trustworthy than dark-skinned males, t (15) = 2.26, p = .039, and less trustworthy than light-skinned females, t (15) = 4.11, p = .001; there were no differences between dark-skinned females and dark-skinned males or light-skinned females (ns).

On competence, results showed only a significant main effect of target-colour, F (1, 26) = 4.55, p = .043, indicating that participants rated dark-skinned candidates as more competent than light-skinned candidates.

On threat, we found a main effect of target-gender, F (1, 26) = 13.49, p = .001, qualified by an interaction effect between participant-gender and target-gender, F (1, 26) = 9.34, p = .005, as well as an interaction effect between target-gender and target-colour, F (1, 26) = 4.41, p = .046. To analyse the latter interaction, t-tests showed that light-skinned males were rated as more threatening than light-skinned females, t (27) = 2.18, p = .038, but less threatening than dark-skinned males, t (27) = -2.35, p = .026, whereas dark-skinned females were rated as less threatening than dark-skinned males, t (27) = 3.40, p = .002, and no differently from light-skinned females (ns). In other words, males were rated as more threatening than females, and dark-skinned males particularly so. To analyse the interaction between participant-gender and target-gender, we analysed male and female participants separately. For females, we found only an effect of target-gender, F (1, 15) = 19.33, p = .001, indicating that male targets were rated as more threatening than female targets. For males, there were no significant effects.

In sum, dark-skinned female targets were rated highest on attractiveness, trustworthiness, and competence, whereas light-skinned males were rated lowest on all three positive traits. However, dark-skinned males stood out as the most threatening. Intriguingly, light-skinned males were rated poorly on trustworthiness compared with the other categories.
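The analyses above were run in SPSS; as a rough open-source equivalent of the follow-up comparisons, the sketch below runs Bonferroni-corrected paired t-tests over the per-participant cell means with SciPy (variable names are hypothetical):

    from itertools import combinations
    from scipy import stats

    def pairwise_paired_tests(cell_means, alpha=0.05):
        """cell_means: dict mapping face-category name -> participant-
        aligned list of mean ratings. Returns (a, b, t, p, sig) tuples."""
        pairs = list(combinations(sorted(cell_means), 2))
        corrected = alpha / len(pairs)       # Bonferroni correction
        results = []
        for a, b in pairs:
            t, p = stats.ttest_rel(cell_means[a], cell_means[b])
            results.append((a, b, t, p, p < corrected))
        return results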


Participant race effects (Task 1)

As we had a very small number of Black participants, we did not test the effect of participant race together with those of participant gender and target race and gender. Instead, we tested it separately, for each trait assessment and for each target gender/colour face type, with the non-parametric Mann-Whitney test, given concerns about group size (White/Caucasian, n = 24; Black/African, n = 4) and the homogeneity of variances. The only effects were on the perceived threat of the light-skinned male faces, U = 17.00, p = .042, with White participants (MR = 15.79) rating these faces higher on perceived threat than Black participants (MR = 6.75); on the perceived threat of the light-skinned female faces, U = 13.00, p = .019, with White/Caucasian participants (MR = 15.95) rating these faces significantly higher on threat than Black/African participants (MR = 5.75); and on the perceived threat of the dark-skinned female faces, U = 17.50, p = .042, where, as in the previous comparisons, White/Caucasian participants (MR = 15.77) showed higher values than Black/African participants (MR = 6.88).
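For unbalanced groups such as these (n = 24 vs n = 4), the same test is available in SciPy; a one-line sketch, again with hypothetical inputs:

    from scipy import stats

    def race_effect_test(white_ratings, black_ratings):
        """Two-sided Mann-Whitney U test on two independent samples of
        per-participant mean threat ratings (hypothetical arrays)."""
        return stats.mannwhitneyu(white_ratings, black_ratings,
                                  alternative="two-sided")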

Mean time to first fixation (Task 1)

Consistent with an attentional-process account of speed of fixation based on perceptual salience, and assuming that dark skins (Blacks, South Asians, etc.) are more salient in Portuguese society than light skins (Whites), darker-skinned faces elicited faster fixation (mean times to first fixation, MTFF, in seconds). Dark-skinned males were fixated faster (M = .26; SD = .22) than dark-skinned females (M = .52; SD = .22), t (24) = 5.918, p < .001, and faster than light-skinned males (respectively M = .32, SD = .24 and M = 1.01, SD = .53), t (19) = -6.23, p < .001. Dark-skinned females also had lower MTFFs than light-skinned females (respectively M = .53, SD = .21 and M = .98, SD = .46), t (19) = -4.81, p < .001. There were no differences, however, between light-skinned males and females. In sum, dark-skinned faces elicited lower MTFFs than light-skinned faces, and dark-skinned male faces even lower MTFFs than dark-skinned female faces.

Candidate choices (Task 2)

To test the effect of target colour and gender on candidate selection, we again carried out a mixed repeated-measures ANOVA on the number of selected faces from each category, with two within-subjects factors (target-gender: male vs. female; target-colour: light-skinned vs. dark-skinned) and one between-subjects factor (participant gender). Participants chose from among 20 faces per category, each presented 4 times (and thus had a total of 80 chances to select from each category). The Greenhouse-Geisser correction was applied, as the sphericity assumption was violated (Field 2013). Results indicated a main effect of target-colour, F (1, 28) = 16.63, p < .001, indicating that dark-skinned targets were chosen more often than light-skinned targets, but this was qualified by an interaction effect of target-colour and target-gender, F (1, 28) = 5.48, p = .027. T-tests indicated that dark-skinned females (M = 23.73; SD = 15.89) were chosen significantly more often than light-skinned females (M = 9.40; SD = 6.34), t (29) = 4.47, p < .001. There were no other significant differences (light-skinned males, M = 13.17, SD = 11.78; dark-skinned males, M = 16.57, SD = 12.46) (Figure 1-3).

Fig. 1-3: Average number of selections for “favourite face” across categories

Mean time to first fixation (Task 2)

To test the effect of target colour and gender on mean time to first fixation (MTFF) in the competing visual stimuli paradigm, we again carried out a mixed repeated-measures ANOVA with two within-subjects factors (target-gender: male vs. female; target-colour: light-skinned vs. dark-skinned) and one between-subjects factor (participant gender). The Greenhouse-Geisser correction was once again applied, as the sphericity assumption was violated (Field 2013).


Results indicated no main effects, but we found a marginal interaction effect of target-colour and target-gender, F (1, 28) = 3.50, p = .072, qualified by a triple interaction between these two within-subjects factors and the between-subjects factor, F (1, 28) = 5.84, p = .022. We therefore analysed male and female participants separately. For men, the results show that light-skinned candidates had higher MTFF (M = 1.50, SD = .140) than dark-skinned candidates (M = 1.25, SD = .110). Furthermore, light-skinned female candidates elicited significantly higher MTFF (M = 1.56, SD = .170) than light-skinned male candidates (M = 1.43, SD = .118). No significant effects were found for women. Overall, the results show that light-skinned candidates’ faces had higher MTFF when perceived by men, and that light-skinned females had the highest MTFF, again only among men. In other words, men’s attention was drawn faster to dark-skinned targets.

Association of traits with candidate choices

To understand the relation between trait inferences and candidate preferences, we examined the correlations between participants’ mean ratings on each inferred trait (trust, competence, attractiveness, and threat) for each gender-colour facial combination (light-skinned males, light-skinned females, dark-skinned males, dark-skinned females), on the one hand, and the number of targets with that gender-colour combination that each participant chose, on the other. Given our small sample size, we found only two marginally significant correlations: for light-skinned male targets, competence ratings had a moderate negative correlation with the number of choices (r = -.36, p = .062), and for dark-skinned male targets, threat ratings had a moderate negative correlation with the number of choices (r = -.37, p = .053). The unusual, though non-significant, negative correlation between the number of choices of light-skinned male candidates and their perceived competence need not mean that participants actually prefer less competent candidates. It might rather be that, at a time of disappointment with failed economic policies, participants are wary of their impressions of candidates’ competence. Our data, however, do not allow us to verify this hypothesis.
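Each of these coefficients is an ordinary Pearson correlation across participants; in SciPy the computation is, for one trait-category pair (input arrays hypothetical):

    from scipy import stats

    def trait_choice_correlation(trait_means, n_choices):
        """Pearson r between per-participant mean ratings of a trait for one
        face category and the number of faces chosen from that category."""
        return stats.pearsonr(trait_means, n_choices)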

Discussion

In Task 1, we were interested in the effects of target gender and skin tone on trait inferences. We found that women were rated as more attractive than men, and dark-skinned targets as more attractive than their lighter-skinned counterparts. For evolutionary reasons to do with their greater value in terms of reproductive resources, women are usually considered more attractive than men. However, the higher attractiveness rating for darker-skinned targets is less obvious and runs counter to the conventional wisdom that darker skin is associated with lower social status. Nevertheless, the Portuguese are a Southern European people and regard tanning as both natural and attractive, which might partly explain this finding. These effects were similar to those on the ratings of trustworthiness: lighter-skinned males were seen as less trustworthy than the other categories. Previous studies show that more attractive people are generally perceived as morally better (Dion, Berscheid and Walster 1972; Langlois et al. 2000), although it can also be the case that it is less attractive people who are perceived more negatively (Griffin and Langlois 2006). In any case, our results similarly suggest that there is a relation between attractiveness and trustworthiness. Also, Portuguese politicians, most of whom are White males, are typically rated as untrustworthy: this could have affected the ratings of the targets, who were presented as potential political candidates. Finally, males were judged more threatening than females, and dark-skinned males more so than light-skinned males, which could reflect the stereotype of Black males. Conversely, however, dark-skinned candidates were seen as more competent than light-skinned candidates, which does not correspond to the stereotype of Blacks.

As for the eye-tracking-based measures of mean time to first fixation (MTFF) in Task 1, these showed that dark-skinned faces were fixated faster than light-skinned faces, and dark-skinned males fastest of all. Note that, on the one hand, darker-skinned faces were rated as more attractive but, on the other hand, darker-skinned male faces were also rated as more threatening. This suggests that both of these traits, positive and negative, attract attention faster, as both are salient independently of their valence.

The results of Task 2 suggest that the social context in which the target faces were presented was very relevant to participants’ inferences and choices. When asked to select the most suitable candidate, participants did not choose the ones most prototypical of Portuguese politicians (given that most of these are White males), but rather preferred dark-skinned candidates. At first glance, this is consistent with their Task 1 ratings of attractiveness, which some studies have found to be related to electoral performance (Langlois et al. 2000). Conversely, the higher threat rating of dark-skinned candidates would have led us to predict the reverse pattern (as indicated by Mattes and colleagues, 2010). However, threat is the trait with the lowest mean values, which might explain its lack of significant impact on candidate choice. We expected the same pattern of MTFF results as in Task 1; however, no significant differences were found between categories. This indicates that further testing is required to identify the most salient aspects of ET data and their relative importance in explaining the candidate selection process.

Conclusions

The results show that ET is a technique with the potential to help explain the most salient aspects of attentional processes in candidate selection and political voting decision making. The fact that the categories most selected in Task 2 were also the ones with lower MTFF in Task 1 indicates that underlying aspects of attention might contribute to explaining candidate selection. Attractiveness seems to have played the most significant role in the differences in candidate selection rates between categories. Competence, on the other hand, did not play a significant role in candidate selection, possibly because it was less distinctive between categories. Given the exploratory nature of the study, the authors chose not to define hypotheses in advance, since this would probably have led to greater confusion in the discussion of the results and added little to the current state of the art in the field. Nevertheless, there is a sound body of knowledge on trait inferences from facial features, and these specific questions have already been discussed in other studies. The ET results, however, are harder to discuss given the limitations of the current state of the art. Also, the small sample size demands prudence in interpretation. Finally, with a larger sample size, and in order to determine causality, crossover designs with alternate task orders should be used. Another issue concerns the stimuli used. On the one hand, the use of randomly generated faces avoids the problems associated with prior knowledge of particular politicians, which could lead to biased results. On the other hand, it adds to the questions about the ecological validity of conducting political experiments in the laboratory. Moreover, participants mentioned that many of the faces looked too similar to one another, which interfered with task motivation.


Future Research

The lack of research using ET in political psychology, and thus the lack of comparable studies, limits our ability to draw conclusions about electoral choice from this particular study. However, this should not be a reason to dismiss its usefulness for the field. Beyond extra-ocular measures such as the number of fixations, fixation duration, and saccades, other, more complex ET measures can be put to good use in political and electoral psychology. Pupil dilation (an intraocular measure) is thought to be a reliable index of emotional arousal and cognitive load. As such, it could be useful in decision-making studies such as this one, where the objective is to identify the processes underlying political choice. Moreover, other traditionally used ET measures can help identify which facial features are most important to the inference and decision processes.

Acknowledgements

A special thanks is in order for our undergraduate students at ULHT, who received no credit or reward for their cooperation in this experiment. We also acknowledge the contribution of Tobii to our experimental laboratory’s daily activities.

Bibliography

Abramoff, Michael D., Paulo J. Magalhaes, and Sunanda J. Ram. “Image Processing with ImageJ.” Biophotonics International 11, no. 7 (2004): 36-42.
Ballew, Charles C., and Alexander Todorov. “Predicting political elections from rapid and unreflective face judgments.” Proceedings of the National Academy of Sciences of the USA 104 (2007): 17948–17953.
Bentley, Tom. Everyday Democracy. Demos. Available from www.demos.co.uk/publications/everydaydemocracy, 2005.
Berggren, Niclas, Henrik Jordahl, and Panu Poutvaara. “The Looks of a Winner: Beauty and Electoral Success.” Journal of Public Economics 94 (2010): 8-15.
Blair, Irene V., Kristine M. Chapleau, and Charles M. Judd. “The influence of afrocentric facial features in criminal sentencing.” Psychological Science 15, no. 10 (2004): 674-679.
Brewer, Marilynn. “The social psychology of intergroup relations: Social categorization, ingroup bias, and outgroup prejudice.” In A. Kruglanski & E. T. Higgins (Eds.), Social Psychology: Handbook of Basic Principles (2007): 695-715. New York: Guilford Press.
Castelli, Luigi, Luciana Carraro, Claudia Ghitti, and Massimiliano Pastore. “The effects of perceived competence and sociability on electoral outcomes.” Journal of Experimental Social Psychology 45 (2009): 1152–1155.
Dion, Karen, Ellen Berscheid, and Elaine Walster. “What is beautiful is good.” Journal of Personality & Social Psychology 24 (1972): 285–290.
Eagly, Alice H., Richard D. Ashmore, Mona G. Makhijani, and Laura C. Longo. “What Is Beautiful Is Good, But: A Meta-Analytic Review of Research on the Physical Attractiveness Stereotype.” Journal of Management 27 (2001): 363–381.
Evans, Will. Eye Tracking Online Metacognition: Cognitive Complexity and Recruiter Decision Making. Technical Report, TheLadders, 2012.
Field, Andy. Discovering Statistics Using IBM SPSS Statistics: And Sex and Drugs and Rock ’n’ Roll (4th ed.). London: Sage, 2013.
Granka, Laura, Thorsten Joachims, and Gary Gay. “Eye-tracking analysis of user behaviour in WWW search.” Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. Sheffield, United Kingdom: ACM Press (2004): 478-479.
Griffin, Angela M., and Judith H. Langlois. “Stereotype directionality and attractiveness stereotyping: Is beauty good or is ugly bad?” Social Cognition 24, no. 2 (2006): 187–206.
Hagiwara, Nao, Deborah A. Kashy, and Joseph Cesario. “The independent effects of skin tone and facial features on Whites’ affective reactions to Blacks.” Journal of Experimental Social Psychology 48 (2012): 892-898.
Huang, Jeff, Ryen W. White, and Susan T. Dumais. “No clicks, no problem: using cursor movements to understand and improve search.” CHI (2011): 1225-1234.
Inversions, Singular. “FaceGen Modeller 3.1.” Vancouver, BC, 2004.
Jost, John T. “Outgroup favoritism and the theory of system justification: An experimental paradigm for investigating the effects of socioeconomic success on stereotype content.” In G. Moskowitz (Ed.), Cognitive Social Psychology: The Princeton Symposium on the Legacy and Future of Social Cognition (2001): 89–102. Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Jost, John T., Brett W. Pelham, and Mauricio R. Carvallo. “Non-conscious forms of system justification: Cognitive, affective, and behavioural preferences for higher status groups.” Journal of Experimental Social Psychology 38 (2002): 586-602.
Langlois, Judith H., Lisa Kalakanis, Adam J. Rubenstein, Andrea Larson, Monica Hallam, and Monica Smoot. “Maxims or myths of beauty? A meta-analytic and theoretical review.” Psychological Bulletin 126 (2000): 390-423.
Lawson, Chappell, Gabriel S. Lenz, Andy Baker, and Michael Myers. “Looking like a winner: Candidate appearance and electoral success in new democracies.” World Politics 62, no. 4 (2010): 561-593.
Little, Anthony C., Robert P. Burriss, Benedict C. Jones, and S. Craig Roberts. “Facial appearance affects voting decisions.” Evolution and Human Behaviour 28 (2007): 18-27.
Mattes, Kyle, Michael Spezio, Kim Hackjin, Alexander Todorov, Ralph Adolphs, and R. Michael Alvarez. “Predicting Election Outcomes from Positive and Negative Trait Assessments of Candidate Images.” Political Psychology 31, no. 1 (2010): 41-58.
McDermott, Monika. “Race and gender cues in low-information elections.” Political Research Quarterly 51, no. 4 (1998): 895-918.
Merrill, Samuel III, and Bernard Grofman. A Unified Theory of Voting. Cambridge: Cambridge University Press, 1999.
Miller, Warren E., and J. Merrill Shanks. The New American Voter. Cambridge, MA: Harvard University Press, 1996.
Olivola, Christopher Y., and Alexander Todorov. “Elected in 100 milliseconds: Appearance-based trait inferences and voting.” Journal of Nonverbal Behaviour 34 (2010): 83–110.
Oosterhof, Nikolaas N., and Alexander Todorov. “The functional basis of face evaluation.” Proceedings of the National Academy of Sciences of the USA 105 (2008): 11087-11092.
Pieters, Rik, and Michel Wedel. “Attention Capture and Transfer in Advertising: Brand, Pictorial and Text Size Effects.” Journal of Marketing 68 (2004): 36–50.
Quattrone, George A., and Amos Tversky. “Contrasting rational and psychological analysis of political choice.” American Political Science Review 82 (1988): 716–736.
Schlesinger, Arthur M., Jr. Running for President: The Candidates and Their Images. New York: Macmillan, 1994.
Sigelman, Carol K., Lee Sigelman, Dan B. Thomas, and Frederick D. Ribich. “Gender, Physical Attractiveness, and Electability: An Experimental Investigation of Voter Biases.” Journal of Applied Social Psychology 16 (1986): 229–248.
Sigelman, Lee, Carol Sigelman, and Christopher Fowler. “A Bird of a Different Feather? An Experimental Investigation of Physical Attractiveness and the Electability of Female Candidates.” Social Psychology Quarterly 50 (1987): 32-43.
Tajfel, Henri, and John C. Turner. “The social identity theory of intergroup behavior.” In S. Worchel & L. W. Austin (Eds.), Psychology of Intergroup Relations. Chicago: Nelson-Hall, 1986.
Todorov, Alexander, Anesu Mandisodza, Amir Goren, and Crystal Hall. “Inferences of competence from faces predict election outcomes.” Science 308 (2005): 1623-1626.
Todorov, Alexander, Chris P. Said, Andrew D. Engell, and Nikolaas N. Oosterhof. “Task-invariant Brain Responses to the Social Value of Faces.” Trends in Cognitive Sciences 12 (2008): 455-460.
Vala, Jorge, Rodrigo Brito, and Diniz Lopes. Expressões dos racismos em Portugal [Expressions of racisms in Portugal]. Lisboa: Instituto de Ciências Sociais, 1999.
Vala, Jorge, Cícero R. Pereira, Marcus E. Lima, and Jacques-Philippe Leyens. “Intergroup time bias and racialized social relations.” Personality and Social Psychology Bulletin 38, no. 4 (2012): 491-504.
Wedel, Michel, and Rik Pieters. “Eye Fixations on Advertisements and Memory for Brands: A Model and Findings.” Marketing Science 19 (2000): 297–312.

CHAPTER TWO

MODELLING HUMAN VISUAL DETECTION OF ANTI-SOCIAL BEHAVIOUR

JOHN R. ELLIOTT AND JAMES A. RENSHAW
School of Computing, Creative Technologies and Engineering, Leeds Metropolitan University, Leeds, LS6 3QS, UK.

Abstract
With the aim of investigating how human observers visually detect criminal behaviour, for the purpose of modelling this process in automated intelligent computer systems, this chapter describes the analysis of eye movements and interviews of observers and specialist police support personnel (analysts) as they viewed selected actual CCTV recordings of street scenes in the city of Leeds (UK). The eye movements were captured by means of a non-invasive eye tracker as the participants watched a selection of previously recorded, randomly sequenced video clips, some containing criminal incidents and others containing none. Each participant was also interviewed after their individual eye tracking session, for insights and narrative, which then contributed towards modelling behavioural triggers.

Overview
The objective of this particular study was to determine, by means of interview and eye tracking methodologies, what it is that attracts observers’ attention as they watch live CCTV feeds of street scenes, with a view to modelling these attributes and incorporating the intelligence gleaned into a predictive artificial intelligence system.
Qualitative data indicate that there may well be differences between the way CCTV control room staff view the live images on the screen as events unfold and the way analysts examine the recordings for evidence
after the event. However, this study failed to establish any statistically significant differences between the two groups. From the interviews, it would appear that gathering useful data for the identification of those taking a lead in a disruptive incident is a priority for all observers. A secondary objective, for those watching an incident live, is to determine the location of the incident and the direction of travel of the protagonists, and to plan which camera(s) in the system might next be deployed to keep track of the incident. These objectives are pursued against a backdrop of an individual observer’s knowledge of the area, in terms of both topography and reputation for crime, their experience as an observer, their main responsibilities as an observer (to observe or to provide evidential material), and their knowledge of what events might be taking place within the city. In describing what they were seeing, the following factors appeared most often: time of day; location; the number and gender of those involved in the incidents, their age category (young, girl/boy, man/woman), ethnicity, clothing, and any potential weapons being carried; and the activities of the lead antagonists: their body language, walking posture, what they appear to be looking at, their gesticulations, raised limbs, head movements, speed and direction of movement, and the objects of their attention.

Observers also stated that the weather, clubs, pubs, car parks and unlit areas were factors that heightened their expectation of trouble and increased their vigilance. Play fighting is very often a precursor to serious incidents. Eye movement studies generally find that movement is a great attractor of attention, and this study is no exception. From our eye tracking data it is evident that movement grabs attention. In fact, its importance is perhaps emphasised in the mind of the observers, as incidents will often involve the movement of people interacting with object(s) of interest. At night, colour is not present or is very distorted by artificial light, and so cannot be a factor of prime importance in stimuli, but brightness and contrast may be of significance. Trained observers pay attention to who is doing what, as well as gathering data on time and date, with perhaps less emphasis being placed on the “where” of the incident. Areas of the scene, or people, that do not appear to be immediately involved with the incident do not appear to be fixated upon in the five seconds building up to it. This would seem to imply that the observers have an idea in their minds as to what could unfold and what is important, before the incident actually takes
place. Perhaps eye movements only tell part of the story, being the product of thought processes based on experience rather than merely of what is in the scene, apart from the influence of movement, as discussed.

Eye Tracking
Viewers’ reactions to dynamic images used as visual stimuli in eye tracking research are at the centre of this study. Interest in eye tracking itself is not a recent phenomenon. According to Jacob and Karn (2003) it has been of interest for over 100 years. It has become non-invasive, more reliable, faster, more accurate, cheaper and easier to use, particularly over the last decade (Duchowski, 2007; Webb and Renshaw, 2008). These developments have enabled its application to an ever increasing range of stimuli, from those designed for use in the psychology laboratory to the designs of web pages. In the majority of cases, the traditional eye tracking metrics are very much dependent on what objects the eye can be considered to be fixated upon (Webb and Renshaw, 2008). Eyes are normally always on the move. However, there are occasions when the focus of attention is restricted to a small area for a few hundred milliseconds. At these times the eyes are said to be fixating, and under certain conditions it can be inferred that what is being fixated upon is also being thought about. This is an essential and fundamental assumption of the eye-mind hypothesis (Just and Carpenter, 1976). When attention is directed elsewhere, either by means of top-down direction or by some salient feature in the visual range, the eye is moved to focus upon it. This movement is said to be ballistic, that is to say, once a decision has been made to move to a certain location and the movement has started, it cannot be interrupted until the predetermined location has been reached. This movement between fixations is known as a saccade. Eye tracking systems record where the eyes are looking at frequencies ranging from 30 to over 1000 times a second; data are usually recorded in the form of pairs of X,Y coordinates in terms of pixels for both eyes (Points of Regard: PoR). These coordinates are then aggregated automatically, using computer software, into fixations.
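By way of illustration, saccadic movement can be separated from fixational samples with a simple velocity threshold over consecutive PoR samples. The sketch below is a generic velocity-based classifier, not the algorithm of any particular tracker; the sampling rate and threshold values are assumptions chosen for illustration only.

```python
import numpy as np

def classify_por_samples(x, y, hz=50, velocity_threshold=1000.0):
    """Label each inter-sample interval as saccadic or fixational.

    x, y : point-of-regard coordinates in pixels, one pair per sample.
    hz : tracker sampling frequency in samples per second.
    velocity_threshold : speed in pixels/second above which movement
        is treated as a saccade (illustrative value, tuned per setup).
    """
    dt = 1.0 / hz
    # Point-to-point gaze velocity in pixels per second.
    velocity = np.hypot(np.diff(x), np.diff(y)) / dt
    return np.where(velocity > velocity_threshold, "saccade", "fixation")
```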

Eyes and Perception
The basis of the argument that there could be a link between eye movements and emotions is that the eye is very much an extension of the brain, the retina at the back of the eye having a cell structure very similar to parts of the brain (Gregory, 1998). Much of the research based on eye
tracking is based on the assumption that the overt visual attention detected by the eye tracker gives us a window onto a participant’s cognitive processes and has been used to evaluate cognitive processing (Rayner, 1998; Goldberg and Kotval, 1999; Duchowski, 2007) and to develop models of perceptual mechanisms (Peebles and Cheng, 2002).

The Perception of Motion
The perception of motion is a complex research field, and there seems to be some debate about the precise mechanisms involved (Hochberg et al., 2007; Palmer, 1999). Hochberg et al. (2007) discuss the perception of motion in greater depth than is possible here. However, one of the key factors of relevance to this chapter is the influence of the story line. He states:
“The story structure encodes the too-numerous trajectories [of objects in the scene] as a smaller number of more distinctive and familiar purposeful actions” (Hochberg et al., 2007, p. 382).

He argues that the way we perceive motion is synthesized from the hypotheses the viewer develops from experience, the “story line” and what is seen, and that these influences drive eye movement (Hochberg, 2007). However, perceptual processes other than those dependent upon fixations may also be involved, for one is aware of the context of the scene without having to fixate each section of the screen to determine it (Underwood et al., 2005). It is therefore important to recognise that eye tracking’s dependence on fixations may not be sufficient to capture and define the viewer’s total experience.

Background
Like many local authorities in the UK, Leeds City Council has established a network of CCTV cameras located in areas where their deployment is thought to be advantageous from a security point of view. They typically provide views of street scenes from high vantage points, can pan through a full 360-degree rotation, and have a zoom in/out capability. The cameras are controlled remotely from a control room situated, in Leeds City Council’s case, in an urban area away from the city centre. The cameras are linked to a multi-screen display in the control room, and observers are able to select which camera feed they place on the terminal in front of them. Observers follow an incident or potential incident by calling down images from cameras they believe will give them the best view of
the scene. All images from the cameras are recorded continuously for post-event review purposes. The staff in the CCTV unit watch incidents from the cameras around Leeds, listen to the police radio and, if possible, focus the relevant cameras on the scene. After an event, they complete an information sheet, the main writing space of which permits comments about the incident as observed. Each incident is logged. The control room staff do not use the replay facility when completing the incident sheet (IS); they use any notes they might have taken and their memory, attempting to recall the camera numbers used as events unfolded. There is no field on the sheet for this information. The IS is used to assist supervisors in locating recordings if they need to review an incident. The police analysts, on the other hand, are usually involved after the incident. They use the stop, play and edit facilities to isolate clips of interest.

Method

Participants
Thirteen people volunteered to participate: twelve male and one female. Eight participants (Participants 1-8) were employed by Leeds City Council as observers in the CCTV Control Centre. None of these had been trained as police officers, and they had all previously been employed in a variety of unrelated occupations. The other five participants (Participants 9-13) were all trained as police officers and were now employed as specialist analysts whose job it was to examine video footage and look for evidence prior to prosecution.

Stimuli
All fourteen video clips were created from actual footage of street scenes recorded and stored in the CCTV archives. All were approximately three minutes long and recorded in colour. Eight depicted incidents, both in the day and at night (video references 2, 4, 6, 8, 9, 11, 13 and 14), whereas the others (video references 1, 3, 5, 7, 10 and 12) were of neutral, incident-free scenes, again some recorded at night and some during the day. Where clips contained incidents, the durations of the lead-up to the first incident were variable; no clip started immediately with an incident. The incidents covered included: attempted motor theft, co-ordinated group violence, mugging, armed assault, vandalism, indiscriminate violence, theft, and fooling around on a pavement. None of the recordings incorporated sound.

Apparatus
The study used a 17” Tobii 1750 non-invasive eye tracker. Eye tracking as a technique for measuring physiological responses to visual stimuli has the advantage that a large volume of eye movements can be recorded in real time and non-invasively. Modern eye trackers are accurate to within 0.5 degrees of visual angle, robust and tolerant of normal lighting environments, and so ideal for the environment within which the current research took place. Furthermore, the software associated with the eye tracker has the capacity to overlay video recordings of the scene with time-stamped recorded eye movements, thereby permitting their synchronisation with the DVD images. The eye tracking system used in the current research measures point-of-regard through the use of the “corneal-reflection/pupil centre” method (Goldberg and Wichansky, 2003). An infrared camera, situated at the bottom of the computer monitor, directs near-infrared light from an LED into the eyes of the participant. This light is reflected from the whites of the eyes (sclera), resulting in the irises/pupils appearing as well-defined discs; the infrared light also produces a small, sharp reflection from the cornea. The software identifies the location of the corneal reflection and the centre of the pupil and, by calculating the vector between them, is able to determine the point-of-regard every 50th of a second. By using both the corneal reflection and pupil locations it is possible to mitigate the effects of head movements on the tracking of eye movements. The eye tracker analysis software, ClearView, records the point-of-regard as a pair of X,Y coordinates in pixels, disregards the data associated with saccades and blinks, then aggregates the remaining data into fixations according to algorithms and parameters which can be varied to suit the nature of the analysis being undertaken. The location of the computed fixation is determined by averaging the X and Y coordinates of all those eye gaze locations meeting the set parameters. For the purposes of this study, the parameters were set to accumulate any eye gaze locations that occurred within 40 pixels and 100 ms of each other as fixations.
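In outline, the dispersion-based aggregation just described can be expressed in a few lines of code. The sketch below is a generic illustration using the 40-pixel/100 ms parameters quoted above; it is not ClearView’s actual algorithm, and the dispersion measure used (horizontal plus vertical extent) is only one of several common conventions.

```python
def aggregate_fixations(samples, max_dispersion=40, min_duration=100):
    """Group point-of-regard samples into fixations.

    samples : time-ordered list of (t_ms, x_px, y_px) tuples.
    Returns a list of (t_start, t_end, centroid_x, centroid_y).
    """
    def centroid(win):
        return (sum(p[1] for p in win) / len(win),
                sum(p[2] for p in win) / len(win))

    def dispersion(win):
        xs = [p[1] for p in win]
        ys = [p[2] for p in win]
        return (max(xs) - min(xs)) + (max(ys) - min(ys))

    fixations, window = [], []
    for sample in samples:
        if window and dispersion(window + [sample]) > max_dispersion:
            # The new sample falls outside the spatial window: close the
            # window, keeping it as a fixation only if it lasted long enough.
            if window[-1][0] - window[0][0] >= min_duration:
                fixations.append((window[0][0], window[-1][0], *centroid(window)))
            window = []
        window.append(sample)
    if window and window[-1][0] - window[0][0] >= min_duration:
        fixations.append((window[0][0], window[-1][0], *centroid(window)))
    return fixations
```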

Apparatus configuration
The eye tracker was set up as a two-processor, one-screen configuration, as per the manufacturer’s instructions (Tobii, 2008). One processor drove the eye tracker; the stimuli were played on the second PC. The essential details of this arrangement are shown in Fig. 2-1.

The eye tracking software has the capacity to accept a range of formats as stimuli. In this study the stimulus format of “external video” was selected, as this format is available from a DVD-playing PC. The splitter enables the synchronous input of the signal to the eye tracker and the eye tracker display screen, but before the signal can be displayed on the eye tracker screen it has to be converted back to VGA format; hence the need for a signal converter between the output from the splitter and the eye tracker (Fig. 2-1).

Fig. 2-1: Eye tracking Setup

Procedure
The eye tracking apparatus was set up in an open plan office at one of the desks made available to the research team. Each participant was given a briefing document, which explained that their eyes were to be tracked whilst they watched clips from recorded CCTV footage. They were then asked to sign a consent form and complete a basic questionnaire. Prior to the showing of the video clips, the apparatus had to be calibrated to the participant’s eyes, a standard process described in the eye tracker manufacturer’s operating instructions. Participants were informed that they could move their head normally as they watched the videos, but were asked not to move their body position in any dramatic way once the recording had started.

Each participant individually saw all 14 videos in a predetermined random sequence managed by the test supervisor, who sat in the participant’s vicinity. The participants were asked to take notes as they watched the videos, a normal process for the CCTV control room staff but not for some of the police specialists, who would normally stop and replay any particular scene they were interested in rather than take notes. The eye tracking element of the process lasted approximately 60 minutes. At the end of the video sequence, each participant was interviewed to assess their experience during the eye tracking session and to produce a verbal explanation of what they looked at as three selected videos were shown to them again, this time without their eyes being tracked. These interviews were recorded using a digital recorder and saved for further analysis.

Data collection and preparation
The eye tracker recorded the eye movements of the participants as they viewed each scene pre-recorded on video. The eye tracker’s recordings can be played back after the eye tracking sessions to facilitate analysis, with the eye movements superimposed automatically over them. Additionally, stills from these recordings can be created with the fixations, for a specified period of time, again superimposed upon them. Thirdly, time-stamped fixations, together with their spatial co-ordinates, were exported to an Excel spreadsheet for statistical analysis. Upon reviewing these sources of data, it was decided that the data from two participants (P1 and P6) were too sparse, i.e. there were large gaps, sometimes exceeding 60 seconds in duration, between registered fixations, and data from these participants were excluded from the analysis. Although tolerant of head movement, the eye tracker can lose contact with the eyes if, over an extended period, participants change their body position significantly, resulting in a loss of data. Video recordings from the eye tracker for each participant were first reviewed, and the start and end timings of the appearance of each of the stimuli for that person established to the nearest 10th of a second. Extraneous fixations falling outside these times were removed, as they were not relevant to the study, and the remaining data amalgamated into one spreadsheet for convenience. The video recordings used as stimuli were similarly reviewed, and the precise time of an incident, for those recordings including incidents, was established to the nearest tenth of a second. These timings were then used to annotate the fixation spreadsheet in such a way as to highlight those fixations falling within 5 seconds either
side of the incident. Those fixations immediately prior to the incident were of particular interest. Table 2-1 is an extract from the spreadsheet. It shows a typical participant’s reference number, group (observer/analyst), the fixation number from the start of the recording, its timing in milliseconds, its duration, and its co-ordinates in pixels (as measured from the top left of the eye tracker screen), together with the stimulus reference, the video group (non-incident (a)/incident (b) type), the incident reference, the incident’s timing on the recording, and the time in seconds pre/post the incident.

Table 2-1: Exported data sheet

PRef  PGrp  FixNo  TimeStamp(ms)  Dur(ms)  GzX(px)  GzY(px)  VRef  VGrp  IRef  ITime(s)  Dist(s)  Pre/Post
5     2     2152   933256         199      250      337      4     b     4     938       -4.744   -1
5     2     2153   933475         239      205      317      4     b     4     938       -4.525   -1
5     2     2154   933834         179      307      422      4     b     4     938       -4.166   -1
5     2     2155   934034         159      343      443      4     b     4     938       -3.966   -1
5     2     2156   934233         479      455      444      4     b     4     938       -3.767   -1
5     2     2157   934752         160      352      137      4     b     4     938       -3.248   -1
5     2     2158   934952         178      314      110      4     b     4     938       -3.048   -1
5     2     2159   935170         199      447      256      4     b     4     938       -2.83    -1
5     2     2160   935389         139      297      202      4     b     4     938       -2.611   -1
5     2     2161   935588         478      503      239      4     b     4     938       -2.412   -1
5     2     2162   936086         199      333      253      4     b     4     938       -1.914   -1
5     2     2163   936306         498      394      233      4     b     4     938       -1.694   -1
5     2     2164   936983         179      405      241      4     b     4     938       -1.017   -1
5     2     2165   937183         997      160      238      4     b     4     938       -0.817   -1
5     2     2166   938219         239      449      196      4     b     4     938        0.219    1
5     2     2167   938478         578      366      171      4     b     4     938        0.478    1
5     2     2168   939076         219      427      154      4     b     4     938        1.076    1
5     2     2169   939335         718      506      125      4     b     4     938        1.335    1
5     2     2170   940312         100      473      164      4     b     4     938        2.312    1
5     2     2171   940432         200      450      185      4     b     4     938        2.432    1
5     2     2172   940651         221      383      240      4     b     4     938        2.651    1
5     2     2173   940890         179      449      233      4     b     4     938        2.89     1
5     2     2174   941090         279      518      218      4     b     4     938        3.09     1
5     2     2175   941388         239      418      236      4     b     4     938        3.388    1
5     2     2176   941887         199      474      202      4     b     4     938        3.887    1
5     2     2177   942106         259      279      110      4     b     4     938        4.106    1
5     2     2178   942385         1156     502      212      4     b     4     938        4.385    1

Key: PRef = Participant Ref; PGrp = Participant Group; FixNo = Fixation Number; Dur = fixation duration; GzX/GzY = Gazepoint X/Y; VRef = Video Ref; VGrp = Video Group; IRef = Incident Ref; ITime = Incident Time (seconds); Dist = Actual Time Dist (seconds, negative before the incident); Pre/Post = pre (-1)/post (+1) incident.
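For illustration, the pre/post flags and the five-second window in Table 2-1 can be derived from the fixation timestamps in a few lines; the column names below mirror Table 2-1 but are assumptions, not the authors’ actual spreadsheet layout.

```python
import pandas as pd

fix = pd.read_csv("fixations.csv")  # one row per fixation, columns as assumed

# Signed time of each fixation relative to its clip's incident, in seconds.
fix["time_dist_s"] = fix["timestamp_ms"] / 1000.0 - fix["incident_time_s"]

# -1 for fixations before the incident, +1 from the incident onwards.
fix["pre_post"] = fix["time_dist_s"].apply(lambda d: -1 if d < 0 else 1)

# Keep only fixations within five seconds either side of the incident.
window = fix[fix["time_dist_s"].abs() <= 5.0]

# The fixations of particular interest: the five seconds building up to it.
pre_incident = window[window["pre_post"] == -1]
```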

Results

Quantitative Data
Eye trackers record fixations, their timing, duration, location and sequence, based on how many eye gaze positions are recorded within a given area within a given time. These parameters can be defined by the analyst either before or after the eye tracking recording sessions. When analysing eye movements over dynamic stimuli, the available eye movement measures are limited, as it is uneconomic to attempt to determine which fixations are associated with which objects of interest (the spatial locations of which may also be dynamic). Thus, measures which in a static scene may be of interest, such as scan paths (paths defined by linking consecutive fixations) and areas of interest, are of little relevance in
dynamic scenes. This study considers fixation duration and distribution, as measured by the standard deviation of the X and Y location co-ordinates, as potentially useful dependent variables. The independent variables considered are: the participants’ job function (observer: group 2, or analyst: group 1); the stimulus content (videos including incidents (b) vs. those that do not (a)); and, for those video clips that do include an incident, pre- and post-incident fixation attributes. Table 2-2 summarises the overall results for participant groups 1 and 2 for all video types:

Table 2-2: Mean Fixation Duration and Standard Deviations for Gaze Points for both groups

               All Video Clips              Group "a" Videos             Group "b" Videos
PG  PT     TAFD  TSDHCG  TSDVCG        TAFD  TSDHCG  TSDVCG        TAFD  TSDHCG  TSDVCG
                    (X)     (Y)                  (X)     (Y)                  (X)     (Y)
2   2     274.4   114.2    85.8       261.8   111.4    85.8       288.1   116.6    85.8
2   3     294.9   124.3    86.4       259.7   132.2    87.0       328.9   115.8    85.2
2   4     305.7   105.0    83.8       283.7   108.2    86.5       333.0   100.5    80.1
2   5     318.3   108.3    95.4       300.4    95.7    86.6       336.0   118.6   103.3
2   7     314.5   128.0   100.4       295.7   136.0   103.4       331.5   119.6    97.4
2   8     374.1   139.2   106.2       334.9   137.1   111.3       392.5   139.3   103.2
    GM    313.7   119.8    93.0       289.4   120.1    93.4       335.0   118.4    92.5
1   9     316.7   149.8    96.3       297.3   155.4    94.1       331.5   145.0    97.9
1   10    445.0   100.1    73.1       379.1   103.8    75.4       501.1    96.7    71.1
1   11    278.7   123.1    72.8       276.8   128.5    77.6       282.0   118.8    63.9
1   12    380.2    98.7    77.4       331.8   101.5    81.5       419.8    96.1    73.9
1   13    400.9   123.0   102.9       361.6   124.1   100.4       431.0   122.2    93.8
    GM    364.3   118.9    84.5       329.3   122.7    85.8       393.1   114.4    80.1

Key: TAFD = Total Average Fixation Duration (ms); TSDHCG = Total Standard Deviation of the Horizontal Component of Gazepoint; TSDVCG = Total Standard Deviation of the Vertical Component of Gazepoint; GM = Group Mean; PT = Participant; PG = Participant Group.

Participant group 2 are the “observers”; group 1 are the “analysts”. Group “a” videos contain no incidents; group “b” videos contain an incident. Table 2-3 shows the product of the standard deviations in both the horizontal and vertical planes for the same participant and video groupings, thereby providing a measure of the distribution of gaze in both planes.


Table 2-3: Values of the product of the standard deviations for gaze points for both groups

                     Product of Horizontal and Vertical Group Mean Standard Deviations
Participant Group    All Video Clips    Group "a" Videos (no incident)    Group "b" Videos
2                    11147              11218                             10955
1                    10054              10524                              9164
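As a sketch of how these summary statistics might be computed from the per-fixation spreadsheet (column names again assumed, not taken from the study’s actual files):

```python
import pandas as pd

fix = pd.read_csv("fixations.csv")  # per-fixation rows

# Mean fixation duration and gaze SDs per participant group and video group.
stats = (fix.groupby(["participant_group", "video_group"])
            .agg(tafd=("duration_ms", "mean"),
                 sd_x=("gaze_x_px", "std"),
                 sd_y=("gaze_y_px", "std")))

# Product of the horizontal and vertical SDs: a simple dispersion index,
# with larger values meaning gaze spread over a wider area of the screen.
stats["sd_product"] = stats["sd_x"] * stats["sd_y"]
print(stats.round(1))
```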

Tables 2-4 and 2-5 show the same statistics, but only for those video clips that contain an incident.

Table 2-4: Mean Fixation Durations and Standard Deviations for Gaze Points for video clips with incidents (group "b" videos)

                  Pre-Incident                     Post-Incident
PG  PT     TAFD  TSDHCG(X)  TSDVCG(Y)       TAFD  TSDHCG(X)  TSDVCG(Y)
2   2     262.6     110.1      85.8        278.2     153.3      87.6
2   3     314.3     128.7      95.8        319.1     105.5      77.6
2   4     277.7     106.4      96.0        339.6     117.0      80.5
2   5     298.2     129.2     123.4        317.9     131.2     108.3
2   7     265.6     132.7     137.9        257.8     132.6     114.0
2   8     299.6     137.2     110.7        380.9     147.2     122.7
    GM    286.3     124.0     108.3        315.6     131.2      98.4
1   9     295.1     145.8     101.1        268.2     226.3     123.0
1   10    450.9      94.1      77.8        460.1     103.6      70.1
1   11    265.4     123.3      49.9        273.5     117.7      46.7
1   12    417.2      84.3      72.7        414.2     117.6      63.4
1   13    431.7     149.2      91.1        388.4     145.1      96.7
    GM    372.1     119.3      78.5        360.9     142.0      80.0

Key: TAFD = Total Average Fixation Duration (ms); TSDHCG = Total Standard Deviation of the Horizontal Component of Gazepoint; TSDVCG = Total Standard Deviation of the Vertical Component of Gazepoint; GM = Group Mean; PT = Participant; PG = Participant Group.

Table 2-5: Values of the product of the standard deviations for gaze points before and after the incidents for both groups

                     Product of Horizontal and Vertical Group Mean Standard Deviations
Participant Group    Up to 5 seconds Pre-Incident    Up to 5 seconds Post-Incident
2                    13428                           12910
1                     9372                           11359


Discussion

Quantitative Results
None of the differences between participant or video groups, or between pre- and post-incident statistics, achieved significance using Student’s t-test. However, the results do reflect what might reasonably be expected. Mean fixation durations, i.e. the times for which gazes were maintained on objects of interest, were greater for the analysts (group 1) than for the observers (group 2) in all scenarios: overall, for the group “a” and group “b” videos, and both pre- and post-incident. The distribution of gazes, as assessed by the product of the standard deviations of the gaze X and Y location co-ordinates, shows that in all instances the analysts (1) are more focused in their gazes than the observers (2), and that the analysts show a tendency to focus their gazes prior to the commencement of an incident rather than after it. Both observers and analysts seem to fixate for longer, and in a more focused way, during video clips containing incidents. Interestingly, observers’ gazes were more focused after an incident than prior to it, indicating perhaps a lesser skill in predicting when an incident might occur; when it does occur, these participants start looking for clues after the event.
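As an illustration of the kind of comparison reported here, an independent-samples t-test on the per-participant mean fixation durations from Table 2-2 could be run as follows (a standard scipy test, not necessarily the authors’ exact procedure):

```python
from scipy import stats

# Mean fixation durations (ms) per participant, from Table 2-2.
observers = [274.4, 294.9, 305.7, 318.3, 314.5, 374.1]  # group 2
analysts = [316.7, 445.0, 278.7, 380.2, 400.9]          # group 1

t, p = stats.ttest_ind(analysts, observers)
print(f"t = {t:.2f}, p = {p:.3f}")  # with samples this small,
                                    # non-significance is unsurprising
```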

Qualitative Results
Interviews provided some qualitative evidence of what the volunteer participants looked at during the viewing of the video clips. The participants were each asked to talk through what they observed on the three video clips which were replayed to them after the eye tracking session. The three clips discussed were selected from the following video references: 3 – street scene; 6 – a mugging; 8 – assault with a dangerous weapon (bottle); 10 – street scene; 11 – indiscriminate violence. From the interviews it would appear that gathering data which might then be useful in the identification of those taking a lead in a disruptive incident is a priority for all observers. A secondary objective, for those watching an incident live, is to determine the location of the incident and the direction of travel of the protagonists, and then to plan which camera(s) in the system might next be deployed to keep track of the incident. Observers also stated that the weather, clubs, pubs, car parks and unlit areas were factors which heightened their expectation of trouble and
increased their vigilance. Play fighting is very often a precursor to serious incidents.

A Qualitative Study of Pre-Incident Eye Movements
As a result of reviewing the results obtained from the eye tracking study, within the context of the overall aim of the project, it was decided to study the eye movements of analysts in the seconds before an incident took place; for it is from within this time period that any proposed artificial intelligence system would be expected to pick up and interpret clues as to what ought to be brought to the attention of CCTV control prior to a potential event. At the time of writing, the authors did not have access to software which would enable the superimposition of several participants’ eye movements simultaneously. However, it has been possible, with a reasonable degree of accuracy, to use a scatter graph of all the participants’ fixation coordinates, produced from data in the Excel spreadsheet, and to superimpose this plot over the eye tracking still of one of the participants. To ensure the correct scaling, the points of one selected participant were aligned with the fixation points plotted by the software for the same participant. The results of such manipulations are shown in the following images (Figs. 2-2 to 2-4). From the video clips used as stimuli it is possible to determine the timing of an incident (such as the raising of a bottle used as an offensive weapon). Furthermore, it is possible, using the eye tracking software, to produce a still of the same moment from an individual participant’s eye movement recordings, over which the participant’s fixations, for a specified period, are superimposed.
Fig. 2-2 shows an incident that took place on Tuesday 13.11.07 at 00:21 hrs at 176 Templar St. One male, White, under thirty years of age, wearing a white peaked cap and a dark windjammer, approaches the rear of a vehicle parked on a well-lit street. The plot of eye movements indicates similar patterns for all participants. It seems that the observers gradually progress to fixating on what the male is doing with his hands, after having fixated on the head of the would-be thief. The image and activity are relatively simple, and the eye movements straightforward as a result. The camera operator has zoomed in on the scene, and there is little of interest for observers to look at other than the man and the car.
The second incident took place in broad daylight on 10th July 1996 at 15:26 (Fig. 2-3). The camera operator zooms in on a small group of four people standing in a knot in the street, placing them centre screen. Two White males and one White female surround a short White male holding a yellow
plastic carrier bag. The tallest male, wearing dark glasses, checks the exit of the bag-carrying boy by standing in front of him. Most fixations seem to be on the White male holding the bag, perhaps because, in the build-up to the incident, this boy drinks from a can/bottle, raising it and throwing his head back. But the body-checking male also attracts a significant proportion of attention. The girl (to the left of the image) is fixated less often. The time and date of the incident are fixated by two observers, and the location (lower left on the screen) by one observer, in the 5 seconds leading up to the incident. There is one other person walking past the scene at the time of the incident; he/she is not fixated upon. The third incident was captured on video on Monday 22 January 2000 at 02:45 (Fig. 2-4). A group of youths are standing by a bus stop whilst other youths run from right to left across the front of the bus stop. One youth, having run past the bus stop, turns and walks slowly from left to right in the direction of the bus stop; he appears to be shouting. Suddenly, one youth punches a bystander in the face as he (the assailant) runs across the bus stop from right to left. In the time period leading up to the incident, the observers’ eye movements alternated between the shouting youth and the youths at the location of the punch. One observer fixated on the hands of the youth doing the shouting and walking back to the bus stop. Another observer fixated on the time and date of the incident. The day and location of the incident are not looked at in this time period, but may have been either previously or after the event. There were no fixations on the youth at the lower left of the shot who appears to be looking back at the incident.


Fig. 2-2: Video 2 Attempted theft from a parked vehicle (using participant 10 as the key)

Fig. 2-3: Video 6 – A mugging in broad daylight (using participant 10 as the key)


Fig. 2-4: Video 11 – Street violence (using participant 13 as the key)
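A rough sketch of the manual overlay described above, assuming fixations have been exported as X,Y pixel pairs and a frame has been grabbed from the clip at the incident time (file names and column names are illustrative):

```python
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import pandas as pd

frame = mpimg.imread("video11_incident_frame.png")  # still at incident time
fix = pd.read_csv("fixations_video11.csv")          # all participants' fixations

# Keep the five seconds leading up to the incident.
pre = fix[(fix["time_dist_s"] >= -5) & (fix["time_dist_s"] < 0)]

fig, ax = plt.subplots()
ax.imshow(frame)                                  # image origin at top left,
ax.scatter(pre["gaze_x_px"], pre["gaze_y_px"],    # matching tracker coordinates
           s=pre["duration_ms"] / 10, alpha=0.5)  # marker size ~ duration
ax.set_axis_off()
plt.savefig("overlay.png", dpi=150)
```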

Conclusion
Eye tracking studies have long shown that movement is a great attractor of attention, and this study is no exception. In the mind of the observers, the importance of movement is perhaps emphasised, as from experience they will expect people to interact with the object of their interest. At night, colour is not present or is very distorted by artificial light, and so cannot be important, but brightness and contrast may be significant. All this indicates that participants pay attention to who is doing what, as well as gathering data on time and date, with perhaps less emphasis being placed on the “where” of the incident. Observers (group 2 participants) might have had different priorities. Areas of the scene, or people, that do not appear to be immediately involved with the incident do not appear to be fixated upon in the 5 seconds building up to it. This would seem to imply that the participants have an idea in their minds as to what could unfold and what is important before the incident actually takes place. It is unlikely that eye movements tell the whole story. Eye movement is also the product of thought processes based on experience and the task in hand, as well as being driven by movement and what there is in the scene.


Future Research
This eye tracking study has shown that experts do look at potential instigators of crimes prior to the crimes being committed. Their eye movements appear to be driven by movement, the instigators’ appearance, their closeness to others, their body language, their hand activity, and whether they are upright or flat on the ground. How might this knowledge be of use in an AI (artificial intelligence) system? We can outline one possible approach. An AI system may have to merge information from a variety of information sources. One could envisage the system generating and allocating a score to each camera location, based on the history of that camera’s location and information collected from the scene being recorded. A location’s score could then be used to indicate the priority with which each scene was uploaded to the observer’s console. The use of the history of each location would mean that each time a crime was committed it would have to be attributed to that location. It would also imply that crimes would have to be codified, and that the system would also have to record the time and date of the offence. These attributes could be represented as a score for the location. To this score would be added the score generated from the scene being viewed, made up of the following elements: time of day, date, place, weather conditions, knowledge of special events, anniversaries and festivals, the number of people in view, the number of people moving and their direction of travel, their rate of change of movement, their orientation (upright or prostrate), their proximity to others, and what they were carrying/wearing in/on their hands. The assessment of rates of change of position would mean that some reference points would need to be established for each location. Points could be given for each of these elements, and the resulting score added to the history score. The aggregate score would be an indicator of the probability of an incident, and would result in the scene being moved to an observer’s screen if a sufficiently high risk score was achieved. It is suggested that the addition of an audio feed would enhance the interpretation of the scene quite significantly. Sounds could be interpreted, scored and added to the location’s risk profile. To test this proposal, it is suggested that observers be asked to score selected video events leading up to an incident, using suitably designed score sheets based on the elements described above. Should this be successful, then such a scoring scheme could be piloted in a live control room for a limited number of cameras prior to the development of AI systems to automate the process. This would enable the assessment of the proposal and highlight unforeseen difficulties with it.
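As a toy sketch of the scoring idea, with hypothetical element names and weights (the authors propose the scheme only in outline, so everything below is illustrative rather than a specification):

```python
# Illustrative weights for some of the scene elements listed above.
WEIGHTS = {
    "night_time": 2.0, "weekend": 1.0, "people_in_view": 0.2,
    "people_moving": 0.5, "prostrate_person": 5.0,
    "close_proximity_group": 3.0, "object_in_hand": 4.0,
}

def scene_score(elements):
    """Score a live scene from observed element counts and flags."""
    return sum(WEIGHTS[name] * value for name, value in elements.items())

def location_risk(history_score, elements):
    """History score plus the live scene score, as in the proposal."""
    return history_score + scene_score(elements)

# A camera covering a club exit at night, with a tight knot of people:
risk = location_risk(
    history_score=12.0,  # accumulated from past coded incidents here
    elements={"night_time": 1, "weekend": 1, "people_in_view": 8,
              "people_moving": 5, "prostrate_person": 0,
              "close_proximity_group": 1, "object_in_hand": 1},
)
print(risk)  # escalate to an observer's screen above some threshold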


Acknowledgements
The authors would like to thank the participants, the CCTV control centre manager Wayne Clamp and his staff at the LeedsWatch CCTV Unit, and Leeds City Council (UK), for their support during this project.

Bibliography

Duchowski, Andrew. Eye Tracking Methodology: Theory and Practice, 2nd edition. London: Springer-Verlag, 2007.
Goldberg, Joseph, and Xerxes Kotval. “Computer interface evaluation using eye movements: Methods and constructs.” International Journal of Industrial Ergonomics 24 (1999): 631-645.
Goldberg, Joseph, and Anna Wichansky. “Eye tracking in usability evaluation: A practitioner’s guide.” In J. Hyönä, R. Radach, & H. Deubel (Eds.), The Mind’s Eye: Cognitive and Applied Aspects of Eye Movement Research, 493-516. Amsterdam: Elsevier, 2003.
Gregory, Richard. Eye and Brain: The Psychology of Seeing, 5th ed. Oxford: Oxford University Press, 1998.
Hochberg, Julian, Mary Peterson, Barbara Gillam, and H. A. Sedgwick. In the Mind’s Eye: Julian Hochberg on the Perception of Pictures, Films, and the World. Oxford; New York: Oxford University Press, 2007.
Jacob, Robert, and Keith Karn. “Eye tracking in Human-Computer Interaction and usability research: Ready to deliver the promises.” In J. Hyönä, R. Radach, & H. Deubel (Eds.), The Mind’s Eye: Cognitive and Applied Aspects of Eye Movement Research, 573-605. Amsterdam: Elsevier, 2003.
Just, Marcel Adam, and Patricia Carpenter. “Eye fixations and cognitive processes.” Cognitive Psychology 8 (1976): 441-480.
Palmer, Stephen. Vision Science: Photons to Phenomenology. Cambridge: MIT Press, 1999.
Peebles, David, and Peter Cheng. “Extending task analytic models of graph-based reasoning: A cognitive model of problem solving with Cartesian graphs in ACT-R/PM.” Cognitive Systems Research 3 (2002): 77–86.
Rayner, Keith. “Eye movements in reading and information processing: 20 years of research.” Psychological Bulletin 124 (1998): 372–422.
Tobii. Eye Tracking as a Tool in Package and Shelf Testing. Tobii Technology AB, Stockholm, Sweden: [online] www.tobii.com, 2008.
Underwood, Geoffrey, David Crundall, and Katherine Hodson. “Confirming statements about pictures of natural scenes: evidence of the processing of gist from eye movements.” Perception 34 (2005): 1069-1082.
Webb, Natalie, and James Renshaw. “Eyetracking in HCI.” In Research Methods for Human-Computer Interaction, edited by Paul Cairns and Anna L. Cox, 35-69. Cambridge: Cambridge University Press, 2008.

CHAPTER THREE

EYE OF THE BEHOLDER: VISUAL SEARCH, ATTENTION AND PRODUCT CHOICE

MARIJA BANOVIĆ (1,2,3), PEDRO J. ROSA (4,5,6,7,8) AND PEDRO GAMITO (5,6)

1 Faculty of Economics, Universidade Nova de Lisboa.
2 Faculty of Veterinary Medicine, Technical University of Lisbon.
3 Aarhus School of Business – MAPP, University of Aarhus.
4 Instituto Universitário de Lisboa (ISCTE-IUL), Cis-IUL.
5 School of Psychology and Life Sciences, Lusophone University of Humanities and Technologies (ULHT), Lisbon.
6 COPELABS – Cognition and People-centric Computing Laboratories (ULHT).
7 Centro de Investigação em Psicologia do ISMAT, Portimão.
8 GIINCO – Grupo Internacional de Investigación Neuro-Conductual, Barranquilla, Colombia.

“Good Lord Boyet, my beauty, though but mean,
Needs not the painted flourish of your praise:
Beauty is bought by judgement of the eye,
Not utter’d by base sale of chapmen’s tongues.”
- Love’s Labour’s Lost, William Shakespeare

Abstract
This chapter focuses on the process of visual search during product choice, in which stimuli are absorbed and interpreted by the consumer. It highlights the effects of visual attention on perceptual performance. The chapter reviews the mechanisms and consequences of the selection and deployment of visual attention across stimuli. It emphasizes that the way stimuli are presented is decisive for whether the consumer will notice them and make sense of them. It also presents a case study that focuses on
consumers’ perceptual analyses during the choice of consumer nondurables, as exposed by patterns of visual attention.

Introduction

Eye movements and visual attention
What you see is what you pay attention to. We live in an information society, exposed to far more perceptual information than we are capable of, or willing, to process. When browsing through a shop, we are often overwhelmed after going through aisles filled with hundreds of competing brands. Since almost any visual environment is characterized by complexity and information overload, and as the brain’s capacity to process information is limited, we are often selective about what we pay attention to, attending to only a small portion of the information we are exposed to. Visual attention, thus, is important on any occasion in which actions are based on visual information from an environment overflowing with stimuli. When shopping, consumers are exposed to an environment with far more perceptual information than can be effectively processed. To be able to cope with the potential information overload, the brain is equipped with certain attentional mechanisms. These mechanisms allow consumers to select the information most relevant to their ongoing behaviour and/or to ignore irrelevant or interfering information, as well as to change or improve the selection of information according to their evolving state and goals. In other words, consumers practise a form of psychological economy, actively seeking, selecting and interpreting stimuli, and remaining aware of only specific information, so as to avoid being overwhelmed by stimulus clutter. Thus, consumers undergo stages of perceptual analysis when devoting visual attention to stimuli. Consumers start by investigating the sensory features of a stimulus, such as colour and size, then interpret the stimulus on categorical cues, such as brand name and label, and select certain cues over others. These stages of selecting and interpreting stimuli make up the process of perception, in which efficient and reliable attentional selection is critical, as the various relevant cues emerge amidst cluttered stimulus features. If everything is in the “eye of the beholder”, then research on eye movements during visual search clearly has an important role to play in our understanding of how consumers look at products and how their eye movements are related to product choice.


Visual search
Simply put, when we search for a product in a visual array, we move our eyes until we bring a particular segment of the visible field into the centre of vision, fixating it so that we may see this segment in fine-grained detail and so that our mind may successfully process it. A single fixation can rapidly give us a general idea of a product. Thus, we can focus our attention on the product or on certain product-relevant information. During this immobile episode, or fixation, when our eyes remain still, and throughout every subsequent immobile episode, we collect new information. However, only by moving our eyes are we able to fully process the product information. During the actual eye movement, or saccade, vision is suppressed (while mental processing remains active). Thus, these movements of our eyes consist of two different elements: fixations and saccades (Buswell 1935). Saccades are fast, ballistic eye movements, lasting around 20–40 milliseconds, while fixations are episodes when the eye remains relatively immobile, lasting around 200–400 milliseconds (Rayner 1998). The blueprint of fixations and saccades across a stimulus represents a scanpath (Noton and Stark 1971). Thus, if we can track someone’s eye movements and follow the scanpath of deployed attention, we can have an insight into where this person looked and what has drawn their attention, and this gives us a clue as to how this person perceived the viewed scene (Duchowski 2007). Since eye movements are motor movements, it takes time to plan and execute a saccade. Of course, the durations of saccades and fixations depend on the difficulty of a visual search display and how cluttered it is, as well as on the nature of the search task. In a simple visual search, where subjects only monitor when a target moves from one location to another, it takes as little as 175 milliseconds for subjects to move their eyes (McPeek et al. 2000; Rayner et al. 1983). However, when the visual search display is very cluttered, the number and duration of fixations increase, while saccade size decreases with the complexity of the display (Vlaskamp and Hooge 2006). Thus, visual search is more costly when the search display is cluttered. Additionally, the organization of the search display and other factors, such as the colour of the items and the similarity of their shape to that of the target, can all have an effect on the pattern of eye movements (Zelinsky 2005; Williams et al. 2005). Eye movements and attention are closely linked, and perhaps the most important function of attention is to guide eye movements. Eye movements directed to a certain event are preceded by an attentional shift to that event, thus making the union of eye movements and visual attention compulsory (Hoffman 1998). Visual attention is often seen as a
“spotlight”, “glue” and “window” that improves the efficiency of revealing events within its range, integrates separate features so that they are perceived as a unified whole, and selects patterns in the “visual buffer” (Posner et al. 1980; Treisman 1986). Thus, attention plays an important role in guiding the eye to informative visual areas, in integrating the separate snapshots captured by the eye movements, and in filtering relevant from irrelevant information (Hoffman 1998). In any visual search, attention can be driven internally or externally, that is, endogenously or exogenously (Posner 1980). Endogenous attention, also known as top-down or goal-driven attention, is voluntary, effortful, with a sustained time course, and presumed to be under a person’s overt control (Yantis 1998). On the other hand, exogenous attention, also referred to as stimulus-driven or bottom-up attention, is automatic, with a rapid, transient time course (Cheal and Lyon 1991). Both bottom-up and top-down factors interact to determine covert attention patterns, as revealed in overt eye movements. The salience of stimulus locations and items, originating from perceptual feature contrasts (bottom-up), as well as their informativeness, originating from their goal-relevance (top-down), affects attention (Yantis and Jonides 1990; Öhman et al. 2001). There is a bulk of evidence on bottom-up, stimulus-driven attention showing that perceptual features such as the colour, edges, luminance, shapes and sizes of objects make a difference. For example, in a visual search, feature singletons, such as a red item among green distractors or a vertical item amongst horizontal items, can draw attention, but spatial cues and abrupt visual onsets (sudden luminance changes) can capture attention even more effectively (Yantis and Jonides 1984). This will of course depend on the nature of the visual search task (Theeuwes 1991, 1992). On the other hand, there is much less evidence on top-down, voluntary mechanisms in capturing attention. Top-down factors such as a person’s search goals, memory, preferences and choices carry important weight in visual search, by enhancing and selecting those features that are diagnostic and by suppressing those that are nondiagnostic (Wedel and Pieters 2008). Generally, every visual search model proposes that attention is determined by the interplay between bottom-up and top-down perceptual factors (Duncan and Humphreys 1989; Müller et al. 1994; Wolfe 1994). As attention is reflected in eye movements, research on eye movements during visual search is obviously quite relevant for understanding how people observe different stimuli. In consumer research, visual search is an area that has been the object of considerable study over the past few decades. Unfortunately, the majority of this research has often been undertaken without
considering eye movements and visual attention (Pieters and Warlop 1999). In particular, eye movements and visual attention have generally not been taken into account, and it has often been thought that they are not principally important in understanding visual search. However, this position appears to be largely changing, as there are now various studies using measures of eye movements and visual attention to understand the visual search process (for a review, see Wedel and Pieters 2008). Many of these studies focus on using a search task to reveal eye movement and visual attention metrics in order to optimize the design of packages, shelves, web pages, catalogues, billboards, and print and TV ads. In sum, the rising acknowledgment of the essential role of visual attention in consumer visual search and behaviour, and the recognition that eye movements are in fact an indication of these processes, has promoted the use of eye movement systems in consumer research.
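To make such metrics concrete, here is a small sketch computing common scanpath measures from a fixation list; the data layout is assumed, with fields mirroring those produced by most trackers.

```python
import math

# Each fixation: (x_px, y_px, duration_ms), in scanpath order.
scanpath = [(250, 337, 199), (205, 317, 239), (307, 422, 179), (455, 444, 479)]

n_fixations = len(scanpath)
mean_duration = sum(f[2] for f in scanpath) / n_fixations

# Saccade amplitudes: distance between consecutive fixation centres.
amplitudes = [math.dist(a[:2], b[:2]) for a, b in zip(scanpath, scanpath[1:])]
scanpath_length = sum(amplitudes)  # total distance travelled, in pixels

print(f"{n_fixations} fixations, mean duration {mean_duration:.0f} ms, "
      f"scanpath length {scanpath_length:.0f} px")
```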

Visual search and choice behaviour
The visual environment rarely contains only one or a few items worthy of attention. When looking for a designated target item among a number of distracting items in a visual search task, attention and eye movements are guided towards informative regions likely to contain the target item (Hidalgo et al. 2005; Chun and Jiang 1998, 1999). Since it is unclear where the target item is among its distractors, the subject needs to reduce spatial uncertainty when searching for it. Likewise, when looking for a preferred product among an abundance of products in a choice task, a consumer’s attention is directed towards the shelves most likely to hold the desired product (Chandon et al. 2009). As a consumer needs to make a decision on which item to choose amid the available alternatives, reducing preference (choice) uncertainty requires downsizing the set of options. Using these two simple paradigms, a researcher can analyse how visual stimuli are distinguished and compared, which features of a stimulus draw attention, how attention moves from one item to the next, how a person chooses among a number of alternatives, and how what was attended is verified (Chun and Wolfe 1996). Due to its resemblance to real situations, the visual search paradigm has been widely used in research, with numerous variations on the basic search task (Braun and Julesz 1998; Fehd and Seiffert 2008). In a standard visual search paradigm, the subject is presented with a display that either does or does not contain a target stimulus (e.g., letters, coloured line segments, etc.) among a variable number of distracting stimuli. The total number of
stimuli in the display is referred to as set size. Usually, in half of the trials the target stimulus is present, while in the other half only distracting stimuli are in attendance. Thus, the target item is either available or missing, and participants must make a decision as quickly and accurately as possible on whether the target is present or absent. Participants are typically instructed to press one button if the target is present, and another if the target is absent, as quickly and accurately as possible. The reaction time (RT) needed for these decisions and their accuracy are measured. RT is usually analysed as a function of the set size given the search efficiency. A vital understanding of the search and attention mechanisms can be obtained by analysing search efficiency. If the search function increases only little with increasing set it is understood that all items in the display are searched for simultaneously or in parallel, i.e., efficiently. However, if the search functions exhibit a high increase, it is assumed that the individual items are searched successively or serially, i.e., inefficiently (Treisman and Gelade 1980). The visual search becomes more efficient when the target-distractor distinctions get larger in a variety of features such as colour, orientation, curvature, size, motion, shape, gloss and some 3-D properties (such as depth) (for a review see Wolfe 1998). On the other hand, as target-distractor differences get smaller and with increasing distractor heterogeneity, search becomes less efficient (Foster and Westland 1992; Duncan and Humphreys 1989). Efficient visual searches produce high accuracy independently of set size even in brief displays, while in less efficient visual searches accuracy decreases with the increase of the set size, except when exposure time is increased (Palmer 1994). Overall search performance also improves over repeated presentations of a search display, since it produces a robust association between a particular display and a target (Hidalgo et al. 2005). Past research on consumer decision-making is traditionally concerned with the comparison of decisions to certain standards, which allow evaluation of the decisions as better or worse (for a review see Koehler and Harvey 2004). A typical decision task is usually structured to capture the preference uncertainty that people face when required to evaluate alternatives composed of multiple attributes. In this type of task, participants are asked to make a choice among multiple product options and their related attributes. Products are reduced to bundles of attributes and participants are asked to assess each attribute’s relative importance, where each option is evaluated on each attribute, yielding a set of utilities, which are then aggregated in accordance with the related attribute weights, thus yielding the overall utility. Choice then tends to result from the option that brings the greatest utility. However, the cognitive processes that guide
people's decisions often do not conform to the requirements of consistency imposed by these traditional models, and there have been doubts about their generalization to everyday consumer choice behaviour (Louviere and Meyer 2008). Instead of seeing products as bundles of comparable features and evaluating them in a structured and strategic way to assess each option's overall worth, consumers actually pay attention to only a fraction of the information in the cluttered shopping environment, where some information has to "win" over the rest (Russo and Dosher 1983; Pieters and Warlop 1999). Furthermore, the values and weights assigned to attributes can fluctuate in unforeseen ways depending on the context of the decision and/or the uncertainty people experience when handling different options (Kahneman and Tversky 1979; Tversky and Shafir 1992).

As consumers look at and evaluate only a small fraction of the multitude of alternatives cluttering shops, it is not surprising that visual attention measures hold information about consumers' choices. Only a few studies have examined the role of visual attention during product or brand choice. These studies have focused mainly on understanding the procedure consumers undertake in visual search and choice behaviour, examining the influence of bottom-up factors in the competition for attention. In a study using a one-way mirror and a recording camera to observe consumers' eye movements, van Raaij (1977) found that consumers persistently used paired comparisons between alternatives in order to make their choices. Likewise, Russo and Leclerc (1994) investigated the choice process for nondurables by direct observation of eye movements from video recordings made through a one-way mirror in laboratory settings. They observed that the choice process contains three different stages, which they interpreted as orientation, evaluation and verification. The orientation stage involved an overview of the product display with some initial screening, the evaluation stage involved direct comparisons between alternative products, and the verification stage involved re-examination of an already chosen product. Russo and Leclerc pointed out that consumers spent more time viewing products than choosing them, demonstrating the significance of attention for subsequent choice behaviour. That study also suggested that prior research on consumer choice processes, traditionally concerned with comparing decisions to certain standards, had the weakness of not representing the actual choice processes that adjust to the immediate purchasing environment, in which alternative choices are the norm.

Similarly to Russo and Leclerc's (1994) orientation stage, Janiszewski (1998) studied visual search behaviour and the relationship between the size of objects in product displays and the
amount of attention devoted to them. Janiszewski's study showed the influence of bottom-up mechanisms in the competition for attention. Specifically, he pointed out that the distractor items surrounding a focal item influence the amount of time a person spends looking at that item and the probability of recalling focal-item-related information, showing that a careful arrangement of the items on display can maximize attention to the display as a whole. Lohse (1997) observed eye movements during consumer visual search and choice among different businesses in a telephone directory, and found that consumers search successively, serially, but not exhaustively, so that some advertisements remain unnoticed. In that study, the size and colour of the ads had a strong effect: consumers observed nearly all the quarter-page display ads, but only one quarter of the plain listings, noticing colour ads more than non-colour ads, and bold listings more than plain listings. That research emphasized the weight of perceptual features, such as colour and size, in consumers' visual search and, similarly to Russo and Leclerc (1994), it indicated the importance of attention for subsequent choices.

Two other studies, focusing on both bottom-up and top-down factors in visual attention during choice behaviour, have demonstrated the potential of using eye-movement analysis to infer higher-order cognitive processes. Pieters and Warlop (1999) investigated the impact of time pressure and task motivation on visual attention during brand choice. Their analysis of eye movements showed that, during brand choice, consumers adjust to time pressure by accelerating visual scanning, by filtering information and by changing their scanning strategy, as observed in a decrease in the average duration of fixations and an increasing number of inter-brand saccades. Moreover, under time pressure, consumers filter out more textual information and less pictorial information, while consumers with high task motivation filter out less brand information and more pictorial information, as indicated by longer average fixation durations and reduced levels of inter-brand saccades. Pieters and Warlop also pointed out that, independently of time pressure and task motivation, the chosen brand receives significantly more intra-brand and inter-brand saccades and longer fixation durations than non-chosen brands, showing that brand preference can be predicted from a scanpath of eye movements. Chandon and colleagues (2008) measured the value of point-of-purchase (POP) marketing with commercial eye-tracking data, showing that the more consumers look at a brand, the more they consider it for purchase. These researchers pointed out that this effect is prevalent for
brands with lower memory-based equity, in line with the interaction of bottom-up and top-down processes in visual attention. In sum, the application of eye tracking opens a bright future for consumer research, advancing a better understanding of consumer choice and search behaviour in the overwhelmingly visual world of consumption. Consumer research no longer needs to rely solely on subjective responses and memory, as eye-tracking methodologies have the potential to place the study of consumer visual search and choice behaviour on a more scientifically solid footing.

A Case Study

The study reported in this chapter is part of a larger research project that has been submitted for publication elsewhere. It involves visual attention performance metrics during product choice. Previous research has indicated that variations in the visual features of fresh nondurables often make it difficult for consumers to evaluate their quality, and consequently to choose a product (Brunsø et al. 2002; Grunert et al. 2004; Banović et al. 2009; Banović et al. 2010). However, those studies have emphasized perceptual analyses conducted only after consumers have already been exposed to the stimulus and it has already captured their attention. These analyses rely largely on consumer self-reported data and measurement-induced inferences (Kardes et al. 2004). Although such self-reports may provide valuable knowledge about the product, the fact that consumers are sometimes unable to articulate what they actually want is often disregarded (Verbeke and Vackier 2004). In view of the dominance of visual sensory features in fresh foodstuffs, visual attention should be a key coordinating mechanism with close correspondence to actual consumer behaviour, information processing and higher-order cognitive processing (LaBerge 1995; Rizzolatti et al. 1994). Here, eye-tracking methodology, with its high degree of flexibility, helps capture consumer behaviour in a natural and unbiased fashion, revealing what really attracts consumers' attention, eliminating bias or errors related to human recall, and offering researchers a firmer scientific footing than self-report measures (Ballard and Hayhoe 2005). Surprisingly, even though its potential has recently been recognized in marketing and used in studies on advertising (Maughan et al. 2007) and packaged goods (Gofman et al. 2009; Pieters and Warlop 1999), there is a lack of sustained attention to eye-tracking applications regarding the sensory features of consumer nondurables (Russo and Leclerc 1994).
Visual search and choice paradigm

The shopping environment rarely presents only one or two objects worthy of attention. The visual search paradigm offers a good resemblance to this real-life situation. In this paradigm, subjects look for a designated object among a number of distracting objects (Chun and Wolfe 2005). Likewise, in a choice task, the subject's aim is to select one object out of a set of multiple objects. However, in a visual search task, it is unclear where the designated object is among its distractors, and participants need to reduce spatial uncertainty. In contrast, in a choice task, it is preference uncertainty that needs to be reduced, as participants are required to decide which item to choose among a number of available alternatives (Wedel and Pieters 2008). These simple concepts allow for the analysis of how visual stimuli are distinguished and compared, what stimulus properties draw attention, how attention moves from one item to the next, how participants choose among a number of alternatives, and how they verify what was attended. Building on previous research, and using an eye-tracking methodology, we observed scanpaths of eye movements across a set of nondurable alternatives. As previously shown (Russo and Leclerc 1994; Pieters and Warlop 1999; Chandon et al. 2008), the more consumers pay attention to a product, the more they consider it. We thus expected visual search and product choice to be positively related, as revealed by longer fixation durations and a higher number of fixations for the chosen product compared to the non-chosen products. Furthermore, as visual search improves and response time decreases over repeated presentations of a search display, we also expected repeated searches to decrease the response time for the chosen product. Moreover, we expected that repeated searches (i.e., prior experiences or contexts) would have a learning effect on visual attention, resulting in a decrease both in the time needed to fixate the chosen product and in the duration of this first fixation (Hidalgo et al. 2005; Banović et al. 2012).

Method

Subjects and stimuli

113 participants were recruited on a Lisbon campus to participate in an 'eye-tracking study on nondurables'. Each session lasted approximately 8 minutes. Analyses regarding visual search in product choice are based on 106 participants; seven subjects were eliminated due to incomplete
eye-recordings. Analyses in relation to response time are based on a subsample of 63 subjects. The stimuli were twenty-four colour slides showing a choice set consisting of pictures of four nondurables, mainly fresh foodstuffs. The pictures were displayed as two rows of two different nondurables. In total, twenty-four pictures were randomly assigned to twelve colour slides, changing their position each time.
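The chapter does not detail the counterbalancing scheme beyond this; purely as an illustration, a set of such displays could be assembled along the following lines (the file names and the sampling rule are hypothetical):

```python
import random

# Hypothetical image files standing in for the twenty-four product pictures.
pictures = [f"nondurable_{i:02d}.jpg" for i in range(1, 25)]

slides = []
for _ in range(12):
    choice_set = random.sample(pictures, 4)  # four different nondurables per slide
    random.shuffle(choice_set)               # vary each product's on-screen position
    slides.append(choice_set)                # list order maps onto the 2 x 2 grid
```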

Task and tracking procedure

Eye movements were tracked with a Tobii T60 eye tracker in a soundproof and dimly lit room. Each subject participated individually. Before starting the experiment, but already in the lab, subjects were verbally informed that their eye movements would be recorded while they observed and chose nondurables. Subjects were not told that the actual purpose of the study was to measure the general salience of the products. Participants were seated in front of the eye tracker, which was calibrated individually for each of them, at a distance of 65 cm from eyes to screen. Each participant started the experiment looking at the centre of the screen, always facing the same direction. In each task, four stimuli, each 600 x 496 pixels, within a square of 640 x 512 pixels, were displayed on a black background at a resolution of 1280 x 1024 pixels. A red dot 0.5º in diameter was used as a fixation point. The experiment consisted of twenty-four tasks. In each task, the search display consisted of four stimuli of fresh nondurables; each image had been randomly assigned to different combinations. The experiment started with instructions in which participants were asked to indicate verbally their choice of preferred product from those that appeared on screen. There were two practice phases, in which participants gazed at four stimuli unrelated to the subsequent tasks, in order to avoid incomplete eye-recordings. Each task started with a fixation point for 500 ms, followed by the search display. Participants had six seconds to inspect the stimuli, followed by a verbal response indicating their product choice. Response time was free: participants were instructed to verbalize the preferred product by indicating the position in which it had appeared (i.e., up: left/right; down: left/right) and to press the "space" key when they were done. There was an interval of 5 seconds with a blank slide between tasks. Each session lasted approximately 10 minutes.
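The timing of a single task can be summarized as a short trial loop. The sketch below uses PsychoPy-style calls purely as an assumption — the chapter does not say which presentation software drove the Tobii T60 — with the window size, stimulus sizes and durations taken from the description above:

```python
from psychopy import core, event, visual

win = visual.Window(size=(1280, 1024), color="black", units="pix")
fixation = visual.Circle(win, radius=8, fillColor="red")  # stands in for the 0.5 deg dot

def run_task(image_paths):
    fixation.draw(); win.flip(); core.wait(0.5)          # fixation point, 500 ms
    # Four 600 x 496 px stimuli centred in the four 640 x 512 px quadrants.
    positions = [(-320, 256), (320, 256), (-320, -256), (320, -256)]
    for path, pos in zip(image_paths, positions):
        visual.ImageStim(win, image=path, size=(600, 496), pos=pos).draw()
    win.flip(); core.wait(6.0)                           # 6 s inspection period
    win.flip()                                           # blank; verbal choice given here
    event.waitKeys(keyList=["space"])                    # press space when done
    win.flip(); core.wait(5.0)                           # 5 s blank inter-task interval
```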
Visual attention measures

The focus of this study was on visual attention to the relevant areas of the product display, i.e., areas of interest (AOIs). Each product was considered a separate AOI; therefore, each search display had four major AOIs, one for each of the four images of nondurables. Several measures of visual attention were used. Total fixation duration (TFD) represents the mean of all individual fixation durations (measured in seconds) within an AOI. Fixation number (FN) counts the number of times a participant fixated on an AOI; if, during the recording, a participant left and returned to the same AOI, the new fixations on the AOI were included in the count. If, by the end of the recording, a participant had not fixated on an AOI, the respective value was not computed. The response time (RT), in seconds, that participants needed for their decisions when choosing the preferred product was also measured. In connection with RT, time to first fixation (TFF) and first fixation duration (FFD) were also used. TFF measures the time in seconds from the appearance of the search display until the participant fixates on the active AOI. FFD represents the duration in seconds of the first fixation on an AOI.
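Given a table of fixations tagged with their AOI, these measures reduce to simple aggregations. A minimal sketch (the tuple layout is illustrative; in practice such a table would come from the eye-tracker's export):

```python
# Each fixation: (aoi, onset_s, duration_s), listed in recording order.
fixations = [("A", 0.62, 0.31), ("B", 1.05, 0.24), ("A", 1.41, 0.45)]

def aoi_metrics(fixations, aoi):
    hits = [f for f in fixations if f[0] == aoi]
    if not hits:
        return None  # AOI never fixated: no value computed, as described above
    durations = [d for _, _, d in hits]
    return {
        "TFD": sum(durations) / len(durations),  # mean fixation duration in the AOI (s)
        "FN": len(hits),                         # fixation count, revisits included
        "TFF": hits[0][1],                       # time to first fixation (s)
        "FFD": hits[0][2],                       # first fixation duration (s)
    }

print(aoi_metrics(fixations, "A"))
```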

Results

Visual search and product choice

We started by analysing whether visual attention and product choice were positively related for each of the four products in the search display. Our main interest was not to investigate which specific products were chosen over others, but whether the choice of a product could be explained by the visual attention measures for that product, as well as by the difference between attention to the chosen and non-chosen products in a search display. Duration of fixations and number of fixations were used as measures of visual attention. Each consumer allocated attention to each of the four products in the search display, and chose one product out of the four in the set. Twelve sets were shown, with products appearing randomly in different sets. On average, each product attracted at least one fixation. The average number of fixations across products was 4, while the average fixation duration across products was 1.34 seconds. We found a clear correlation between the number of fixations on a product, average fixation duration, and subsequent spontaneous choice of that product (all r's > 0.30, all p's
< 0.05). This was shown by a longer fixation duration for the chosen product than for the non-chosen products [t(47) = 23.37, p < 0.001; M(chosen) = 2.16, SD(chosen) = 0.40; M(non-chosen) = 1.05, SD(non-chosen) = 0.18]. Likewise, the number of fixations was higher for the chosen product than for the non-chosen products [t(47) = 2.23, p < 0.001; M(chosen) = 6.73, SD(chosen) = 0.99; M(non-chosen) = 3.60, SD(non-chosen) = 0.53]. Furthermore, the results indicate that, on average, the fixation duration on the chosen product is almost 1.108 seconds longer than on the non-chosen products (Figure 3-1). This is quite a large difference, bearing in mind that the overall fixation duration in the sample as a whole is 1.335 seconds. The chosen products received on average about 3 more fixations than the non-chosen products did (Figure 3-2).

Fig. 3-1: Duration of fixations on chosen product and non-chosen products

Fig. 3-2: Number of fixations on chosen product and non-chosen products
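The chosen versus non-chosen contrasts reported above are paired comparisons across displays. A sketch of the corresponding test with SciPy, on placeholder arrays rather than the study's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Placeholder mean fixation durations (s) for 48 paired observations.
chosen = rng.normal(2.16, 0.40, size=48)
non_chosen = rng.normal(1.05, 0.18, size=48)

t, p = stats.ttest_rel(chosen, non_chosen)  # paired t-test, df = 48 - 1 = 47
print(f"t(47) = {t:.2f}, p = {p:.3g}")
```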

These results clearly show that visual attention measures predict product choice. Thus, the key point here is that the eye-tracking-based
attentional metrics used in this case study are adequate measures, capable of predicting product choice. The earlier discussion of the theory of visual search and eye movements supports this view (Russo and Leclerc 1994; Pieters and Warlop 1999; Chandon et al. 2008; Wedel and Pieters 2008).

Response time and product choice

We were also interested in analysing how prior experience guides visual search, by tracking eye movements. Participants explored a search display for a preferred product and decided on a product choice. The goal was to identify which stage of visual search and product choice benefits from prior experience. Therefore, we analysed to what extent prior experience influenced the timing of product choice: we explored the time a subject needs to choose the product over twelve stages of search display, and whether repetition of this action influenced the time needed to fixate the chosen product (i.e., TFF) and the duration of this first fixation (i.e., FFD). The results clearly show that visual search performance improves over repeated presentations of a product (Figure 3-3).

Fig. 3-3: Response time during product choice

Figure 3-3 presents the average response time (RT) data. As expected, learning occurred across repetitions, but with different magnitudes. There was a significant effect of search display stage on RT [F(11) = 9.73, p < 0.001]. The effect of learning on the time to first fixation varied with the stage of search display and was statistically significant [F(11) = 3.85, p < 0.001]. The difference between the first and twelfth stages in the time from the initial
fixation to entry into the product region is shown in Figure 3-4. As can be seen there, subjects needed less time to fixate on products they were about to choose. Even though there was a significant effect of learning on time to first fixation, there was no significant difference in time to first fixation between the chosen product and the non-chosen products, t(11) = 1.57, p = 0.141. On the other hand, there was a marginally significant difference in first fixation duration between the chosen product and the non-chosen products, t(11) = 2.205, p = 0.050.

Fig. 3-4: Time to first fixation on chosen product and non-chosen products

Figure 3-5 shows a clear learning gain in the duration of the first fixation between the first and twelfth stages of search display. Moreover, on average, the first fixation duration for a chosen product dropped by 97 milliseconds between those stages. These results suggest that visual search is strongly influenced by prior experience, which reduces the sensory information required to process the product features. There is also evidence that the onset of a repetition of familiar scenes may begin to initiate the planning of a more accurate response in visual search and product choice, improving response time. This is consistent with prior observations of the effects of prior experience on visual search time and of gaze duration gains from scene context learning (Hidalgo et al. 2005; Chun and Jiang 1999; Chun and Jiang 1998).
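The stage effects above are repeated-measures F-tests with eleven numerator degrees of freedom (twelve stages). A sketch of that analysis with statsmodels, again on placeholder data mimicking a learning effect:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
n_subjects, n_stages = 63, 12
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subjects), n_stages),
    "stage": np.tile(np.arange(1, n_stages + 1), n_subjects),
})
# Placeholder RTs that shrink across stages, i.e. a learning effect.
df["rt"] = 3.0 - 0.08 * df["stage"] + rng.normal(0.0, 0.5, len(df))

result = AnovaRM(df, depvar="rt", subject="subject", within=["stage"]).fit()
print(result)  # stage effect reported as F(11, 682) on these placeholder data
```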
Fig. 3-5: First fixation duration on chosen product and non-chosen products

Conclusion

It is well known that everything is in the "eye of the beholder" and that our "eyes guide us" through life in general, and across store shelf displays in particular, when searching and making decisions. Thus, managing what people "see", and tracking it to understand the visual search process, is of utmost importance for both academic and practical purposes. Eye-movement research clearly opens a new window of opportunity for the development of more precise theories of consumer information acquisition and perceptual processes, in which researchers do not have to rely solely on consumers' self-report measures, but can instead capture consumer behaviour in a more natural and unbiased fashion, revealing what really attracts consumers' attention and eliminating bias or errors related to human recall. Eye-movement research can be applied to a wealth of naturally occurring stimuli in the consumer environment, both globally, such as a store shelf or a fresh foodstuff display, and locally, such as product design or packaging. An efficient investigation of different variables, such as the display characteristics of various products or their design, can provide answers as to how these variables influence consumer attention. We have presented here a case study that examines the role of visual attention in the choice of consumer nondurables, using simple measures derived from an eye-tracking procedure. This study's main contribution is showing the effect of prior contexts on visual search and product choice. Elaborate visual search patterns are not easily accessed with traditional methods; in contrast, with an eye-tracking procedure they are quite effortlessly obtained. The results shown here emphasize that overall visual
search performance adjusts rapidly over repeated presentations of products, and two important contextual effects arise. Firstly, with repeated exposure, subjects' response times for product choice become faster. Secondly, repetition of the same action decreases both the time subjects need to fixate the chosen product and the duration of this first fixation. The results thereby show a time benefit in the exploration stage of visual search. Furthermore, we have shown not only that visual attention measures have a significant positive effect on the product choice of consumer nondurables, but also that they have the ability to predict product choice. In sum, eye tracking is quite a user-friendly procedure: it takes only a few seconds of a consumer's time, does not involve verbal questioning (which can interrupt ongoing product processing), and offers a firmer scientific footing for visual consumer research. Of course, eye movements in real life and under standard conditions are heavily influenced by higher-order cognitive processes; therefore, more investigation into top-down effects on visual search and choice behaviour in consumer research is required. Eye-tracking, combined with appropriate experimental procedures, can give us a view of the "cognitive iceberg" of both bottom-up and top-down processes by simply following the eye movements. In that way, eye-tracking procedures can contribute significantly to consumer research, by offering an unparalleled view of consumers' day-to-day processing of visual stimuli and by giving extrapolative validity to optimized decisions on different visual stimuli. We are confident that in the future there will be wider use of eye-movement procedures in consumer research, which will help resolve controversial findings regarding consumer perceptual processes; more in-depth investigation into visual search, together with integration with vision and attention theory, will lead to a better understanding of consumer perceptual processes in reaction to visual stimuli.

Acknowledgments

In the preparation of this chapter, Marija Banović was supported by Grant SFRH/BPD/63067/2009 from the FCT – Fundação para a Ciência e a Tecnologia, Ministério da Educação e Ciência, Portugal.

Bibliography

Banović, Marija, Klaus G. Grunert, Madalena M. Barreira, and Magda Aguiar Fontes, "Beef quality perception at the point of purchase: A
study from Portugal," Food Quality and Preference 20, (2009): 335-342.
Banović, Marija, Madalena M. Barreira, Klaus G. Grunert, and Magda Aguiar Fontes, "Consumers' quality perception of national branded, national store branded, and imported store branded beef," Meat Science 84, (2010): 54-65.
Banović, Marija, Magda Aguiar Fontes, Madalena M. Barreira, and Klaus G. Grunert, "Impact of product familiarity on beef quality perception," Agribusiness 28, (2012): 157-172.
Ballard, Dana, and Mary Hayhoe, "Eye movements in natural behaviour," Trends in Cognitive Sciences 9, (2005): 188-194.
Brunsø, Karen, Thomas A. Fjord, and Klaus G. Grunert, "Consumers' food choice and quality perception," Working paper no. 77, Aarhus School of Business (ISSN 0907 2101, June 2002).
Buswell, Guy T., How People Look at Pictures: A Study of the Psychology of Perception in Art. Chicago: University of Chicago Press, 1935.
Braun, Jochen, and Bela Julesz, "Withdrawing attention at little or no cost: Detection and discrimination tasks," Perception & Psychophysics 60, (1998): 1–23.
Chandon, Pierre, J. Wesley Hutchinson, Eric T. Bradlow, and Scott H. Young, "Measuring the Value of Point-of-Purchase Marketing with Commercial Eye-Tracking Data," in Visual Marketing: From Attention to Action, edited by Michel Wedel and Rik Pieters. New York: Lawrence Erlbaum, Taylor & Francis Group, 2008.
Cheal, MaryLou, and Don R. Lyon, "Central and peripheral precuing of forced-choice discrimination," Quarterly Journal of Experimental Psychology A (1991): 859–880.
Chun, Marvin M., and Yuhong Jiang, "Contextual cueing: Implicit learning and memory of visual context guides spatial attention," Cognitive Psychology 36, (1998): 28-71.
Chun, Marvin M., and Yuhong Jiang, "Top-down attentional guidance based on implicit learning of visual covariation," Psychological Science 10, (1999): 361-365.
Chun, Marvin M., and Jeremy M. Wolfe, "Just say no: How are visual searches terminated when there is no target present?," Cognitive Psychology 30, (1996): 39–78.
Duchowski, Andrew T., Eye Tracking Methodology. London: Springer Verlag, 2007.
Duncan, John, and Glyn W. Humphreys, "Visual search and stimulus similarity," Psychological Review 96, (1989): 433–458.
Fehd, Hilda M., and Adriane E. Seiffert, "Eye movements during multiple object tracking: Where do participants look?," Cognition 108, (2008): 201–209.
Foster, David H., and Stephen Westland, "Fine structure in the orientation threshold function for preattentive line-target detection," Perception 22, (1992): 6.
Gofman, Alex A., Howard R. Moskowitz, Johanna Fyrbjork, David Moskowitz, and Tõnis Mets, "Extending rule developing experimentation to perception of food packages with eye tracking," The Open Food Science Journal 3, (2009): 66-78.
Grunert, Klaus G., Lone Bredahl, and Karen Brunsø, "Consumer perception of meat quality and implications for product development in the meat sector – a review," Meat Science 66, (2004): 259-272.
Hidalgo-Sotelo, Barbara, Aude Oliva, and Antonio Torralba, "Human Learning of Contextual Priors for Object Search: Where does the time go?," in Proceedings of the 3rd Workshop on Attention and Performance at the International Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, 2005.
Hoffman, James E., "Visual attention and eye movements," in Attention, edited by Harold Pashler. London: University College London Press, 1998.
Janiszewski, Chris, "The Influence of Display Characteristics on Visual Exploratory Search Behaviour," Journal of Consumer Research 25, (1998): 290–301.
Kahneman, Daniel, and Amos Tversky, "Prospect theory: An analysis of decision under risk," Econometrica 47, (1979): 263–91.
Kardes, Frank R., Steven S. Posavac, and Maria L. Cronley, "Consumer inference: a review of processes, bases, and judgment contexts," Journal of Consumer Psychology 14, (2004): 230-256.
Koehler, Derek J., and Nigel Harvey, Blackwell Handbook of Judgment and Decision Making. Blackwell Publishing, 2004.
LaBerge, David, Attentional Processing: The Brain's Art of Mindfulness. Cambridge, MA: Harvard University Press, 1995.
Lohse, Gerald L., "Consumer Eye Movement Patterns on Yellow Page Advertising," Journal of Advertising 26, (1997): 61–73.
Louviere, Jordan J., and Robert J. Meyer, "Formal choice models of informal choices: What choice modeling research can (and can't) learn from behavioural theory," in Review of Marketing Research, edited by Naresh K. Malhotra. London: M.E. Sharpe, 2008.
Maughan, Lizzie, Sergei Gutnikov, and Rob Stevens, "Like more, look more. Look more, like more: The evidence from eye-tracking," Brand Management 14, (2007): 335–342.
McPeek, Robert M., Alexander A. Skavenski, and Ken Nakayama, "Concurrent processing of saccades in visual search," Vision Research 40, (2000): 2499–2516.
Müller, Hermann J., Glyn W. Humphreys, and Nick Donnelly, "SEarch via Recursive Rejection (SERR): Visual search for single and dual form-conjunction targets," Journal of Experimental Psychology: Human Perception & Performance 20, (1994): 235–258.
Noton, David, and Lawrence Stark, "Eye Movements and Visual Perception," Scientific American 224, (1971): 34–43.
Öhman, Arne, Daniel Lundqvist, and Francisco Esteves, "The face in the crowd revisited: A threat advantage with schematic stimuli," Journal of Personality & Social Psychology 80, (2001): 381-396.
Palmer, John, "Set-size effects in visual search: The effect of attention is independent of the stimulus for simple tasks," Vision Research 34, (1994): 1703–1721.
Pieters, Rik, and Luk Warlop, "Visual attention during brand choice: The impact of time pressure and task motivation," International Journal of Research in Marketing 16, (1999): 1–16.
Posner, Michael, "Orienting of attention," Quarterly Journal of Experimental Psychology 32, (1980): 3–25.
Rayner, Keith, "Eye Movements in Reading and Information Processing: 20 Years of Research," Psychological Bulletin 124, (1998): 372–422.
Rayner, Keith, Maria L. Slowiaczek, Charles Clifton, and James H. Bertera, "Latency of sequential eye movements: Implications for reading," Journal of Experimental Psychology: Human Perception and Performance 9, (1983): 912–922.
Rizzolatti, Giacomo, Lucia Riggio, and Boris M. Sheliga, "Space and selective attention," in Attention and Performance XV: Conscious and Nonconscious Information Processing, edited by Carlo Umiltà and Morris Moscovitch. Cambridge, MA: MIT Press, 1994.
Russo, Edward J., and Barbara A. Dosher, "Strategies for multiattribute binary choice," Journal of Experimental Psychology: Learning, Memory, and Cognition 9, (1983): 676–96.
Russo, Edward J., and France Leclerc, "An eye-fixation analysis of choice processes for consumer nondurables," Journal of Consumer Research 21, (1994): 274–290.
Theeuwes, Jan, "Cross-dimensional perceptual selectivity," Perception & Psychophysics 50, (1991): 184–193.
—. "Perceptual selectivity for colour and form," Perception & Psychophysics 51, (1992): 599–606.
Treisman, Anne, "Features and objects: The fourteenth Bartlett memorial lecture," The Quarterly Journal of Experimental Psychology 40, (1988): 201–237.
Treisman, Anne, and Garry Gelade, "A feature-integration theory of attention," Cognitive Psychology 12, (1980): 97–136.
Tversky, Amos, and Eldar Shafir, "Choice under conflict: The dynamics of deferred decision," Psychological Science 3, (1992): 358–61.
van Raaij, W. Fred, "Consumer Information Processing for Different Information Structures and Formats," in Advances in Consumer Research Vol. 4, edited by William D. Perreault, Jr. Atlanta: Association for Consumer Research, 1977.
Verbeke, Wim, and Isabelle Vackier, "Profile and effects of consumer involvement in fresh meat," Meat Science 67, (2004): 159-168.
Vlaskamp, Björn N. S., and Ignace Th. C. Hooge, "Crowding degrades saccadic search performance," Vision Research 46, (2006): 417–425.
Wedel, Michel, and Rik Pieters, Visual Marketing: From Attention to Action. New York: Lawrence Erlbaum, Taylor & Francis Group, 2008.
Williams, Carrick C., John M. Henderson, and Rose T. Zacks, "Incidental visual memory for targets and distractors in visual search," Perception & Psychophysics 67, (2005): 816–827.
Wolfe, Jeremy M., "Guided Search 2.0: A revised model of guided search," Psychonomic Bulletin & Review 1, (1994): 202–238.
—. "Visual search," in Attention, edited by Harold Pashler. London: University College London Press, 1998.
Yantis, Steven, "Control of visual attention," in Attention, edited by Harold Pashler. London: University College London Press, 1998.
Yantis, Steven, and John Jonides, "Abrupt visual onsets and selective attention: Voluntary vs. automatic allocation," Journal of Experimental Psychology: Human Perception and Performance 16, (1990): 121-134.
Yantis, Steven, and John Jonides, "Abrupt visual onsets and selective attention: Evidence from visual search," Journal of Experimental Psychology: Human Perception & Performance 10, (1984): 601–621.
Zelinsky, Gregory J., "Specifying the components of attention in a visual search task," in Neurobiology of Attention, edited by Laurent Itti, Geraint Rees, and John K. Tsotsos. Elsevier, 2005.

CHAPTER FOUR

COGNITION AND CONTROL OF SACCADIC SYSTEM

ANSHUL SRIVASTAVA¹, VINAY GOYAL², SANJAY KUMAR SOOD¹ AND RATNA SHARMA¹

¹ Department of Physiology, All India Institute of Medical Sciences, New Delhi, India.
² Department of Neurology, All India Institute of Medical Sciences, New Delhi, India.

Abstract

In everyday life, we are bombarded with visual information. Our eyes move to process relevant information and discard irrelevant information, which requires the contribution of both stimulus-driven and goal-driven processing of visual information. The influence of cognitive processes on eye movements and visual cognition is a point of contention among researchers working in the field. There is no conclusive evidence as to how much these cognitive processes influence eye movements, but a great deal of research is addressing this area of visual cognition. Many studies suggest that cognitive processes such as attention, working memory and decision-making do influence our eye movements to varying degrees. In this framework, saccadic eye movements have become a very useful tool for studying cognition. Monitoring saccades in laboratory settings provides insights into the relations between eye movements and cognition.

Introduction

The human eye consists of a number of major anatomical structures that play an important role in providing accurate visual information
(Krauzlis 2008). The cornea is a transparent outer layer that allows light to enter the eye. Light is refracted by the cornea and then by the lens, which is located behind the iris. The lens is flexible and its curvature is controlled by the ciliary muscles; changes in the curvature of the lens depend on the location of the target being viewed. The iris is the coloured part of the eye and is located between the cornea and the lens. It consists of a ring of muscles that control the amount of light entering the eye through the pupil, the dark opening in the centre of the eye. Pupillary movement is an important intraocular movement, whereby the amount of light entering the eye is adjusted by contraction and relaxation of this ring of muscles, thereby changing the size of the pupil. The dilation of the pupil reflects cortical processing, especially when cognitive load is increased (Beatty 1982). Light entering the eye through the cornea and lens is received by the retina, a membrane of photoreceptors at the back of the eye, which detects objects. The highest-quality detection, however, occurs when light from objects falls on the fovea: the highest visual acuity is restricted to this small circular region of the retina, which is densely packed with cone photoreceptors. Impulses from the photoreceptors are sent to various parts of the brain, where the final object is perceived. Eye movements play an important role in bringing an object of interest onto the fovea when the object or the head is moving; they help to avoid blurred retinal images of foveated objects, producing the sense of a still image. Different types of extraocular movements are required to acquire visual information efficiently for goal-oriented behaviours. Eye movements have been classified into various types (Krauzlis 2008):

Vestibulo-ocular reflex: During head movements, the vestibulo-ocular reflex (VOR) produces an eye movement in the direction opposite to the head movement, to bring or keep an object of interest on the fovea.

Optokinetic reflex: In this reflex there is no head movement; it helps to stabilize a moving image on the retina.

Saccadic eye movements: These eye movements are fast, and the eyes move from one point to another to foveate an object.

Smooth pursuit: When the eyes follow a moving object very closely, such movements are known as smooth pursuit movements.

Vergence: When both eyes move simultaneously in opposite directions to foveate a nearby object, this is called vergence.

Another important element in the effective detection of an object is the active fixation system. When an object is stationary, this system comes
into play and helps to focus the eyes on the stationary object. Fixational eye movements occur even during steady fixation. Microsaccades are one such type of fixational eye movement; they consist of tiny random movements and are involuntary in nature (Krauzlis 2008). The role of microsaccades is not well defined, but it has been suggested that they prevent neural adaptation and the fading of visual information (Martinez-Conde 2009). Microsaccades have also been linked to cognitive processes such as attention (Martinez-Conde 2009). A great deal of research has been carried out on saccadic eye movements, due to their ease of measurement, their reliability, and their potential to provide insight into various cognitive processes such as attention, working memory, and decision-making (covered in later sections of this chapter). Saccadic eye movements are also measured in various clinical conditions involving cognitive impairment.

Saccadic eye movements

Fig. 4-1: Schematic representation of a reflexive saccade towards a peripheral target (black dot)

The Russian scientist Alfred Yarbus (1967) showed that the eyes move several times per second to scan different parts of a scene, pausing at specific parts of the scene in order to process relevant information. Among the most important types of eye movement, and one which intrigues cognitive neuroscience researchers, are saccadic eye movements. Saccadic eye movements are ballistic, rapid eye movements; they are generally classified into reflexive and voluntary saccadic eye movements. Reflexive saccades are elicited by the sudden onset of peripheral targets
(Fig. 4-1). Voluntary saccades are considered goal-oriented saccades, and they can be regarded as purposeful behaviour (Leigh and Kennard 2004). Voluntary saccades include predictive saccades, which are generated in anticipation of a target at a particular location, and memory-guided saccades, which are elicited to a location at which a target was presented earlier. Antisaccades are also voluntary saccades; they are generated in the direction opposite to a target location (Leigh and Kennard 2004).

Parameters of saccadic eye movements

Saccadic eye movements are assessed using a variety of parameters: saccadic latency, amplitude, peak velocity, average velocity, trajectory, pupil size, fixation time, and directional errors. These parameters offer insights into the cognitive modulation of saccades (Leigh and Kennard 2004). Saccadic latency reflects the decision-making time before the initiation of a saccade (Reddi et al. 2003; Anderson et al. 2010). Shorter-latency saccades are thought to occur through an oculomotor reflex, whereas longer-latency saccades are generally controlled by higher cortical regions. The amplitude of saccadic eye movements provides information about the landing of saccades with respect to the target location. It is a very important parameter, as it is modulated by cognitive load. In some clinical conditions the saccadic amplitude has been found to be impaired, which helps in understanding the underlying mechanisms of several disorders. For instance, in Parkinson's disease (PD) it has been found that patients' saccades are hypometric, which could be due to dopamine depletion in the basal ganglia circuitry (Rascol et al. 1989). Pupil size as a saccadic measure has recently generated great interest among cognitive neuroscientists: many recent studies have suggested that pupil dilation is a reliable correlate of cognitive load in saccadic eye movement tasks (e.g. Verney et al. 2001; Moresi et al. 2008). Saccadic velocity and fixation time are other important measures that are sensitive to cognitive demand in saccadic eye movement tasks. Saccadic errors in tasks with higher cognitive loads reflect the weakening of the cognitive control exerted by higher cortical regions over lower-level saccadic circuitry.
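Most of these parameters can be recovered offline from the gaze-position trace with a simple velocity-threshold rule. A minimal one-dimensional sketch (the 30°/s threshold is a common but illustrative choice, not a value from this chapter):

```python
import numpy as np

def detect_saccades(x_deg, t_s, vel_thresh=30.0):
    """Label samples as saccadic where angular velocity exceeds vel_thresh (deg/s)."""
    v = np.gradient(x_deg, t_s)                    # horizontal eye velocity trace
    moving = (np.abs(v) > vel_thresh).astype(int)
    starts = np.flatnonzero(np.diff(moving) == 1) + 1
    ends = np.flatnonzero(np.diff(moving) == -1) + 1
    saccades = []
    for s, e in zip(starts, ends):                 # ignores saccades cut off at trace edges
        saccades.append({
            "latency_s": t_s[s],                         # if t = 0 marks target onset
            "amplitude_deg": abs(x_deg[e] - x_deg[s]),
            "peak_velocity_deg_s": np.abs(v[s:e]).max(),
        })
    return saccades
```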

Neural basis of saccadic eye movements

Most of our knowledge of the neural circuitry controlling saccadic eye movements has been derived from lesion studies, neurophysiological studies and functional imaging studies. Initially, the model of the neural
basis of the saccadic system was a simple serial model. It stated that visual information from the retina arrives at the primary visual cortex and moves from there to the extrastriate cortex, where further information processing takes place (Yarbus 1967). The information is then relayed to the parietal cortex, where the relevant part of the information is attended to, and subsequently sent to the frontal eye field (FEF), where motor signals are generated and sent on to the brainstem oculomotor circuitry for the generation of eye movements. However, lesion experiments have shown that this model is inadequate. It was found that ablation of the FEF produces saccadic inaccuracy and latency deficits that are mild and transient (Schiller et al. 1980), resolving in a few days or weeks. Temporary inactivation of the frontal eye fields by muscimol during saccadic task performance can induce transient impairments (Dias et al. 1999). The notion of a serial processing of visual information leading to saccadic eye movements was thus rejected, and the idea of the involvement of other brain structures, in parallel, in such motor functions took root. In the late 1980s, the concept of a distributed network became influential and was supported by various studies. For instance, Lynch (1992) found that lesions of the frontal or of the parietal eye fields alone did not elicit lasting saccadic impairment, as patients presented normal saccadic eye movements a few days after the lesion. However, when both frontal and parietal eye fields were lesioned together, impairment was greater and recovery was prolonged. This is suggestive of a distributed saccadic network in the brain. Beyond their role in oculomotor control, a possible role of the frontal and parietal eye fields in other cognitive functions, such as attention and working memory, has also been suggested (Corbetta et al. 1998). Taken together, results from these studies suggest that cognitive processes and saccades can be linked, as there is ample evidence of overlap between the saccadic circuitry and the circuitry involved in cognitive processes (Nobre et al. 2000). The basic neural circuitry is the same for both reflexive and voluntary saccades, with the additional involvement of some cortical areas in complex saccades, such as voluntary saccadic eye movements (Anderson et al. 1994; Pierrot-Deseilligny et al. 2004; Watanabe et al. 2010). Voluntary saccades are controlled by a distributed neural network, which involves the cortico-cortical network as well as cortical-subcortical interactions (Gaymard et al. 1998; Pierrot-Deseilligny et al. 2003). The superior colliculus (SC) and other subcortical structures play an important role in the generation of saccades towards a target (Wurtz and Optican 1994; Hikosaka et al. 2000; Munoz et al. 2000). However, various other cortical areas are involved in this process,
such as the dorsolateral prefrontal cortex (DLPFC), the frontal eye field (FEF), the supplementary eye field (SEF), and the parietal cortex (PC), which project to subcortical structures involved in saccade generation (Fig. 4-2).

Fig. 4-2: Schematic representation showing saccadic eye movements controlled by bottom-up and top-down factors. DLPFC – Dorsolateral prefrontal cortex, SEF – Supplementary eye field, FEF – Frontal eye field, PC – Parietal cortex, Thal – Thalamus, CN – Caudate nucleus, SC – Superior colliculus, GP – Globus pallidus, SN – Substantia nigra

The dorsal stream of the visual pathway is instrumental in the processing of spatial features (Ungerleider and Mishkin 1982). The dorsal stream projects to the parietal cortex. The posterior parietal cortex (PPC) is an important brain area in eye movement circuitry. It is involved in saccadic guidance via the lateral intraparietal area (LIP), a division of the PPC that is connected to various saccade-generating areas such as the FEF and SC, and it plays an important role in target selection, visuospatial attention and the planning of saccadic movements (Goldberg et al. 2006). These different cortical areas are connected to each other and also receive projections from the visual cortex (Sparks and Barton 1993; McDowell et al. 2008; Munoz 2002; Munoz and Coe 2011). The interplay and interactions between these cortical and subcortical areas influence our decisions while we make saccades (Glimcher 2001; Glimcher 2003).
Saccadic eye movements mainly reflect a complex processing of "bottom-up" factors, which involve basic stimulus features like position and size, and "top-down" factors, such as goals, intentions and rewards. Visual information is processed in terms of bottom-up and top-down factors before response selection. This stage is crucial, since it is when our response with respect to the visual stimulus is decided. Saccadic planning takes place after response selection, being translated into saccadic eye movements.

Eye tracking techniques

Numerous studies have addressed the relationship between cognition and saccades using different eye tracking techniques, based on different principles. Electrooculography (EOG) is one such technique. It exploits the dipole nature of the eye: electrical differences between the front (cornea) and back (retina) of the eye are measured by EOG (Davis and Shackel 1960). Electrodes are placed on the outer canthi of the eyes, and when the eyes move, changes in this dipole potential can be measured. Another technique is the scleral search coil, which exploits current flow in an induction coil. In this method, the eye position is estimated by placing a silicone annulus, containing a coil of thin copper wire, on the eye. The voltage induced in the coil as it moves in a magnetic field allows eye position to be measured, since the coil is attached to the eye. This method provides high spatial and temporal resolution, but its major disadvantage is that it is highly intrusive (Duchowski 2007). Photooculography (POG) or videooculography (VOG) employs video and image analysis of recorded eye movements; various features, such as the shape of the pupil or the limbus position, are measured. The disadvantages of this method relate to the non-automatic (frame-by-frame) nature of its visual analysis, which is prone to errors and time consuming. The most recent method is video-based eye tracking, which is based on pupil/corneal reflection: a light source illuminates the eye and provides a reflection from the surface of the eye. One example is infrared-based eye tracking, which measures the corneal reflection relative to the centre of the pupil. Image segmentation algorithms are used to locate the pupil area and corneal reflections (Duchowski 2007).
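In pupil/corneal-reflection tracking, the gaze estimate is, at heart, a calibrated mapping of the vector from the corneal reflection to the pupil centre. A schematic sketch of that idea — the affine calibration below is a deliberate simplification; commercial systems fit richer models:

```python
import numpy as np

def calibrate(vectors, screen_points):
    """Least-squares fit of an affine map gaze = A @ v + b from calibration targets.
    vectors: (n, 2) pupil-centre minus corneal-reflection vectors;
    screen_points: (n, 2) known target positions on screen."""
    V = np.hstack([vectors, np.ones((len(vectors), 1))])  # append a bias column
    coeffs, *_ = np.linalg.lstsq(V, screen_points, rcond=None)
    return coeffs[:2].T, coeffs[2]                        # A (2x2), b (2,)

def gaze_from_pupil_cr(pupil_xy, cr_xy, A, b):
    v = np.asarray(pupil_xy) - np.asarray(cr_xy)  # vector largely tolerant to small head movements
    return A @ v + b
```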

Cognitive Processes and Saccadic Eye Movements

The relationship between cognitive processes and saccadic eye movements has been a major focus in the field of visual cognition since
the 19th century, when Hermann von Helmholtz showed that attention can be moved or shifted without eye movements (Duchowski 2007). In 1976, Wurtz and Mohler showed that there are neurons in the monkey SC that respond to a spotlight when it is used as a target for a saccade. Posner (1980) showed a functional relationship between saccadic eye movements and attention through his spatial cuing paradigm. Shifts of attention towards the saccade target location have been demonstrated to affect performance even before the eyes begin to move (Hoffmann and Subramanian 1995; Peterson et al. 2004; Harrison et al. 2012). When a sudden peripheral target is presented, a saccade is usually generated towards that target. This movement normally takes more than 200 ms after target presentation, which is quite a long time considering the basic neural circuitry of reflexive saccades, involving the SC and brainstem saccade generators. This suggests that our saccadic eye movements can be modulated by other factors, depending on cognitive demands. Reddi and Carpenter (2000) tried to explain the time taken to generate a saccade with a decision-making model for saccade generation known as LATER (Linear Approach to Threshold with Ergodic Rate). According to this model, at the sudden onset of a target a decision signal starts from a baseline level and rises at a constant rate; the saccade is initiated towards the target when this signal reaches a threshold. Changes in latencies can be seen when this baseline activity, the rise rate or the threshold is manipulated. The cognitive demands of a task can influence baseline activity, which can lead to changes in saccadic parameters, specifically latencies. It has been reported that saccades occurring at shorter latencies are purely stimulus-driven, whereas saccades that take longer to be elicited are mainly goal-driven and are not affected by stimulus salience or distractors (Theeuwes 2004).
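The LATER model just described can be stated compactly: a decision signal rises from baseline S0 towards threshold theta at rate r, so latency T = (theta − S0)/r; in the standard formulation the rate varies from trial to trial as a normal deviate. A small simulation with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(42)
S0, theta = 0.0, 1.0        # baseline and threshold (arbitrary units)
mu_r, sd_r = 7.5, 1.5       # mean and SD of the rise rate (units/s), illustrative

r = rng.normal(mu_r, sd_r, size=10_000)
r = r[r > 0]                              # keep only rising trials in this simple sketch
latency_ms = 1000.0 * (theta - S0) / r    # T = (theta - S0) / r

# Raising the baseline S0 (e.g. prior expectation) or lowering theta (urgency)
# shortens latencies, matching the manipulations described above.
print(f"median latency: {np.median(latency_ms):.0f} ms")
```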

Attention and saccades

Attention and eye movements are considered to be closely linked (Kowler et al. 1995; Liversedge and Findlay 2000; Shipp 2004; Hutton 2008; Findlay 2009; Van der Stigchel et al. 2009; Mazer 2011). Several studies have demonstrated that if attention is deployed to the location where a future target is to be presented, saccadic reaction times are faster. Enhanced processing of targets takes place when attention is shifted before the saccades are executed, as reflected in participants' better performance in object discrimination tasks. The premotor theory of attention (Rizzolatti et al. 1987) suggests that the shift of attention is just a
by-product of eye movement preparation. In contrast, the visual attention model (Deubel and Schneider 1996) suggests that saccadic programming takes place as a result of a shift of attention. Researchers have been trying to understand this relationship by manipulating stimulus presentation and cognitive demands. Fixation offset is one such manipulation; it has been widely used to probe the relation between attention and saccades. Studies suggesting a role of higher cognitive processes in eye movements have shown that saccades to a peripheral target are initiated more quickly in the absence of an object at fixation, with the introduction of a gap of around 200 ms before target appearance (Pratt et al. 2000) (Fig. 4-3). Initially, researchers considered fixation offset to be a general warning signal, but it was later attributed to attentional release (Jin and Reeves 2009) or attentional disengagement from the fixation point. This was supported by a study by Pratt and collaborators (2006), in which a complex fixation point was used and participants were asked to attend to a portion of this complex fixation stimulus. It was found that participants initiated saccades faster when the attended portion was removed, suggesting attentional modulation of the fixation offset effect. This might suggest that even reflexive saccades are influenced by cognitive processes like attention. Short-latency saccades, known as express saccades, have been reported in the fixation-offset condition. Express saccades are visually guided saccades that are reflexive in nature. They are generated experimentally by introducing a temporal gap of around 200 ms between fixation point offset and target appearance (Fischer and Weber 1993). Express saccades fall within the saccadic latency range of 90-120 ms (Hamm et al. 2010). There are various explanations of the significance of express saccades. Some suggest that they are elicited so that one can react expeditiously to the sudden appearance of visual stimuli; others suggest that they are generated only in specific conditions, such as motor training (Bibi and Edelman 2009). Modulation of express saccades was seen in a study by Edelman and colleagues (2007), in which saccades were to be made to one of two simultaneously appearing stimuli, with targets defined by their relative positions. The endpoints of the express saccades were biased in the direction of the central cue. These results probably reflect the fact that even fast saccades are influenced by higher cognitive processes and can be modulated.
Fig. 4-3: Schematic representation of reflexive saccades in the fixation offset condition, with the introduction of a gap between fixation offset and target appearance.

The role of attention and higher cognitive functions in eye movements has been widely studied using antisaccade tasks. Antisaccades are among the most studied voluntary saccades in the field of visual cognition and eye movements (Munoz and Everling 2004). In such tasks, participants fixate a central fixation point; a peripheral target appears after some time, but subjects have to refrain from looking towards the target and are instead instructed to look towards its mirror-image location. Antisaccades are slower than reflexive saccades (Olk and Kingstone 2003). Such saccadic eye movements are highly informative, as they involve the inhibition of a reflexive saccade and the ability to generate a saccade towards the mirror-image location. Studies have reported participants making errors because of a failure to inhibit or cancel the motor program generated for the reflexive saccade by higher cognitive processes. It has been argued that antisaccade errors arise from the failure of one process, the activation of the correct response, rather than of two separate processes (inhibiting the reflexive saccade and generating a saccade to the mirror-image location) (Roberts 1994); if this single step were successful, the inhibition of the reflexive saccade would be automatic. Parallel processing has also been suggested as an explanation for antisaccades, when there is competition between visually guided reflexive saccades and endogenously generated saccades to the mirror location (Findlay and Walker 1999; Walker et al. 2006).
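Scoring an antisaccade block therefore reduces to comparing the direction of the first saccade with the target side. A minimal sketch on hypothetical trial records:

```python
# Each trial: (target_side, first_saccade_direction, latency_ms) -- hypothetical data.
trials = [("left", "right", 285), ("right", "right", 190),   # second trial is an error
          ("right", "left", 301), ("left", "right", 264)]

errors = [t for t in trials if t[1] == t[0]]    # looked towards the target (prosaccade error)
correct = [t for t in trials if t[1] != t[0]]   # reached the mirror-image location

error_rate = len(errors) / len(trials)
mean_latency = sum(t[2] for t in correct) / len(correct)
print(f"error rate: {error_rate:.0%}; correct antisaccade latency: {mean_latency:.0f} ms")
```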
Other cognitive elements and saccades

In order to generate a correct voluntary saccadic eye movement, the goal of the task has to be actively stored in working memory. Healthy individuals with low working memory capacity make more saccadic eye movement errors and have slower response times than individuals with high working memory capacity (Unsworth et al. 2004). Theeuwes and collaborators (2009) have reported a tight link between working memory and saccades: they showed that if a participant maintains a location in working memory, saccades curve away from the remembered location. Other studies have also shown an impact of cognitive load on various parameters of saccadic eye movements (Stuyven et al. 2000). Working memory is a part of executive functioning, and these results on working memory and saccades are suggestive of modulation of saccades by higher cortical areas. Performing a working memory task (an n-back task) and an antisaccade task simultaneously results in a marked impairment in accuracy. The effect is mainly observable in the inhibitory component of the antisaccade task (inhibiting the saccade towards the target), which suggests that cognitive load affects our saccadic eye movements (Mitchell et al. 2002). The effects of instructions on saccades have also been investigated, as altering the verbal instructions can influence saccadic parameters. In one study (Mosimann et al. 2004), participants were asked to deliberately delay a saccade or make an inaccurate saccade, and then to redirect their saccades. The delay and inaccurate-instruction saccadic tasks showed impaired latencies and errors. In addition, the intersaccadic intervals for spontaneously correcting erroneous saccades were shorter than the intersaccadic intervals in the redirect task, where instructions were given to subjects. The results suggested that top-down control is influenced by the content of the instructions, which supports the role of top-down control in saccadic eye movements. Incentives can influence saccadic eye movements as well. Milstein and Dorris (2007) investigated the effect of expected reward value upon saccadic parameters towards a particular target: they found that reaction times were shorter for correct saccades with greater rewards, and concluded that the reward associated with a position modulates attention to that spatial location. A large body of studies has shown that cuing the target provides important information about saccadic information processing. Saccadic performance is enhanced if a target is presented at a previously cued location (Posner 1980). Studies of the inhibition of return (IOR) phenomenon have shown that participants are sometimes slower to respond to events occurring at recently attended (cued) locations if the time between cue and


Studies of the inhibition of return (IOR) phenomenon have shown that participants are sometimes slower to respond to events occurring at recently attended (cued) locations when the interval between cue and target presentation exceeds 300 ms. This has been attributed to a delay in re-allocating attention to the previously attended location, a kind of foraging facilitation (Klein 2000) whereby, for efficient exploration, revisiting already inspected locations is discouraged. Hunt and Kingstone (2003) showed that attention, target contrast and even fixation offset influence saccadic IOR.

It has long been debated whether we perceive objects by integrating salient features or perceive objects as wholes and only then analyse their features. According to the feature integration theory of attention, physical features are processed automatically first and the identification of an object occurs only at a later processing stage (Treisman and Gelade 1980). Researchers are therefore interested in the degree to which saccadic eye movements are influenced by irrelevant information in the form of distractors. Distractors can capture saccades through bottom-up processing, but these captured saccades can be modulated by top-down control of information processing (Ruz and Lupiáñez 2002). Longer latencies have been reported in the presence of distractors (Godijn and Theeuwes 2002), reflecting the capture of saccades by distractors through bottom-up processing. However, Wu and Remington (2003), using a modified capture paradigm, showed that when the attentional load of the task is increased, captured saccades are absent or greatly reduced, reflecting top-down modulation.

Saccadic endpoints also provide important information about saccadic information processing. Researchers have described a global effect whereby saccades tend to land between a target and nearby distractors (Findlay 1982, Van der Stigchel et al. 2011). This effect is stronger when the saccade is executed shortly after target presentation than when more time elapses, suggesting top-down control of saccades (McSorley et al. 2006).

Saccadic trajectories are among the behavioural measures most widely studied to unravel the relationship between saccades and cognitive processes (Van der Stigchel 2010). Saccade trajectories have been shown to be a product of both bottom-up and top-down processes: if distractors are present, saccadic trajectories deviate towards the distractor before reaching the target, but if sufficient time is available to process information before saccadic execution, trajectories deviate away from the distractors (McSorley et al. 2006).


In a study conducted by Walker and colleagues (2006), target and distractors were presented simultaneously. When the target location was unpredictable, saccades deviated towards the distractor; conversely, when participants knew the target location beforehand, saccades deviated away from it. Because target and distractor were presented simultaneously, reaching the target location (the saccadic goal) required inhibition of the saccade to the distractor, which is under top-down control.

The role of the peripheral target's location on saccadic latencies and other parameters has been studied in order to understand information processing time within the saccadic system (Frost and Pöppel 1976). Various studies have shown an increase in latency with greater target eccentricity (Dick et al. 2004); this increase with more peripheral targets is considered an effect of central processing time in the saccadic system. However, some studies have found similar latencies across target eccentricities (Darrien et al. 2001), suggesting that the angular displacement of the target does not influence saccadic latency and that saccade generation time is constant across eccentricities. It has also been reported that target eccentricity influences the latencies of reflexive saccades but not of antisaccades (voluntary saccades), which can be explained in terms of the control of saccades by higher cognitive processes (Dafoe et al. 2007). A study by Srivastava and colleagues (2012), based on a decision saccade task with three different angles of presentation (8°, 12°, 16°), showed no significant changes in latency, indicating that the spatial location of targets does not interfere with voluntary, goal-oriented saccades.
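
One common way to quantify such trajectory effects is to measure how far the saccade's path bends away from the straight line joining its start and end points. The sketch below is a minimal illustration of this measure; the sample coordinates and the distractor side are invented for the example and are not taken from the studies cited above.

```python
import numpy as np

def max_signed_deviation(samples: np.ndarray) -> float:
    """Largest signed perpendicular distance of the saccade samples
    from the straight start-to-end line. Positive values indicate
    samples lying to the right of the start->end direction."""
    start, end = samples[0], samples[-1]
    d = (end - start) / np.linalg.norm(end - start)  # unit direction vector
    rel = samples - start
    # 2-D cross product of each sample offset with the direction vector
    offsets = rel[:, 0] * d[1] - rel[:, 1] * d[0]
    return offsets[np.argmax(np.abs(offsets))]

# Hypothetical upward saccade (x, y in degrees) that bends to the right
saccade = np.array([[0.0, 0.0], [0.3, 2.0], [0.5, 4.0], [0.2, 7.0], [0.0, 10.0]])
distractor_side = +1  # assume the distractor lay to the right (+x) of the path

dev = max_signed_deviation(saccade)
direction = "towards" if np.sign(dev) == np.sign(distractor_side) else "away from"
print(f"max deviation {dev:+.2f} deg, curving {direction} the distractor")
```

Comparing the sign of the deviation with the distractor's side is what lets studies such as McSorley et al. (2006) label individual saccades as curving towards or away from a distractor as a function of latency.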

Brain Activity and Saccadic Eye Movement

Studying saccadic eye movement related potentials (SERP) and eye fixation-related potentials (EFRP) has substantially advanced our understanding of the cognitive mechanisms related to the planning and programming of saccades, as well as their execution. These ERP paradigms are useful for assessing the processes that play a role in saccadic planning and generation. For instance, SERP differ between reflexive and voluntary saccade tasks, and these potentials can change with specific mental states, for example motivation, when some incentive or reward is associated with the saccade. The potentials studied broadly comprise pre-saccadic and post-saccadic potentials (Jagla et al. 2007). A pre-saccadic negativity (between 1 and 3 seconds before the saccade) has been reported, which is considered a sort of readiness potential, and a pre-saccadic positivity has been shown just before the saccadic movement (100-150 ms), which may reflect the formulation of the saccadic motor program (Jagla et al. 2007).


The elicitation of a distinct negative potential when the fixation point disappears before target appearance in reflexive saccade tasks, mainly at frontal and central brain sites, is a strong indication of the involvement of, and close relationship between, cognitive processes and saccades (Kurtzberg and Vaughan 1982). Modulation of the visual cortex has also been shown: activity in visual cortex did not change from baseline levels in preparation for reflexive saccades, but decreased significantly for antisaccades (McDowell et al. 2008). These results indicate that activity in the visual cortex can be modulated as a function of saccadic task demands. The predictability of the target location has also been reported to influence pre-saccadic potentials (Yilmaz et al. 2006).

Functional imaging has provided a better understanding of visual stimulus processing and the saccadic response, giving deep insight into saccadic circuitry and its overlap with cognitive circuitry (Corbetta et al. 1998). The activation of similar neural networks in frontal, parietal and temporal brain areas during attentional-shift and saccadic-shift tasks in brain imaging studies led researchers to consider that cognitive circuitry and saccadic circuitry may overlap. Imaging studies have shown that the pattern of network activation differs between voluntary saccades and simple visually guided saccades. The FEF is considered important for saccade initiation, and greater FEF activity accounts for saccadic reaction time (McDowell et al. 2008). The SEF is activated more when a complex saccade has to be performed (Stuphorn et al. 2000). FEF and SEF have been found to be more active during voluntary than during reflexive saccades (Curtis and D'Esposito 2006). These two regions are also activated during the gap period after fixation point offset in reflexive saccade tasks, indicating that fixation offset engages frontal cortex regions and suggesting cognitive control of saccades. A functional subdivision into two distinct FEF regions, lateral and medial FEF, has also been reported (Simó et al. 2005): lateral FEF is suggested to be involved in the generation of reflexive saccades and medial FEF in the generation of volitional saccades (Ettinger et al. 2008, McDowell et al. 2008). The parietal eye field (PEF) is important for reflexive but not for voluntary saccade generation (McDowell et al. 2008), and functional imaging studies suggest that the SC is involved in the generation of reflexive saccades (McDowell et al. 2008). The DLPFC, which is clearly involved in higher cognitive functions such as attention, reasoning, planning and conflict monitoring, has been found to be more active during voluntary than during reflexive saccades. It also has an inhibitory, top-down role, suppressing unwanted saccades towards the peripheral target (Pierrot-Deseilligny et al. 2003).


Another brain area recruited in saccadic eye movements is the anterior cingulate cortex (ACC), which is more active in voluntary saccades (antisaccades) than in reflexive saccades. The ACC has also been shown to be involved in error monitoring and in tasks that require the inhibition of saccades (McDowell et al. 2008).
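
As a purely synthetic illustration of the saccade-locked averaging that underlies SERP measures discussed above, the sketch below epochs a simulated single-channel EEG trace around saccade onsets and averages the epochs. The sampling rate, window lengths, baseline choice and the simulated readiness-like ramp are all assumptions made for the demo, not parameters from the studies cited in this section.

```python
import numpy as np

FS = 1000  # assumed sampling rate in Hz

def saccade_locked_average(eeg: np.ndarray, saccade_onsets_s: list,
                           pre_s: float = 1.0, post_s: float = 0.2) -> np.ndarray:
    """Average single-channel EEG epochs time-locked to saccade onset.
    The pre-onset part of the average is where a pre-saccadic negativity
    or the late pre-saccadic positivity would be looked for."""
    pre, post = int(pre_s * FS), int(post_s * FS)
    epochs = []
    for t in saccade_onsets_s:
        onset = int(t * FS)
        if onset - pre < 0 or onset + post > len(eeg):
            continue  # skip saccades too close to the recording edges
        epoch = eeg[onset - pre: onset + post].copy()
        epoch -= epoch[:int(0.2 * FS)].mean()  # baseline on earliest 200 ms
        epochs.append(epoch)
    return np.mean(epochs, axis=0)

# Synthetic demo: noise plus a slow negative ramp before each saccade
rng = np.random.default_rng(0)
eeg = rng.normal(0, 5, 60 * FS)
onsets = [5.0, 12.5, 20.0, 33.0, 47.5]
for t in onsets:
    i = int(t * FS)
    eeg[i - FS:i] += np.linspace(0, -8, FS)  # 1 s readiness-like negativity

serp = saccade_locked_average(eeg, onsets)
print("mean amplitude in last 100 ms before onset (a.u.):",
      round(serp[int(0.9 * FS):int(1.0 * FS)].mean(), 2))
```

The negative value recovered in the final 100 ms before onset corresponds to the kind of pre-saccadic negativity the averaging procedure is designed to reveal.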

Cognitive Control of Saccades in Clinical Conditions

Studies using clinical samples have greatly contributed to our understanding of the role of cognitive processes in saccadic eye movements (Leigh and Kennard 2004). There are various clinical conditions in which both cognition and saccadic eye movements are impaired, and interpreting these deficits in patient populations, in comparison with healthy subjects, helps to unravel the relationship between higher cognitive processes and saccades. Eye movements can be used to study different aspects of executive function in various clinical conditions by measuring performance on specific tasks, including saccadic planning and generation, saccadic inhibition, accuracy and the ability to remember object locations.

One such clinical condition is Parkinson's disease (PD). In addition to motor impairments, PD causes well-documented cognitive deficits in some patients (Bassett 2005, Cameron et al. 2010). Cognitive modulation of saccades has been shown in these patients, reflected in changes in latency and saccadic amplitude. PD patients have also been found to have impaired voluntary saccades (Perneczky et al. 2011) but not reflexive saccades (Amador et al. 2006, Van Stockum et al. 2012), and facilitation of reflexive saccades has been reported recently (Van Stockum et al. 2011). Neurocognitive impairments in patients with schizophrenia have also been investigated using saccadic eye movements: visually guided saccades have been found to be normal, but voluntary saccades (antisaccades) are impaired in schizophrenia (Broerse et al. 2001). It has been suggested that the cognitive deficits of schizophrenia can impair voluntary saccadic movements, again showing how closely saccades and cognitive processes are linked. Other clinical conditions in which executive impairments are reported, such as autism and attention deficit hyperactivity disorder (ADHD), have been investigated for saccadic eye movement abnormalities (Gooding and Basso 2008), and saccadic deficits have been reported in many anxiety disorders. Socially anxious subjects were found to be impaired in inhibiting reflexive saccades to both neutral and emotional expressions; this impairment of attentional control in subjects with social anxiety has been suggested as a reason for such saccadic deficits (Wieser et al. 2009).


A study of patients with generalized anxiety disorder showed that they not only allocated more attention to emotional faces, but also oriented their attention faster towards threatening than towards neutral faces. These results reflect an attentional bias to emotional stimuli in terms of both capture and allocation (Mogg et al. 2000). Furthermore, phobic individuals tend to show a delayed attentional shift away from the feared stimulus (the distractor), resulting in slower eye movements to neutral targets; it has been suggested that threat captures attention, which is reflected in the response times of eye movements (Miltner et al. 2004). Together, these studies show how a battery of saccadic eye movement tasks can be used to probe the functional state of neural circuitry in various clinical disorders.

Conclusion

Eye movements are vitally important: among the vast array of visual stimuli, they determine what we see, attend to, and retain in the brain for further processing in order to survive. Monitoring saccadic eye movements has proven to be a valuable approach to understanding how cognition guides our actions, and there is a long history of efforts to understand the relationship between cognition and saccades (Hutton 2008). These studies have provided an important window onto the cognitive control of saccadic eye movements. Several important conclusions have been derived from studies combining different techniques, such as eye tracking, EEG and neuroimaging, one important aim of which was to understand the degree to which cognitive processes are linked to eye movements. Cognitive processes such as attention and working memory have been shown to modulate even the fastest reflexive saccades. Voluntary, goal-oriented saccades such as antisaccades were found to be slower than reflexive saccades, as they involve an additional step of inhibition on top of saccadic execution, and such saccades appear to be modulated by higher cognitive processes. The overlap between saccadic and cognitive neural circuitry has established a closer link between cognitive processes and saccades than previously thought. Studies of saccadic eye movements in patients with clinical disorders have further increased our understanding of this link by correlating the pathology of the disorder with saccadic performance. Future research in this area of visual cognition will surely provide more specific information on the functional relationship between cognitive processes and saccadic movements.


Bibliography

Amador, Silvia C., Ashley J. Hood, Mya C. Schiess, Robert Izor and Anne B. Sereno. "Dissociating cognitive deficits involved in voluntary eye movement dysfunctions in Parkinson's disease patients." Neuropsychologia 44, no. 8 (2006): 1475-82.
Anderson, Andrew J. and Roger H.S. Carpenter. "Saccadic latency in deterministic environments: Getting back on track after the unexpected happens." Journal of Vision 10, no. 14 (2010): 1-10.
Anderson, Tim J., I.H. Jenkins, David J. Brooks, M.B. Hawken, Richard S.J. Frackowiak and Christopher Kennard. "Cortical control of saccades and fixation in man. A PET study." Brain 117, Pt 5 (1994): 1073-84.
Bassett, Susan S. "Cognitive Impairment in Parkinson's Disease." Primary Psychiatry 12, no. 7 (2005): 50-55.
Beatty, Jackson. "Task-Evoked Pupillary Responses, Processing Load, and the Structure of Processing Resources." Psychological Bulletin 91, no. 2 (1982): 276-92.
Bibi, Raquel and Jay A. Edelman. "The Influence of Motor Training on Human Express Saccade Production." Journal of Neurophysiology 102, no. 6 (2009): 3101-10.
Broerse, Annelies, Trevor J. Crawford and Johan A. den Boer. "Parsing cognition in schizophrenia using saccadic eye movements: a selective overview." Neuropsychologia 39, no. 7 (2001): 742-56.
Cameron, Ian G., Masayuki Watanabe, Giovanna Pari and Douglas P. Munoz. "Executive impairment in Parkinson's disease: response automaticity and task switching." Neuropsychologia 48, no. 7 (2010): 1948-57.
Corbetta, Maurizio, Erbil Akbudak, Thomas E. Conturo, Abraham Z. Snyder, John M. Ollinger, Heather A. Drury, Martin R. Linenweber, Steven E. Petersen, Marcus E. Raichle, David C. Van Essen and Gordon L. Shulman. "A Common Network of Functional Areas for Attention and Eye Movements." Neuron 21, no. 4 (1998): 761-73.
Curtis, Clayton E. and Mark D'Esposito. "Selection and Maintenance of Saccade Goals in the Human Frontal Eye Fields." Journal of Neurophysiology 95, no. 6 (2006): 3923-7.
Dafoe, Joan M., Irene T. Armstrong and Douglas P. Munoz. "The influence of stimulus direction and eccentricity on pro- and antisaccades in humans." Experimental Brain Research 179, no. 4 (2007): 563-70.


Darrien, Jennifer H., Katrina Herd, Lisa-Jo Starling, Jay R. Rosenberg and James D. Morrison. "An analysis of the dependence of saccadic latency on target position and target characteristics in human subjects." BMC Neuroscience 2 (2001): 13.
Davis, Jean R. and Boris Shackel. "Changes in the electro-oculogram potential level." British Journal of Ophthalmology 44 (1960): 606-18.
Deubel, Heiner and Werner X. Schneider. "Saccade Target Selection and Object Recognition: Evidence for a Common Attentional Mechanism." Vision Research 36, no. 12 (1996): 1827-37.
Dias, Elisa C. and Mark A. Segraves. "Muscimol-Induced Inactivation of Monkey Frontal Eye Field: Effects on Visually and Memory-Guided Saccades." Journal of Neurophysiology 81, no. 5 (1999): 2191-214.
Dick, Sandra, Florian Ostendorf, Antje Kraft and Christoph J. Ploner. "Saccades to spatially extended targets: the role of eccentricity." Neuroreport 15, no. 3 (2004): 453-6.
Duchowski, Andrew T. Eye Tracking Methodology: Theory and Practice. Second edition. London: Springer, 2007.
Edelman, Jay A., Árni Kristjánsson and Ken Nakayama. "The influence of object-relative visuomotor set on express saccades." Journal of Vision 7, no. 6 (2007): 1-13.
Ettinger, Ulrich, Dominic H. ffytche, Veena Kumari, Norbert Kathmann, Benedikt Reuter, Fernando Zelaya and Steven C. R. Williams. "Decomposing the Neural Correlates of Antisaccade Eye Movements Using Event-Related fMRI." Cerebral Cortex 18, no. 5 (2008): 1148-59.
Findlay, John M. "Global visual processing for saccadic eye movements." Vision Research 22, no. 8 (1982): 1033-45.
Findlay, John M. "Saccadic eye movement programming: sensory and attentional factors." Psychological Research 73, no. 2 (2009): 127-35.
Findlay, John M. and Robin Walker. "A model of saccade generation based on parallel processing and competitive inhibition." Behavioral and Brain Sciences 22, no. 4 (1999): 661-721.
Fischer, B. and Heike Weber. "Express saccades and visual attention." Behavioral and Brain Sciences 16, no. 3 (1993): 553-67.
Frost, Douglas and Ernst Pöppel. "Different Programming Modes of Human Saccadic Eye Movements as a Function of Stimulus Eccentricity: Indications of a Functional Subdivision of the Visual Field." Biological Cybernetics 23, no. 1 (1976): 39-48.


Gaymard, Bertrand, Christoph J. Ploner, Sophie Rivaud-Péchoux, A.I. Vermersch and Charles Pierrot-Deseilligny. "Cortical control of saccades." Experimental Brain Research 123, no. 1-2 (1998): 159-63.
Glimcher, Paul W. "Making choices: the neurophysiology of visual-saccadic decision making." Trends in Neurosciences 24, no. 11 (2001): 654-9.
Glimcher, Paul W. "The Neurobiology of Visual-Saccadic Decision Making." Annual Review of Neuroscience 26 (2003): 133-79.
Godijn, Richard and Jan Theeuwes. "Programming of Endogenous and Exogenous Saccades: Evidence for a Competitive Integration Model." Journal of Experimental Psychology 28, no. 5 (2002): 1039-54.
Goldberg, Michael E., James W. Bisley, Keith D. Powell and Jacqueline Gottlieb. "Saccades, salience and attention: the role of the lateral intraparietal area in visual behaviour." Progress in Brain Research 155, Part B (2006): 157-75.
Gooding, Diane C. and Michele A. Basso. "The tell-tale tasks: A review of saccadic research in psychiatric patient populations." Brain and Cognition 68, no. 3 (2008): 371-90.
Hamm, Jordan P., Kara A. Dyckman, Lauren E. Ethridge, Jennifer E. McDowell and Brett A. Clementz. "Preparatory Activations across a Distributed Cortical Network Determine Production of Express Saccades in Humans." The Journal of Neuroscience 30, no. 21 (2010): 7350-7.
Harrison, William J., Jason B. Mattingley and Roger W. Remington. "Pre-Saccadic Shifts of Visual Attention." PLoS One 7, no. 9 (2012): e45670.
Hikosaka, Okihide, Yoriko Takikawa and Reiko Kawagoe. "Role of the Basal Ganglia in the Control of Purposive Saccadic Eye Movements." Physiological Reviews 80, no. 3 (2000): 953-78.
Hoffman, James E. and Baskaran Subramaniam. "The role of visual attention in saccadic eye movements." Perception & Psychophysics 57, no. 6 (1995): 787-95.
Hunt, Amelia R. and Alan Kingstone. "Inhibition of return: dissociating attentional and oculomotor components." Journal of Experimental Psychology: Human Perception & Performance 29, no. 5 (2003): 1068-74.
Hutton, Samuel B. "Cognitive control of saccadic eye movements." Brain and Cognition 68, no. 3 (2008): 327-40.


Jagla, Fedor, Mariana Jergelová and Igor Riečanský. "Saccadic Eye Movement Related Potentials." Physiological Research 56, no. 6 (2007): 707-13.
Jin, Zhenlan and Adam Reeves. "Attentional release in the saccadic gap effect." Vision Research 49, no. 16 (2009): 2045-55.
Klein, Raymond M. "Inhibition of return." Trends in Cognitive Sciences 4, no. 4 (2000): 138-47.
Kowler, Eileen, Eric Anderson, Barbara Dosher and Erik Blaser. "The Role of Attention in the Programming of Saccades." Vision Research 35, no. 13 (1995): 1897-916.
Krauzlis, Richard J. "Eye movements." In Fundamental Neuroscience, 3rd edition, edited by Larry Squire, Darwin Berg, Floyd E. Bloom, Sascha du Lac, Anirvan Ghosh and Nicholas C. Spitzer, 775-92. Academic Press: Elsevier, 2008.
Kurtzberg, Diane and Herbert G. Vaughan Jr. "Topographic analysis of human cortical potentials preceding self-initiated and visually triggered saccades." Brain Research 243, no. 1 (1982): 1-9.
Leigh, R. John and Christopher Kennard. "Using saccades as a research tool in the clinical neurosciences." Brain 127, Pt 3 (2004): 460-77.
Liversedge, Simon P. and John M. Findlay. "Saccadic eye movements and cognition." Trends in Cognitive Sciences 4, no. 1 (2000): 6-14.
Lynch, James C. "Saccade initiation and latency deficits after combined lesions of the frontal and posterior eye fields in monkeys." Journal of Neurophysiology 68, no. 5 (1992): 1913-6.
Martinez-Conde, Susana, Stephen L. Macknik, Xoana G. Troncoso and David H. Hubel. "Microsaccades: a neurophysiological analysis." Trends in Neurosciences 32, no. 9 (2009): 463-75.
Mazer, James A. "Spatial Attention, Feature-Based Attention, and Saccades: Three Sides of One Coin?" Biological Psychiatry 69, no. 12 (2011): 1147-52.
McDowell, Jennifer E., Kara A. Dyckman, Benjamin P. Austin and Brett A. Clementz. "Neurophysiology and Neuroanatomy of Reflexive and Volitional Saccades: Evidence from Studies of Humans." Brain and Cognition 68, no. 3 (2008): 255-70.
McSorley, Eugene, Patrick Haggard and Robin Walker. "Time Course of Oculomotor Inhibition Revealed by Saccade Trajectory Modulation." Journal of Neurophysiology 96, no. 3 (2006): 1420-4.
Milstein, David M. and Michael C. Dorris. "The influence of expected value on saccadic preparation." The Journal of Neuroscience 27, no. 18 (2007): 4810-8.


Miltner, Wolfgang H.R., Silke Krieschel, Holger Hecht, Ralf Trippe and Thomas Weiss. "Eye movements and behavioural responses to threatening and nonthreatening stimuli during visual search in phobic and nonphobic subjects." Emotion 4, no. 4 (2004): 323-39.
Mitchell, Jason P., C. Neil Macrae and Iain D. Gilchrist. "Working memory and the suppression of reflexive saccades." Journal of Cognitive Neuroscience 14, no. 1 (2002): 95-103.
Mogg, Karin, Neil Millar and Brendan P. Bradley. "Biases in eye movements to threatening facial expressions in generalized anxiety disorder and depressive disorder." Journal of Abnormal Psychology 109, no. 4 (2000): 695-704.
Moresi, Sofie, Jos J. Adam, Jons Rijcken, Pascal W. M. Van Gerven, Harm Kuipers and Jelle Jolles. "Pupil dilation in response preparation." International Journal of Psychophysiology 67, no. 2 (2008): 124-30.
Mosimann, Urs P., Jacques Felblinger, Sean J. Colloby and René M. Müri. "Verbal instructions and top-down saccade control." Experimental Brain Research 159, no. 2 (2004): 263-7.
Munoz, Douglas P. "Commentary: Saccadic eye movements: overview of neural circuitry." Progress in Brain Research 140 (2002): 89-96.
Munoz, Douglas P. and Brian C. Coe. "Saccade, search and orient – the neural control of saccadic eye movements." European Journal of Neuroscience 33, no. 11 (2011): 1945-7.
Munoz, Douglas P., Michael C. Dorris, Martin Paré and Stefan Everling. "On your mark, get set: Brainstem circuitry underlying saccadic initiation." Canadian Journal of Physiology and Pharmacology 78, no. 11 (2000): 934-44.
Munoz, Douglas P. and Stefan Everling. "Look away: the anti-saccade task and the voluntary control of eye movement." Nature Reviews Neuroscience 5, no. 3 (2004): 218-28.
Nobre, Anna C., Darren R. Gitelman, Elisa C. Dias and M. Marsel Mesulam. "Covert Visual Spatial Orienting and Saccades: Overlapping Neural Systems." NeuroImage 11, no. 3 (2000): 210-6.
Olk, Bettina and Alan Kingstone. "Why are antisaccades slower than prosaccades? A novel finding using a new paradigm." Neuroreport 14, no. 1 (2003): 151-5.
Perneczky, Robert, Boyd C. P. Ghosh, Laura Hughes, Roger H.S. Carpenter, Roger A. Barker and James B. Rowe. "Saccadic latency in Parkinson's disease correlates with executive function and brain atrophy, but not motor severity." Neurobiology of Disease 43, no. 1 (2011): 79-85.


Peterson, Matthew S., Arthur F. Kramer and David E. Irwin. "Covert shifts of attention precede involuntary eye movements." Perception & Psychophysics 66, no. 3 (2004): 398-405.
Pierrot-Deseilligny, Charles, Dan Milea and René M. Müri. "Eye movement control by the cerebral cortex." Current Opinion in Neurology 17, no. 1 (2004): 17-25.
Pierrot-Deseilligny, Charles, René M. Müri, Christoph J. Ploner, Bertrand Gaymard, Sophie Demeret and Sophie Rivaud-Péchoux. "Decisional role of the dorsolateral prefrontal cortex in ocular motor behaviour." Brain 126, no. 6 (2003): 1460-73.
Pierrot-Deseilligny, Charles, René M. Müri, Christoph J. Ploner, Bertrand Gaymard and Sophie Rivaud-Péchoux. "Cortical control of ocular saccades in humans: a model for motricity." Progress in Brain Research 142 (2003): 3-17.
Posner, Michael I. "Orienting of attention." Quarterly Journal of Experimental Psychology 32, no. 1 (1980): 3-25.
Pratt, Jay, Harold Bekkering and Mark Leung. "Estimating the components of the gap effect." Experimental Brain Research 130, no. 2 (2000): 258-63.
Pratt, Jay, Clara M. Lajonchere and Richard A. Abrams. "Attentional modulation of the gap effect." Vision Research 46, no. 16 (2006): 2602-7.
Rascol, Olivier, Michel Clanet, Jean-Louis Montastruc, M. Simonetta, M.J. Soulier-Esteve, Bernard Doyon and André Rascol. "Abnormal ocular movements in Parkinson's disease: Evidence for involvement of dopaminergic systems." Brain 112, Pt 5 (1989): 1193-214.
Reddi, B.A.J. and Roger H.S. Carpenter. "The influence of urgency on decision time." Nature Neuroscience 3 (2000): 827-30.
Reddi, B.A.J., Kaleab N. Asrress and Roger H.S. Carpenter. "Accuracy, Information, and Response Time in a Saccadic Decision Task." Journal of Neurophysiology 90, no. 5 (2003): 3538-46.
Rizzolatti, Giacomo, Lucia Riggio, Isabella Dascola and Carlo Umiltà. "Reorienting attention across the horizontal and vertical meridians: evidence in favor of a premotor theory of attention." Neuropsychologia 25, no. 1A (1987): 31-40.
Roberts, Ralph J., Lisa D. Hager and Christine Heron. "Prefrontal cognitive processes: Working memory and inhibition in the antisaccade task." Journal of Experimental Psychology 123, no. 4 (1994): 374-93.


Ruz, María and Juan Lupiáñez. "A review of attentional capture: On its automaticity and sensitivity to endogenous control." Psicológica 23, no. 2 (2002): 283-309.
Schiller, Peter H., Sean D. True and Janet L. Conway. "Deficits in eye movements following frontal eye-field and superior colliculus ablations." Journal of Neurophysiology 44, no. 6 (1980): 1175-89.
Shipp, Stewart. "The brain circuitry of attention." Trends in Cognitive Sciences 8, no. 5 (2004): 223-30.
Simó, Lucia S., Christine M. Krisky and John A. Sweeney. "Functional neuroanatomy of anticipatory behavior: dissociation between sensory-driven and memory-driven systems." Cerebral Cortex 15, no. 12 (2005): 1982-91.
Sparks, David L. and Ellen J. Barton. "Neural control of saccadic eye movements." Current Opinion in Neurobiology 3, no. 6 (1993): 966-72.
Srivastava, Anshul, Sanjay Kumar Sood, Vivekananth S. and Ashlesh Patil. "Spatial effects of targets on decision saccadic eye movements." Paper presented at the 3rd International Conference on Eye Tracking, Visual Cognition and Emotion, Lisbon, Portugal, 2012.
Stuphorn, Veit, Tracy L. Taylor and Jeffrey D. Schall. "Performance monitoring by the supplementary eye field." Nature 408 (2000): 857-60.
Stuyven, Els, Koen Van der Goten, André Vandierendonck, Kristl Claeys and Luc Crevits. "The effect of cognitive load on saccadic eye movements." Acta Psychologica (Amst) 104, no. 1 (2000): 69-85.
Theeuwes, Jan. "Top-down search strategies cannot override attentional capture." Psychonomic Bulletin & Review 11, no. 1 (2004): 65-70.
Theeuwes, Jan, Artem Belopolsky and Christian N.L. Olivers. "Interactions between working memory, attention and eye movements." Acta Psychologica (Amst) 132, no. 2 (2009): 106-14.
Treisman, Anne M. and Garry Gelade. "A Feature-Integration Theory of Attention." Cognitive Psychology 12, no. 1 (1980): 97-136.
Ungerleider, Leslie G. and Mortimer Mishkin. "Object Vision and Spatial Vision: Two Cortical Pathways." In Analysis of Visual Behaviour, edited by D.J. Ingle, M.A. Goodale and R.J.W. Mansfield, 296-302. Cambridge, MA: MIT Press, 1982.
Unsworth, Nash, Josef C. Schrock and Randall W. Engle. "Working Memory Capacity and the Antisaccade Task: Individual Differences in Voluntary Saccade Control." Journal of Experimental Psychology: Learning, Memory, and Cognition 30, no. 6 (2004): 1302-21.


Van der Stigchel, Stefan, Artem V. Belopolsky, Judith C. Peters, Jasper G. Wijnen, Martijn Meeter and Jan Theeuwes. "The limits of top-down control of visual attention." Acta Psychologica (Amst) 132, no. 3 (2009): 201-12.
Van der Stigchel, Stefan. "Recent advances in the study of saccade trajectory deviations." Vision Research 50, no. 17 (2010): 1619-27.
Van der Stigchel, Stefan, Jelmer P. de Vries, R. Bethlehem and Jan Theeuwes. "A global effect of capture saccades." Experimental Brain Research 210, no. 1 (2011): 57-65.
Van Stockum, Saskia, Michael R. MacAskill and Tim J. Anderson. "Impairment of voluntary saccades and facilitation of reflexive saccades do not co-occur in Parkinson's disease." Journal of Clinical Neuroscience 19, no. 8 (2012): 1119-24.
Van Stockum, Saskia, Michael R. MacAskill, Daniel Myall and Tim J. Anderson. "A perceptual discrimination task abnormally facilitates reflexive saccades in Parkinson's disease." European Journal of Neuroscience 33, no. 11 (2011): 2091-100.
Verney, Steven P., Eric Granholm and Daphne P. Dionisio. "Pupillary responses and processing resources on the visual backward masking task." Psychophysiology 38, no. 1 (2001): 76-83.
Walker, Robin, Eugene McSorley and Patrick Haggard. "The control of saccade trajectories: Direction of curvature depends on prior knowledge of target location and saccade latency." Perception & Psychophysics 68, no. 1 (2006): 129-38.
Walker, Robin and Eugene McSorley. "The parallel programming of voluntary and reflexive saccades." Vision Research 46, no. 13 (2006): 2082-93.
Watanabe, Masayuki, Masahiro Hirai, Robert A. Marino and Ian G. Cameron. "Occipital-Parietal Network Prepares Reflexive Saccades." The Journal of Neuroscience 30, no. 42 (2010): 13917-8.
Wieser, Mathias J., Paul Pauli and Andreas Mühlberger. "Probing the attentional control theory in social anxiety: an emotional saccade task." Cognitive, Affective, & Behavioral Neuroscience 9, no. 3 (2009): 314-22.
Wu, Shu-Chieh and Roger W. Remington. "Characteristics of Covert and Overt Visual Orienting: Evidence from Attentional and Oculomotor Capture." Journal of Experimental Psychology: Human Perception and Performance 29, no. 5 (2003): 1050-67.
Wurtz, Robert H. and Lance M. Optican. "Superior colliculus cell types and models of saccade generation." Current Opinion in Neurobiology 4, no. 6 (1994): 857-61.


Wurtz, Robert H. and Charles W. Mohler. "Organization of monkey superior colliculus: enhanced visual response of superficial layer cells." Journal of Neurophysiology 39, no. 4 (1976): 745-65.
Yarbus, Alfred L. Eye Movements and Vision. New York: Plenum Press, 1967.
Yilmaz, Alpaslan, Cem Süer and Çiğdem Özesmi. "Dependence of presaccadic cortical potentials on the predictability of stimulus." Erciyes Medical Journal 28, no. 3 (2006): 138-44.

CHAPTER FIVE

GAZE FIXATION PATTERNS IN A ROUTE WITH OBSTACLES: COMPARISON BETWEEN YOUNG AND ELDERLY

INÊS P. SANTOS1,2 AND LEONOR MONIZ-PEREIRA1

Abstract

A comparison was made between the visual strategies of young adults (18-24 years) and elderly people (65-76 years) before and during a task consisting of walking through an obstacle course. Participants were instructed to start and finish at predefined locations, and twelve pylons were used as obstacles to be avoided during the walk-through. Participants' eye movements were monitored using the Mobile Eye model 1.35®. Results showed that, before starting, elderly participants spent more time looking at the initial section of the course and also fixated that section more often than young adults. During the walk-through, the elderly spent more time looking at the final section and at the goal position than did the young adults. These results suggest age-related changes in patterns of motor-related visual processing: young adults tend to show more effective eye movements, while the elderly need more time to process the same visual information.

1 Technical University of Lisbon, Faculty of Human Kinetics, Interdisciplinary Centre for the Study of Human Performance, Lisbon, Portugal.
2 [email protected].


Introduction

Vision is the sense that provides precise information about one's surroundings, including people and objects. This information is obtained through eye movements, which are essential for guiding people safely around their environment (Hollands, Patla and Vickers 2002, 221, Rajashekar, Cormack and Bovik 2004). A common, everyday activity for humans is to walk through an environment while avoiding collisions with obstacles. Collision-free locomotion can be achieved, for example, by stepping over an obstacle situated in the pathway or by going around it. To ensure safe and efficient locomotion, such obstacle crossing or avoidance strategies require visual guidance both before and during the execution of the maneuver (Jansen, Toet and Werkhoven 2010, 55). In order to fully understand how vision is used to guide locomotion, it is necessary to know what visual features the individual is looking at, where the individual is located in relation to these features, and how the individual brings the retinal field to the target of interest (Hollands, Patla and Vickers 2002, 221-22). According to previous research, gaze location and fixation duration are indicators of the quality of visual search strategies during task performance (Patla and Vickers 2003).

One study (Deshpande and Patla 2007) suggested that the elderly rely increasingly on visual input rather than on other sensory information, such as proprioception or vestibular input. However, older age is associated with changes in and deterioration of the visual system (e.g., visual acuity declines to about 40% of its value in youth (Brabyn et al. 2001, 266)), contributing to changes in gaze behavior (Zietz and Hollands 2009, 357).

Previous studies measuring eye movements during locomotion or motor tasks have more commonly involved young adults. Researchers have studied how vision guides locomotion in experimental situations such as walking and stepping on one or multiple targets (Igari, Shimizu and Fukuda 2008, Patla and Vickers 2003, Chapman and Hollands 2007, Di Fabio, Greany and Zampieri 2003b), changing the direction of locomotion (Land, Mennie and Rusted 1999), stepping over an obstacle (Patla and Vickers 1997), walking on terrain with different textures (Marigold 2008) and walking in natural (woods) vs. man-made (sidewalks) environments (Pelz and Rothkopf 2007). The general conclusion from these studies is that the nature of the task influences fixation behavior.


People need to know where to look and where to find relevant information in the scene so that they can complete tasks successfully (Hollands and Marple-Horvat 2001, 215, Patla et al. 2007, 688). In particular, these studies found that visual information is used in an anticipatory manner to plan and execute movement: when there are specific targets to be stepped on, participants make local fixations about two steps ahead (Patla and Vickers 2003, 137, Marigold 2008, 148), with some variability in the location that is fixated, from one up to several steps in advance, depending on the constraints of the task (Chapman and Hollands 2007, 65, Di Fabio, Greany and Zampieri 2003b, 394).

On the other hand, fewer studies have reported the visual behavior of elderly people during locomotion or motor tasks. In tasks requiring participants to step on one or multiple targets (Chapman and Hollands 2007, Di Fabio, Zampieri and Greany 2003a) or to step over one or multiple obstacles (Di Fabio, Greany and Zampieri 2003b), differences were found between young adults and the elderly. Older adults looked earlier at targets to be stepped on and fixated those targets longer than young people, suggesting that older people need more time to plan their movements (Chapman and Hollands 2007, 65, Di Fabio, Greany and Zampieri 2003b, 394). In another study by the same team, age-related differences in the coordination between eye movements and foot placement were also found (Di Fabio, Zampieri and Greany 2003a, 182). Based on those results, the authors suggested that older adults require more time to process the visual information of targets/obstacles and to program an appropriate motor response. However, Igari, Shimizu and Fukuda (2008), in a study based on a simulated bicycle ride, found no differences between young adults and the elderly in the duration of their fixations on obstacles. Thus, although the literature gives some clues about the elderly's visual strategies, it does not fully clarify the differences in visual behavior between elderly and younger participants.

There is currently no published study describing where and when elderly people look during a locomotion task that requires an obstacle avoidance strategy such as going around the obstacle. Since the nature of the task influences fixation behavior, there is a clear need to know where and when individuals look as they perform this kind of locomotion task and whether there are age-related changes in this behavior. It is still unknown whether individuals' gaze behavior while deviating around multiple obstacles shows changes similar to those observed during tasks that require stepping on or over targets. The aims of this study were to quantitatively describe where and when individuals look while walking through a cluttered environment towards a goal position that was visible from the starting point, and to determine whether there are any age-related differences in these measures.


We hypothesized that:
1) the elderly participants would take more time to start the task than the young participants;
2) before starting to walk, both groups would spend more time fixating the goal than the other target-areas;
3) before starting to walk, the elderly would spend more time fixating the goal than the young;
4) before starting to walk, both groups would make more fixations on the goal area than on the other target-areas;
5) before starting to walk, the elderly would make more fixations on the goal than the young;
6) the elderly would take more time to perform the task than the young participants;
7) during walking, both groups would spend more time fixating the path of the middle and final sections than the other target-areas;
8) during walking, the elderly would fixate the path of the middle and final sections for longer than the young;
9) during walking, both groups would make the greatest number of fixations on the path of the middle and final sections; and
10) during walking, the elderly would make more fixations on the path of the middle and final sections than the young.

Methods and Materials

Participants

Twenty-six healthy persons with normal visual acuity participated voluntarily in the study: 14 young adults ranging from 18 to 24 yrs old (Mage = 20.6 yrs, SDage = 1.6) and 12 older adults ranging from 65 to 76 yrs old (Mage = 71 yrs, SDage = 3.2).

Test/Instruments

To assess the participants' visual acuity we used the MonCV3® Acuity Test. To monitor the participants' gaze we used an Applied Science Laboratories (ASL) Mobile Eye 1.35®. This eye tracker is designed specifically for applications requiring lightweight, completely untethered eye gaze tracking. The Mobile Eye records data at 60 Hz by interleaving images taken from two cameras: the eye camera records the eye being tracked, while the scene camera records the environment being observed by the user. Both image streams are recorded on the same digital videotape medium by alternating frames; the actual functional sampling rate of point-of-gaze data is therefore 30 Hz. A PC running ASL EyeVision® software and Captiv L-2100® software was used for data processing (Operation Manual MobileEye 2008).
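
As a toy sketch of the interleaved recording scheme just described (the frame timing below is assumed for illustration, not taken from the ASL manual), pairing alternate eye and scene frames from the 60 Hz tape yields gaze samples at an effective 30 Hz:

```python
# A 60 Hz tape alternates eye-camera and scene-camera frames, so each
# eye/scene pair yields one point-of-gaze sample at an effective 30 Hz.
TAPE_HZ = 60
frames = [("eye" if i % 2 == 0 else "scene", i / TAPE_HZ) for i in range(12)]

samples = list(zip(frames[0::2], frames[1::2]))  # pair each eye frame with the next scene frame
duration = len(frames) / TAPE_HZ
print(f"{len(frames)} tape frames in {duration:.2f} s -> {len(samples)} gaze samples "
      f"= {len(samples) / duration:.0f} Hz effective")
```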


Experimental protocol

The experimental task took place in a room in which a 3.15 m x 4.55 m grid, divided into 9 x 13 square cells (0.35 m x 0.35 m), was drawn on the floor, with predefined starting and finishing points. Twelve pylons were used as obstacles that had to be avoided while walking. The 12 pylon locations on the grid cells were randomly generated under the following conditions: (1) all eight cells adjacent to a pylon had to be empty; (2) pylons were not placed directly in front of the exit point; and (3) there was just one entrance and one exit point. A schematic diagram of the obstacle course is shown in Figure 5-1, and a sketch of one way to generate such an arrangement follows the figure caption below.

Participants were led to the starting point with their eyes closed, so that they had no prior knowledge of the pylon arrangement. They were instructed to walk at their natural, self-selected pace from the starting position to the end point, avoiding the pylons without touching them and taking as much time as they needed. They were also instructed to keep their eyes closed until they heard the starting signal.

Fig. 5-1: Floor plan of the obstacle course with the 12 randomly arranged pylons
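
The chapter does not report the exact algorithm used to randomize the pylon positions, but a minimal sketch satisfying the three conditions above might look as follows. The entrance and exit cells, and the reading of "directly in front of the exit" as the cells adjacent to it, are assumptions made for illustration:

```python
import random

ROWS, COLS = 13, 9                 # 13 x 9 cells of 0.35 m x 0.35 m
N_PYLONS = 12
ENTRANCE, EXIT = (0, 4), (12, 4)   # hypothetical entrance and exit cells

def neighbours(cell):
    r, c = cell
    return {(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)}

def generate_course(rng: random.Random) -> set:
    """Place pylons so that (1) the eight cells around each pylon are empty
    of other pylons, (2) no pylon sits directly in front of the exit, and
    (3) the single entrance and exit cells themselves stay free."""
    blocked = {ENTRANCE, EXIT} | neighbours(EXIT)  # keeps the exit approach clear
    while True:  # retry until a full arrangement fits
        pylons = set()
        cells = [(r, c) for r in range(ROWS) for c in range(COLS)]
        rng.shuffle(cells)
        for cell in cells:
            if cell not in blocked and not (neighbours(cell) & pylons):
                pylons.add(cell)
                if len(pylons) == N_PYLONS:
                    return pylons

course = generate_course(random.Random(42))
print(sorted(course))
```

A greedy pass over a shuffled cell list is enough here because the grid can hold far more than 12 mutually non-adjacent pylons, so a retry is rarely needed.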

Data Analysis and Results

We began by analyzing the location of gaze fixations with the ASL software. The number and duration of gaze fixations were analyzed separately according to the various gaze targets (initial section: path and pylons; middle section: path and pylons; final section: path and pylons; the goal position of the obstacle course; and points outside the obstacle course).


Fixations on the initial section were those on the first four rows of the obstacle course, fixations on the middle section were those on the subsequent four rows, and fixations on the final section were those on the last four rows. Path fixations were those on any point of the obstacle course excluding the pylons, and pylon fixations were those on or above any of the pylons. Fixations on the goal position were those on the exit or on an area slightly ahead of the exit, and fixations on other spatial locations outside the travel path were categorized as points outside the obstacle course. A fixation was defined as gaze stabilized on a location for three consecutive frames (0.1 s) or longer (Patla and Vickers 1997). The gaze fixation data (number of fixations and fixation duration) were grouped into two categories: those occurring before starting and those occurring during task performance (see Figure 5-1). We tested differences between the young and elderly groups in fixation duration and number of fixations with the Wilcoxon Mann-Whitney test. The significance level was set at p = .05 for all analyses.
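
As a concrete illustration of the fixation criterion and the group comparison described above, here is a minimal sketch. The dispersion threshold and all sample data are invented for the example; the chapter's own analyses were run with the ASL EyeVision/Captiv software rather than custom code, and scipy's mannwhitneyu plays the role of the Wilcoxon Mann-Whitney test.

```python
import numpy as np
from scipy.stats import mannwhitneyu

FS = 30              # effective gaze sampling rate of the Mobile Eye (Hz)
MIN_FRAMES = 3       # 3 consecutive frames = 0.1 s, as in Patla and Vickers (1997)
MAX_DISPERSION = 1.0 # assumed stability threshold in degrees (not from the chapter)

def detect_fixations(gaze: np.ndarray):
    """Greedy dispersion-based detection: grow a window while all samples stay
    within MAX_DISPERSION of each other; keep windows of >= MIN_FRAMES.
    Returns (onset_s, duration_s) pairs."""
    fixations, start = [], 0
    for i in range(1, len(gaze) + 1):
        if np.ptp(gaze[start:i], axis=0).max() > MAX_DISPERSION:  # window became unstable
            if i - 1 - start >= MIN_FRAMES:
                fixations.append((start / FS, (i - 1 - start) / FS))
            start = i - 1
    if len(gaze) - start >= MIN_FRAMES:
        fixations.append((start / FS, (len(gaze) - start) / FS))
    return fixations

gaze = np.array([[0.0, 0.0], [0.1, 0.0], [0.05, 0.1],       # stable: one fixation
                 [3.0, 3.0], [3.1, 3.0], [3.0, 3.1], [3.05, 3.05]])  # a second one
print(detect_fixations(gaze))

# Invented per-participant fixation durations (s) on one target-area
young = [0.9, 1.2, 0.7, 1.0, 0.8]
elderly = [1.4, 1.7, 1.1, 1.9, 1.3]
u, p = mannwhitneyu(young, elderly, alternative="two-sided")
print(f"Mann-Whitney U = {u}, p = {p:.3f}")
```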

Before task initiation (Duration of fixations)

Contrary to our expectation in H1, the time taken to start walking did not differ significantly between the young (M = 5.08 s, SD = 3.53 s) and the elderly (M = 4.62 s, SD = 4.01 s). The first hypothesis was therefore rejected: the elderly did not take more time to start the task. The duration of fixations on the various locations of the obstacle course before participants started to walk is summarized in Table 5-1 and in Figure 5-2.

Table 5-1: Percentage of duration of fixations for young adults and elderly, before task initiation

                            Young (M%±SD)     Elderly (M%±SD)
Pylons  Initial Section       .38 ± 1.12        2.96 ± 6.81
        Middle Section       6.27 ± 10.91       9.44 ± 6.8
        Final Section        8.46 ± 10.43          7 ± 8.44
        Total               15.11 ± 13.85      19.40 ± 12.47
Path    Initial Section       .58 ± 1.47       17.11 ± 25.68
        Middle Section      16.17 ± 15.83      11.64 ± 12.30
        Final Section       12.72 ± 8.11       11.75 ± 14.52
        Total               29.46 ± 17.45      40.49 ± 24.65
Goal                        12.49 ± 14.09        3.7 ± 4.53
Outside                      7.66 ± 14.72       3.35 ± 7.22
Total time of fixation      64.72 ± 19.93      66.95 ± 23.37


Regarding the duration of fixations before task initiation, the young adults spent 65% of the time fixating the target-areas (15% on the pylons, 29% on the path, 12% on the goal and 7% on points outside the obstacle course). The young participants spent most time fixating the path of the final section (13%) and the goal position (12%). Comparing the target-areas, we did not find significant differences between the duration of fixations on these target-areas and the duration of fixations on the path and pylons of the middle section, the pylons of the final section, or the points outside the course; the young participants spent almost the same time fixating each of these target-areas. However, the young participants spent less time fixating the pylons of the initial section than the path of the middle section (p = .050) or the path of the final section (p = .022), and less time fixating the path of the initial section than the path of the final section (p = .038). The initial section (path and pylons) was the target-area at which the young participants spent the least time looking.

The elderly spent 67% of the time before task initiation fixating the target-areas (19% on the pylons, 40% on the path, 4% on the goal and 3% on points outside the obstacle course). The elderly participants spent most time fixating the path of the initial section (17%), but we did not find significant differences between the duration of fixations on this target-area and on the other target-areas: the elderly spent almost the same time fixating all the target-areas. The second hypothesis was rejected, as before walking neither group spent more time fixating the goal than the other target-areas.

We did find a significant difference between young adults and the elderly in the duration of fixations on the path of the initial section (Z = -3.43, p = .001): the elderly (M = 17%) spent more time fixating this area than the young adults (M = .6%). The third hypothesis was therefore rejected: the elderly participants did not spend more time fixating the goal position than the young participants; instead, they spent more time fixating the path of the initial section.


Fig. 5-2: Percentage of duration of fixations for young adults and elderly, before task initiation. The symbol “*” indicates significant differences between young adults and elderly

Before task initiation (Number of fixations)

The number of fixations on the various locations of the obstacle course before participants started to walk is summarized in Table 5-2 and in Figure 5-3.

Table 5-2: Percentage of number of fixations for young adults and elderly, before task initiation

                            Young (M%±SD)     Elderly (M%±SD)
Pylons  Initial Section      1.73 ± 5.38        6.34 ± 14.65
        Middle Section       9.35 ± 15.17      16.31 ± 13.51
        Final Section       10.56 ± 10.91      10.76 ± 12.19
        Total               21.63 ± 17.81      33.41 ± 16.06
Path    Initial Section      2.86 ± 7.26       19.27 ± 27.03
        Middle Section      25.71 ± 26.97      19.13 ± 15.75
        Final Section       25.59 ± 14         20.02 ± 18.25
        Total               54.16 ± 24.87      58.41 ± 18.35
Goal                         16.9 ± 18.41       4.42 ± 6.14
Outside                      7.31 ± 11.8        3.76 ± 7.57
Total number of fixations   10.50 ± 6.3        11.08 ± 9.73

Before task initiation, the young adults made on average 11 fixations on the target-areas (22% on the pylons, 54% on the path, 17% on the goal and 7% on points outside the obstacle course). The young participants made the greatest number of fixations on the path of the middle section (26%) and on the path of the final section (26%).


Comparing the target-areas, we did not find significant differences between the number of fixations on these target-areas and the number of fixations on the pylons of the middle section, the pylons of the final section, the goal position, or the points outside the course; the young participants made almost the same number of fixations on these target-areas. However, the young adults made fewer fixations on the pylons of the initial section than on the path of the final section (p = .002), and fewer fixations on the path of the initial section than on the path of the final section (p = .003). The initial section (path and pylons) was the target-area at which the young participants looked least often.

In turn, the elderly made on average 11 fixations on the target-areas (33% on the pylons, 58% on the path, 4% on the goal and 4% on points outside the obstacle course). The elderly participants made the greatest number of fixations on the path of the final section (20%), the path of the initial section (19%) and the path of the middle section (19%). Comparing target-areas, we did not find significant differences in the number of fixations between target-areas. Thus, we rejected the fourth hypothesis, as neither group fixated the goal more often than the other target-areas.

Fig. 5-3: Percentage of number of fixations for young adults and elderly, before task initiation. The symbol "*" indicates a significant difference between young adults and elderly

When we compared the number of fixations made by the young participants with those made by the elderly participants, we found a significant difference on the path of the initial section (Z = -2.81, p = .005): the elderly (M = 19%) fixated this area of the obstacle course more often than did the young adults (M = 3%).


The fifth hypothesis was thus rejected: the elderly participants did not make more fixations on the goal position than the young participants; instead, they made more fixations on the path of the initial section.

During task performance (Duration of fixations)

The young participants took slightly less time (M = 4.95 s, SD = 1.37 s) to perform the task than the elderly participants (M = 5.29 s, SD = 1.78 s), but this difference was not significant, contrary to the sixth hypothesis. The duration of fixations on the various locations of the obstacle course during task performance is summarized in Table 5-3 and in Figure 5-4.

Table 5-3: Percentage of duration of fixations for young adults and elderly, during task performance

                            Young (M%±SD)     Elderly (M%±SD)
Pylons  Initial Section       .63 ± 2.37         .15 ± .52
        Middle Section        4.2 ± 5.89        2.42 ± 3.39
        Final Section        6.72 ± 7.61       10.07 ± 8.8
        Total               11.55 ± 12.09      12.65 ± 10.5
Path    Initial Section       .17 ± .65          .38 ± 1.32
        Middle Section          6 ± 6.02       12.14 ± 10.27
        Final Section       14.23 ± 12.91      25.13 ± 12.44
        Total               20.41 ± 14.44      37.65 ± 14.31
Goal                        21.27 ± 17.46       32.9 ± 11
Outside                      8.61 ± 13.25       5.67 ± 7.05
Total time of fixation      61.85 ± 22.93      88.87 ± 9.33

Regarding the duration of fixations during task performance, the young adults spent 62% of the time fixating the target-areas (12% on the pylons, 20% on the path, 21% on the goal and 9% on points outside the obstacle course). The young participants spent most time fixating the goal position (21%) and the path of the final section (14%). Comparing the target-areas, we did not find significant differences between the duration of fixations on these target-areas and the duration of fixations on the path and pylons of the middle section, the pylons of the final section, or the points outside the course; the young participants spent almost the same time fixating these target-areas. However, the young adults spent less time fixating the pylons of the initial section than the path of the final section (p = .005) or the goal position (p = .001), and less time fixating the path of the initial section than the path of the final section (p = .003) or the goal position (p = .001). The initial section (path and pylons) was the target-area at which the young participants spent the least time looking.


In turn, the elderly spent 89% of the time fixating the target-areas (12% on the pylons, 38% on the path, 33% on the goal and 6% on points outside the obstacle course). The elderly participants spent more time fixating the goal position (33%) than the pylons of the initial section (p < .001), the path of the initial section (p < .001), the pylons of the middle section (p = .001) or the points outside the obstacle course (p = .004). During task performance, the time the elderly spent fixating the goal position did not differ significantly from the time they spent fixating the path of the middle or final sections, and there was no significant difference between the time spent fixating the path of the middle section and the path of the final section. However, the elderly spent less time fixating the path of the initial section than the path of the middle section (p = .037) or the path of the final section (p < .001). Thus, we rejected the seventh hypothesis: the two groups did not both spend more time fixating the path of the middle and final sections than the other target-areas. Instead, the young participants spent more time fixating the path and pylons of the middle section, the path and pylons of the final section, the goal position and the points outside the course, while the elderly spent more time fixating the path of the middle and final sections and the goal position than the other target-areas.

Fig. 5-4: Percentage of duration of fixations for young adults and elderly, during task performance. The symbol "*" indicates significant differences between young adults and elderly

When we compared the duration of fixations made by the young participants with that made by the elderly participants, we found a significant difference between the groups in the total time of fixation: the elderly (M = 89%) spent more time fixating than the young adults (M = 62%), Z = -3.24, p = .001.


The elderly also spent more time fixating the path of the final section than the young adults (M = 25% vs. M = 14%, Z = -2.01, p = .045), and there was a marginally significant difference on the goal position (M = 33% vs. M = 21%, Z = 1.955, p = .051), with the elderly fixating it longer. The eighth hypothesis was partially accepted: the elderly spent more time than the young participants fixating the path of the final section; however, concerning the path of the middle section, the hypothesis was rejected, since the elderly did not spend more time fixating it than the young participants.

During task performance (Number of Fixations)

The number of fixations on the various locations of the obstacle course during task performance is summarized in Table 5-4 and in Figure 5-5.

Table 5-4: Percentage of number of fixations for young adults and elderly, during task performance

                                Young (M% ± SD)    Elderly (M% ± SD)
Pylons   Initial Section          .79 ± 2.97         .69 ± 2.41
         Middle Section          6.59 ± 8.4         3.73 ± 4.12
         Final Section          10.86 ± 9.55       18.15 ± 12.63
         Total                  18.24 ± 14.85      22.57 ± 14.79
Path     Initial Section          .79 ± 2.97         .83 ± 2.89
         Middle Section         12.18 ± 13.3       13.49 ± 8.5
         Final Section          26.55 ± 18.34      33.89 ± 13.09
         Total                  38.87 ± 21.81      48.22 ± 9.62
Goal                            26.43 ± 16.81      23.45 ± 10.42
Outside                         16.45 ± 24.47       5.77 ± 6.51
Total number of fixations       10.29 ± 3.3        11.42 ± 4.4

Regarding the number of fixations during task performance, the young adults made on average 10 fixations on the target-areas (18% for the pylons, 39% for the path, 26% for the goal and 16% for points outside the obstacle course). The young participants made the greatest number of fixations on the path of the final section (27%) and on the goal position (26%). Comparing target-areas, we did not find significant differences between the number of fixations on these two target-areas and the number of fixations on the path and pylons of the middle section, the pylons of the final section, and the points outside the course: the young participants made almost the same number of fixations on all of these target-areas. However, the young adults made a lower number of fixations on the pylons of the initial section than on the



path of the final section (p = .002) and on the goal position (p = .001). The young adults also made a lower number of fixations on the path of the initial section than on the path of the final section (p = .003) and on the goal position (p = .001). The initial section (path and pylons) was the target-area on which the participants fixated the fewest times.

Fig. 5-5: Percentage of number of fixations for young adults and elderly, during task performance. The symbol “*” indicates significant differences between young adults and elderly

In turn, the elderly made on average 11 fixations on the target-areas (23% for the pylons, 48% for the path, 23% for the goal and 6% for the points outside the obstacle course). The elderly participants made a greater number of fixations on the path of the final section (34%), and on the goal position (23%). The elderly had a larger number of fixations on the path of the final section than on the pylons of the initial section (p < .001), on the path of the initial section (p < .001), on the pylons of the middle section (p = .001) and on the points outside the course (p = .006). They also had a greater number of fixations on the goal position than on the pylons of the initial section (p < .001), on the path of the initial section (p = .001) and on the pylons of the middle section (p = .001). During the performance of the task, the number of fixations that the elderly made on the goal position was not significantly different from the number of fixations that they made on the path of the middle section or on the path of the final section. There was no significant difference between the number of fixations that the elderly made on the path of the middle section and the number of fixations that they made on the path of the final section. However, the elderly made a lower number of fixations on the path of the initial section than on the



path of the middle section (p = .043) or on the path of the final section (p < .001).


CHAPTER SIX

BENCHMARK FIXATION DEVIATION INDEX TO AUTOMATE USABILITY TESTING

Tasks

Placing orders is the most important function in the Rustica application. For this reason, users participating in the study had to perform a main task of ordering specified products. The evaluation of usability consisted of a number of different tasks. First, the user had to navigate to the order page by selecting the right-hand menu option (task 1). Task 2 consisted of ordering three R10.99 Vodacom recharge vouchers and two R4.79 MTN recharge vouchers. Lastly, the order had to be confirmed and sent to the wholesaler (task 3) (Figure 6-1a–c).

Fig. 6-1a–c: Screen shots of the user interfaces of tasks 1–3, respectively

Apparatus and eye-tracking data collection

A Tobii 1750 eye tracker with a 1024 × 768 px resolution display, recording gaze points at 50 Hz, was used to record the eye movements of the participants. On the desktop screen, an Opera Mobile 10 emulator was used to simulate a mobile device. The Rustica web-based system was accessed through the emulator at a resolution of 500 × 310 px. The eye tracker was set up with a minimum fixation duration of 100 milliseconds (i.e. only fixations lasting longer than 100 ms were recorded as fixations). The Tobii fixation filter was used with a fixation radius of 35; with this set-up, a movement was recorded as a new fixation if its velocity exceeded the threshold of 0.35 pixels/ms.
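As a minimal sketch of these settings (50 Hz sampling, 100 ms minimum duration, 0.35 pixels/ms velocity threshold), a simple velocity-threshold fixation filter could look as follows; this approximates, but is not, Tobii's own fixation filter.

# Minimal velocity-threshold fixation filter (an approximation, not the
# proprietary Tobii filter).
import math

SAMPLE_INTERVAL_MS = 20.0   # 50 Hz gaze samples
VELOCITY_THRESHOLD = 0.35   # px/ms: faster movement starts a new fixation
MIN_DURATION_MS = 100.0     # shorter groups are discarded

def detect_fixations(gaze_points):
    """gaze_points: list of (x, y) screen coordinates sampled at 50 Hz.
    Returns (centroid_x, centroid_y, duration_ms) tuples."""
    fixations, current = [], [gaze_points[0]]
    for prev, curr in zip(gaze_points, gaze_points[1:]):
        velocity = math.dist(prev, curr) / SAMPLE_INTERVAL_MS
        if velocity > VELOCITY_THRESHOLD:    # saccade: close the group
            _flush(current, fixations)
            current = [curr]
        else:
            current.append(curr)
    _flush(current, fixations)
    return fixations

def _flush(samples, fixations):
    # Keep only groups that last at least the minimum fixation duration
    duration = len(samples) * SAMPLE_INTERVAL_MS
    if duration >= MIN_DURATION_MS:
        cx = sum(x for x, _ in samples) / len(samples)
        cy = sum(y for _, y in samples) / len(samples)
        fixations.append((cx, cy, duration))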

Expert-based usability test evaluation

One of the methods used by Gelderblom et al. (2012), in the usability study described above, was based on eye tracking data, focussing on heat



maps to identify usability issues. We reproduce their analysis here. Those issues are highlighted, by means of the heat map images, for three of the participants while completing task 2 (Figure 6-2a–c). The heat maps represent all the fixation points, and the duration of each, from the moment that the screen was loaded until the user clicked on the right edit box.

Fig. 6-2a – c: Heat maps of participants performing task 2 (Gelderblom, De Bruin and Singh, 2012)

The edit box was expected to be a significant focus of attention for participants, because that is where they had to enter the amount to order, but in fact the users did not pay much attention to the edit boxes. Instead, the users focused heavily on tabs 1 and 2 at the top of the list of products. The table headings on the page were also observed more than was expected. The participants fixated on the menu options of the emulator a number of times, even though these had nothing to do with the task at hand. Lastly, the authors noted that the users missed crucial elements on the screen, instead fixating on elements such as the images, prices and table names. In sum, Gelderblom et al. (2012) clearly highlighted that there were usability issues in task 2 by using eye tracking data and analysis of heat maps.

Proposed approach

Consider a scenario where a number of users interact with a system to achieve a given task. The basic premise of our approach is that if one user is more successful than the others in achieving the task, then he or she can be said to use the best possible visual strategies required to complete the task; we use



this user as the benchmark user. We thus propose that the difference between the eye movements of the benchmark user and the other users, while performing the same task, can be used as the basis for quantifying and highlighting issues in the usability of the system. Here we present a re-analysis of the Gelderblom et al. (2012) data, plus additional previously unreported data from that study, to test this approach. This analysis involved a number of steps. The first step was to gather the data during usability testing, as described above. During usability testing, participants had to complete a number of tasks and data was captured using an eye tracking device (some of which was not previously analysed). The next step was to identify the benchmark user; the criteria for selecting a benchmark depend on the aims of the study. In this study, usability-related data was used to select the benchmark user: task completion, number of fixations and time taken to complete the task. This benchmark user was then used during the data processing stages. To identify problem areas an index is calculated during the processing stage. Data was processed with a fixation deviation index, which represents how much each participant differs from the benchmark user. To calculate this fixation deviation index, the data was clustered with respect to the benchmark user and the FDI for each cluster was determined. Lastly, the clustered data was mapped back onto the user interfaces, to highlight where each participant deviated. This complete process is depicted in Figure 6-3; each of the remaining steps is discussed in detail in the sections that follow.

Fig. 6-3: Process diagram for calculating the FDI and mapping Benchmark Deviation Areas

Data

In contrast to the Gelderblom et al. (2012) analysis, in this study only the fixation data per participant for each task was exported. Each fixation was defined by its x and y coordinates on the screen, as well as the start time and duration of the fixation. The fixations on the screen can be seen in the gaze plot images (Figure 6-4a–c). Those images are representations of where the participants fixated on the screen from the start of the task until the task was successfully completed.
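For illustration, the exported fixation records could be represented as follows; the field names are assumptions rather than the actual export schema.

# One way to represent the exported fixation data; field names are
# illustrative, not the Tobii export schema.
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float            # horizontal screen coordinate (px)
    y: float            # vertical screen coordinate (px)
    start_ms: float     # start time relative to the task start
    duration_ms: float  # fixation duration

def load_fixations(rows):
    """rows: iterable of (x, y, start_ms, duration_ms) tuples per task."""
    return [Fixation(*row) for row in rows]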



Fig. 6-4a – c: Gaze plot images from task 2

Figure 6-4a gives a clear view of where the participant focused and in which order. The participant in Figure 6-4a was selected as the benchmark user. Figure 6-4b still gives a good overview of where another participant fixated and in which order, but some data can be lost under the visualization when too much information is displayed on one screen, and it takes more time to analyse. In Figure 6-4c, it was difficult to identify the order of the fixations and the exact elements on which the participant fixated most. The visualizations in Figure 6-4 can easily become cluttered; this study makes use of the benchmark user to simplify the data while still visualizing the exact coordinates of the fixations.

Benchmark user

Every user perceives a user interface differently, but in the end all users have to accomplish the same steps to complete a task. Thus, there are a few crucial parts of the user interface that the user will have to focus on to make informed decisions. It is for this reason that we propose the use of a benchmark user – the user who represents an optimal fixation outline against which to compare the other users. To select the benchmark user, a number of attributes were considered. First of all, did the user complete the task successfully? If the user could not complete the task successfully, he/she could not be considered as the benchmark user. Next, the number of fixations was also important, because it was an indication of the amount of processing required to complete the task. Lastly, time to completion was considered if there were users with the same number of fixations. A higher overall fixation



duration would indicate more time spent interpreting the user interface components. Consider Table 6-2 for selecting the benchmark user. On all three tasks, participant 5 qualified as the benchmark user.

Table 6-2: Benchmark user classification criteria per task for each participant (# – Number of fixations, BU – Benchmark user)

                   Task 1                 Task 2                 Task 3
Participant    #   Time (s)  BU       #   Time (s)  BU       #   Time (s)  BU
1             51   26.88     No      85    32.60    No      94   56.64     No
2             41   19.27     No      29    13.38    No      17   11.78     No
3             37   21.54     No     270   118.90    No     118   52.67     No
4             32   32.70     No     300   161.32    No      40   52.32     No
5             16   13.33     Yes     22    17.23    Yes      9   13.52     Yes
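The selection rule just described can be sketched directly; the dictionary fields are illustrative, and the example data reproduces Table 6-2 for task 1.

# Benchmark selection sketch: the user must complete the task; fewest
# fixations wins, with ties broken by shortest completion time.
def select_benchmark(users):
    """users: list of dicts with keys 'id', 'completed', 'n_fixations',
    'time_s'. Returns the id of the benchmark user for one task."""
    candidates = [u for u in users if u["completed"]]
    best = min(candidates, key=lambda u: (u["n_fixations"], u["time_s"]))
    return best["id"]

task1 = [
    {"id": 1, "completed": True, "n_fixations": 51, "time_s": 26.88},
    {"id": 2, "completed": True, "n_fixations": 41, "time_s": 19.27},
    {"id": 3, "completed": True, "n_fixations": 37, "time_s": 21.54},
    {"id": 4, "completed": True, "n_fixations": 32, "time_s": 32.70},
    {"id": 5, "completed": True, "n_fixations": 16, "time_s": 13.33},
]
assert select_benchmark(task1) == 5   # matches Table 6-2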

Data processing

After the benchmark user was selected, the fixation data of the remaining participants was clustered, and for each of the clusters the FDI was determined. Lastly, selected clusters were mapped back onto the user interfaces to display the benchmark deviation areas visually (Figure 6-6).

Data clustering: The areas of the user interface where the benchmark user fixated represent the information relevant to completing the task at hand. Determining the closeness of a participant's fixations to the fixations of the benchmark user provides insight into how close that participant was to the closest-to-ideal solution. The fixations of the benchmark user are used as centroids, and the fixations of each participant are individually sorted into clusters around these centroids. For each participant, as shown in Figure 6-5a, the Euclidean distance from each benchmark centroid (black circles) to each fixation (light grey circles) is calculated. Figure 6-5b shows that each fixation is sorted into the cluster of the centroid closest to it, according to its Euclidean distance from each centroid.
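A minimal sketch of this nearest-centroid assignment step, using plain Euclidean distances (the formal statement is Equation 6-1 below):

# Assign each participant fixation to the cluster of the closest
# benchmark fixation (nearest-centroid clustering).
import math

def cluster_fixations(participant_points, benchmark_centroids):
    """Both arguments are lists of (x, y) tuples. Returns a dict mapping
    centroid index -> list of assigned fixations."""
    clusters = {i: [] for i in range(len(benchmark_centroids))}
    for point in participant_points:
        distances = [math.dist(point, c) for c in benchmark_centroids]
        nearest = distances.index(min(distances))
        clusters[nearest].append(point)
    return clusters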



Fig. 6-5a – d: Euclidean distance clustering of fixations

The Euclidean distance between each participant fixation f and each benchmark centroid c was used to determine which centroid was closest to each fixation, as shown in Equation 6-1; each participant fixation was then added to the cluster of the centroid closest to it:

d(f, c) = √((x_f − x_c)² + (y_f − y_c)²)    (6-1)

Fixation Deviation Index (FDI) calculations: The FDI is a measure that represents how much, on average, the fixations of each participant differ from those of the benchmark user. The FDI will be small if the participant fixates close to where the benchmark user fixated. A small difference is acceptable, because participants would not necessarily fixate exactly where the benchmark user fixated, even when they fixate on the same component. The greater the FDI, the further the participant fixated from the relevant information on the screen. To calculate the FDI, the Euclidean distance d is used as the difference measure. For each cluster, the average Euclidean distance of all the fixations in the cluster from the centroid is calculated, as in Equation 6-2. The FDI is calculated as the average Euclidean deviation for each cluster, as well as the average FDI per task. The FDI for each task specifies how much participants deviated from the benchmark user whilst completing the given task. Lastly, the average FDI is calculated per



participant and per task. This can be used to easily filter through a large quantity of data and to identify specific tasks that had usability issues, or participants who had difficulty completing given tasks.

FDI_j = (1/n_j) · Σ_{i=1..n_j} d(f_i, c_j)    (6-2)

Benchmark deviation areas: To obtain valuable information from the FDI, the clusters with an FDI higher than the average FDI are mapped back onto the original user interface. The areas with above-average FDI values are the areas where most deviation occurred; these represent problem areas, shown visually on the user interface, for further investigation into why these areas were significant to the participants. The clusters are plotted back onto the user interfaces by means of polygons. Each fixation point of the cluster serves as a point in the polygon, to provide as much information as possible about the cluster in the visualisation (Figure 6-6a–c). Figure 6-6a shows all the clusters as polygons on the UIs. In Figure 6-6b the cluster with a below-average FDI value was removed from the UI. This shows the areas where the participant deviated most from the benchmark user.
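The two equations translate directly into code. The following sketch, building on the cluster_fixations function above, computes the FDI of each cluster and the task-level average, and flags above-average clusters as benchmark deviation areas; the coordinate values are invented for illustration.

# FDI of a cluster (Equation 6-2): mean Euclidean distance of its
# fixations from the benchmark centroid; the task FDI averages clusters.
import math

def cluster_fdi(centroid, fixations):
    if not fixations:
        return 0.0
    return sum(math.dist(centroid, f) for f in fixations) / len(fixations)

def task_fdi(benchmark_centroids, clusters):
    values = [cluster_fdi(c, clusters[i])
              for i, c in enumerate(benchmark_centroids)]
    return sum(values) / len(values), values

# Illustrative data; cluster_fixations is the assignment sketch above.
centroids = [(100.0, 80.0), (240.0, 150.0)]
points = [(105, 78), (98, 90), (400, 300), (250, 160)]
avg, per_cluster = task_fdi(centroids, cluster_fixations(points, centroids))
# Clusters with above-average FDI become benchmark deviation areas,
# to be drawn back onto the interface as polygons.
deviation_areas = [i for i, v in enumerate(per_cluster) if v > avg]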

Fig. 6-6a – d: Cluster polygons on the UIs



Results

We were interested in analysing clusters with high FDIs in order to discover areas with usability problems. In order to support the use of an FDI, the findings are compared to the results of an expert-based analysis from another study conducted using the same data.

Fixation Deviation Index

Table 6-3 presents the average FDI for each task per participant. This is an indication of how much the fixation data of the participant deviated from the fixation data of the benchmark user. A small FDI indicates that the participant focused in the vicinity of where the benchmark user focused, even if not exactly on the same points, but still on the same elements of the user interface. At least a slight deviation was thus expected from all participants. Larger FDI values indicate greater deviation; this means that the participant did not focus on areas of the user interface that were relevant to performing the task at hand. Table 6-3 presents, for each task, the average FDIs, the time to complete the task, and the total number of fixations. The data indicate that neither the time nor the number of fixations has a direct influence on the FDI value. For a usability test with a large number of participants and tasks, the FDI can be used to filter through the data: raw fixation data can be analysed automatically, and tasks and/or participants can be highlighted for further analysis. So, what valuable information can be extracted from an FDI value in this study? A closer look at the values in Table 6-3 reveals that participant 2 performed very well overall. This suggests that the participant completed the tasks efficiently. Confirming this, the short task completion times of participant 2 indicate high efficiency, even though the number of fixations was not always the lowest of the non-benchmark participants. The FDI values of the remaining participants are higher and do not vary as much as those of this outlier. Further insight can be extracted by analysing the FDI of each task. The FDI values of task 1 show very little variance compared to those of tasks 2 and 3. Task 2 has the highest average FDI as well as the largest inconsistency in FDI values per participant. For this reason, task 2 was analysed in greater depth.



Table 6-3: Average task FDI and time per task for each participant (# – Number of fixations, B. – Benchmark user, M – Mean FDI)

                   Task 1                  Task 2                   Task 3
Participant   FDI    Time (s)   #     FDI    Time (s)    #     FDI    Time (s)   #
1             0.86   26.88     51     2.23    32.60     85     3.23   56.64     94
2             0.36   19.27     41     0.43    13.38     29     0.22   11.78     17
3             0.66   21.54     37     5.43   118.90    270     3.04   52.67    118
4             0.41   32.70     32     5.52   161.32    300     1.15   52.32     40
B.            0      13.33     16     0       17.23     22     0      13.52      9
M             0.574                   3.404                    1.909

Benchmark deviation clusters

The FDI also removes the effort of mapping out areas of interest on the user interfaces: the focal points of the benchmark user are already treated as significant areas. This saves time, effort and expertise in the initial analysis. Tasks that have a high FDI, or high variety in FDIs, indicate possible usability issues. Task 2 (Table 6-3) has those characteristics; here we investigate these results in further detail. Fig. 6-7a–d represents the task 2 fixations of participants 1–4, respectively. The numbered circles indicate the fixation points of the benchmark user, and the highlighted areas represent the areas where the participants deviated from the benchmark user. Even if highlighted areas are close to fixation points, those points form part of a cluster with a high FDI, and thus the points are included in the highlighted area. Considering participant 2 (Fig. 6-7b): even though this participant's average FDI for the task was low, there are areas where the participant deviated from the benchmark user. These areas are clearly indicated by the highlighted areas. Unlike those of participants 3 and 4, the edges of the clusters for participant 1 are not very jagged. This indicates that participant 1 deviated from the benchmark user, but with only a small number of fixations. Participants 3 and 4, on the other hand, considered other parts of the user interface a significant number of times before completing the task. This can indicate uncertainty or searching for an element. From the clusters of all four participants (i.e. excluding the benchmark user), we can make some general observations. For all participants, prices were important; all participants deviated from the benchmark user and fixated on most of the prices. Tabs 1 and 2 at the top of the product list were also noteworthy to the participants. Even though the benchmark user also fixated on these tabs, the remaining participants explored the area more widely. The edit boxes, the element with which the users had to interact, received little attention from most participants, but participants focused on all of the



images a number of times. The last components that seemed to draw the attention of all participants were the white menu option buttons. These are not part of the user interface design, but only part of the emulator that the web-based solution was hosted in; these elements should have been unimportant to the participants.

Fig. 6-7a – d: Cluster with a high FDI count mapped back onto the UIs

Participants 1 and 4 explored the area with the table names and header bar, which participants 2 and 3 ignored. This difference did not seem to affect the efficiency of the task, seeing that participants 3 and 4 were the worst performers.



Findings comparison

One of the methodologies followed by Gelderblom et al. (2012) was to use an eye tracking device in a usability lab to conduct a usability evaluation of a mobile procurement application. The data captured in that study was the same data used in this study. Thus, the expectation would be that, if the proposed approach is valid, it would deliver the same results, as well as some more precise results, while being less resource intensive. For this comparison, the usability testing results from the Expert-based usability test evaluation section (which describes the Gelderblom study) and the Fixation Deviation Index section (referred to as the FDI study) are used. The first similarity in the findings regards the participants' inattentiveness to the edit boxes: the highlighted areas in the FDI study mostly exclude the edit boxes, and the Gelderblom study also finds a lack of interest in the edit boxes. Secondly, the FDI study indicates that all participants fixated on the tabs at the top of the product list. The same results can be seen in the Gelderblom study. In the Gelderblom study, it is noted that the white menu buttons were fixated on and were seen as a way to assist the participants. Likewise, in the FDI study, at least one cluster for each participant includes the white menu buttons at the bottom of the screen. The last correspondence between the two studies is the fixation on the table headings in both studies. There are two findings from the usability test in the FDI study which were not mentioned in the Gelderblom study, as they were not relevant at the time: participants generally deviated from the benchmark user and fixated on the remaining images and prices multiple times. After further investigation of the heat maps from the Gelderblom study, it was noted that the same areas were highlighted. The FDI study does not lose any of the results captured by the Gelderblom study, as can be seen from the number of correspondences between the results. The FDI study also provides quantitative results and visualisations that speed up the analysis process and highlight the usability issues directly on the user interfaces.

Conclusion

Automated usability testing is considered one of the most time-efficient methods of usability evaluation, yet there are some aspects that still require a lot of resources. A great deal of valuable and insightful information can be captured by eye tracking as an automated usability



testing tool. Unfortunately, a great deal of time is required of expert evaluators to interpret all this data, especially for large numbers of participants and tasks. In this paper, we proposed a fixation deviation index (FDI) as a measure of how much the fixation data of participants deviate from the fixation data of a benchmark user. This approach allows filtering through data to easily identify which tasks were easy or had usability issues and how much the participants strayed from the optimal task process. The benchmark deviation areas that are mapped back onto the user interfaces assist in illustrating in which areas the usability issues occur. We have shown here that the use of the FDI and benchmark deviation areas offers the same data as expert-based benchmark usability testing, as well as providing additional, more precise results on usability problems. We believe that this method can be quite beneficial in usability testing. It uses data that was captured automatically by an eye tracking device and then analyses it by identifying deviation between the most efficient user (i.e. the benchmark user) and the other participants in a study. This approach can produce more in-depth analysis by highlighting areas with high deviation on the user interfaces. It is especially beneficial for studies with a high volume of participants and/or tasks. Data can be easily filtered by means of the FDI values to obtain an overview of the usability of a page. Tasks can be evaluated separately to establish if there are high FDI values or even high irregularity in the FDI values, which point to usability issues. The FDI value also indicates if a participant or group of participants had difficulty with a specific task. Furthermore, the FDI values provide an unbiased approach towards analysing the results without partial views getting in the way. Mapping the clusters with above-average FDI values back onto the user interfaces automates this process even further by indicating areas with high levels of deviation. Only at this stage will an expert evaluator have to spend time analysing the highlighted areas. The expert evaluator will also only have to analyse a few selected interfaces with certain FDI values. This process also makes the analysis of the user interface independent of expert knowledge of the interface. Thus, an evaluator needs to have very little knowledge of the user interface in order to carry out this research. The benchmark user already fixated on the relevant elements on the screen, and therefore areas of interest do not need to be mapped out for each task by the evaluator. For this study, only the gaze point data was used to test our proposed approach. However, eye tracking and usability testing provide very rich, often unexplored data. Our approach can provide a basis from which to



develop even more sophisticated means to obtain more valuable information from usability studies. In future work, this approach could be applied to different usability data captured by usability evaluations. This should give better insight into how and why the participants deviated from the ideal solution. Future research can also be conducted on a larger scale both in terms of sample size and number of tasks, to emphasise the value of filtering data by means of FDI values and automatically highlighting deviation areas.

Acknowledgements This study benefitted from the support of SAP. Opinions expressed and conclusions arrived at are those of the authors and not necessarily to be attributed to the companies mentioned in this acknowledgement. A special thanks to Prof Helene Gelderblom for her leadership in terms of the usability study and for the use of the usability lab at UNISA.

Bibliography

Andre, Terence S, Rex H Hartson, and Robert C Williges. “Determining the Effectiveness of the Usability Problem Inspector: a Theory-based Model and Tool for Finding Usability Problems.” Human Factors: The Journal of the Human Factors and Ergonomics Society 45, no. 3 (2003): 455.
Baker, Simon, Fiora Au, and Gillian Dobbie. “Automated Usability Testing Using HUI Analyzer.” Paper presented at the 19th Australian Conference On: 579–588, 2008.
Buscher, George, Susan Dumais, and Edward Cutrell. “The Good, the Bad, and the Random: An Eye-tracking Study of Ad Quality in Web Search.” Paper presented at the Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 42–49. ACM, 2010.
Butler, Keith A. “Usability Engineering Turns 10.” Interactions 3 (1996): 58–75.
Comber, Tim, and John R Maltby. “User Operations as Language Elements: Measuring Usability and User Competence Through Redundancy.” In CHINZ 03 (2003): 57–61. Dunedin.
Cowen, Laura, Linden J Ball, and Judy Delin. “An Eye Movement Analysis of Webpage Usability.” People and Computers XVI – Memorable Yet Invisible (2001).
Ehmke, Claudia, and Stephanie Wilson. “Identifying Web Usability Problems from Eye-Tracking Data.” Interfaces (2007): 119–128.
Gelderblom, Helene, Jhani de Bruin, and Akash Singh. “Three Methods for Evaluating Mobile Phone Applications Aimed at Users in a Developing Environment: a Comparative Case Study.” In Proceedings of M4D2012 3 (2012): 321–334.
Ghaoui, Claude, ed. Encyclopedia of Human Computer Interaction. Vol. 18. IGI Global, 2005.
Goldberg, Joseph H, Mark J Stimson, and Marion Lewenstein. “Eye Tracking in Web Search Tasks: Design Implications.” Proceedings of the 2002 Symposium on Eye Tracking Research & Applications (2002): 51–58.
Hartson, H Rex. “Human-Computer Interaction: Interdisciplinary Roots and Trends.” Journal of Systems and Software 43, no. 2 (1998): 103–118.
Ivory, Melody Y, and Marti A Hearst. “The State of the Art in Automating Usability Evaluation of User Interfaces.” ACM Computing Surveys 33, no. 4 (2001): 470–516.
Jacob, Robert J K, and Keith S Karn. “Eye Tracking in Human-computer Interaction and Usability Research: Ready to Deliver the Promises.” In The Mind’s Eye: Cognitive and Applied Aspects of Eye Movement Research 2, no. 3 (2003): 573–605.
Jarodzka, Halszka, Kenneth Holmqvist, and Marcus Nyström. “A Vector-based, Multidimensional Scanpath Similarity Measure.” Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications – ETRA ’10 1, no. 212 (2010): 211.
Just, Marcel A, and Patricia A Carpenter. “Eye Fixations and Cognitive Processes.” Cognitive Psychology 8, no. 4 (1976): 441–480.
Kieras, David E, Scott D Wood, Kasem Abotel, and Anthony Hornof. “GLEAN: A Computer-Based Tool for Rapid GOMS Model Usability Evaluation of User Interface Designs.” In UIST (1995): 91–100.
Komogortsev, Oleg, Corey Holland, Dan Tamir, and Carl J Mueller. “Aiding Usability Evaluation via Detection of Excessive Visual Search.” CHI (2011): 1825–1830.
Kort, Joke, and Henk de Poot. “Usage Analysis: Combining Logging and Qualitative Methods.” ACM Computing Surveys (2005): 2121–2122.
Kulkarni, R B, and S K Dixit. “Empirical and Automated Analysis of Web Applications.” Applications 38, no. 9 (2012): 1–8.
Li, Dongheng, Jason Babcock, and Derrick J Parkhurst. “openEyes: a Low-cost Head-mounted Eye-tracking Solution.” ETRA ’06 Proceedings of the 2006 Symposium on Eye Tracking Research & Applications 1 (2006): 95–100.
Miluzzo, Emiliano, Tianyu Wang, and Andrew T Campbell. “EyePhone: Activating Mobile Phones with Your Eyes.” Proceedings of the Second ACM SIGCOMM Workshop on Networking, Systems, and Applications on Mobile Handhelds (2010): 15–20.
Nielsen, Jakob, and Kara Pernice. Eyetracking Web Usability. New Riders, 2010.
Poole, Alex, and Linden J Ball. “Eye Tracking in Human-Computer Interaction and Usability Research: Current Status and Future Prospects.” Psychology 10, no. 5 (2005): 211–219.
Scholtz, Jean. “Common Industry Format for Usability Test Reports.” CHI ’00 Extended Abstracts on Human Factors in Computer Systems – CHI (2000): 301.
Sears, Andrew, and Julie A Jacko. The Human-computer Interaction Handbook. 2nd ed. Lawrence Erlbaum Associates, 2008.
Travis, David. “Bluffers’ Guide to ISO 9241.” Bluffers’ Guide to Usability Standards 18 (2006): 2007.

CHAPTER SEVEN

A MULTIMODAL WEB USABILITY ASSESSMENT BASED ON TRADITIONAL METRICS, PHYSIOLOGICAL RESPONSE AND EYE-TRACKING ANALYSIS

JOSÉ LAPARRA-HERNÁNDEZ¹, JUAN-MANUEL BELDA-LOIS¹,², JAIME DÍAZ-PINEDA¹ AND ÁLVARO PAGE¹

Abstract

Most web usability recommendations are based on expert opinion, without user validation. Moreover, validations rely on techniques that disturb users during their interaction (e.g. think aloud) or are applied at the end of the task (e.g. questionnaires). Eye-tracking and physiological response analyses can extract information during user interaction without disturbing the user, and can be more sensitive than traditional measures (e.g. time spent). The aim of the study reported in this chapter was to validate usability recommendations depending on user profiles, and to identify the advantages of these innovative methodologies. 10 control users and 10 users with upper limb disorders participated in 3 sessions each, on 3 different days, and performed web-searching tasks 3 times during each session. An orthogonal design with eight web parameters was used to generate 16 websites with different styles but with the same content. Physiological analysis based on galvanic skin response, electrocardiography and facial electromyography on the corrugator

¹ Instituto de Biomecánica de Valencia (IBV). Universitat Politècnica de València (UPV).
² Grupo de Tecnología Sanitaria del IBV, CIBER de Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN).



supercilii and zygomaticus major muscles was used to assess satisfaction and cognitive workload. The eye-tracking measures used were the saccade/fixation ratio, saccade amplitude and number of fixations per second. The results showed that the eye-tracking and physiological variables were more sensitive to the effect of the web parameters than time spent or questionnaires. Some of the web parameters provided by experts (e.g. “hover and click”) did not improve usability, and most web parameters only improved usability for users with motor disorders.

Introduction

Web usability

Information and Communication Technologies (ICT) allow access to multiple services and resources independently of place and time. Therefore, it is a common expectation that ICTs such as web access enhance the independent living, social inclusion and quality of life of older persons or persons with disabilities, allowing them to access a wide variety of services without leaving their homes. However, the lack of fit of ICT requirements to those groups’ needs and characteristics has had the opposite effect, which is exclusion from digital society, or e-exclusion (Lam and Lee 2005). Fortunately, governments and society itself have become conscious of the web access problems of people with disabilities and older people. The first efforts were focused on accessibility: “the degree to which a product, device, service, or environment is available to as many people as possible”. For example, the Web Accessibility Initiative (WAI, http://www.w3.org/WAI/), which is part of the World Wide Web Consortium (W3C), has developed several web content accessibility guidelines to ensure access to web content and software independently of user capabilities. However, web accessibility does not ensure web usability (Leuthold, Bargas-Avila and Opwis 2008), which means easy and safe web interaction and a positive experience. Therefore, current efforts are focused on usability. Usability is the key issue in human-computer interaction, but there is no consensus about its definition. The ISO 9241-11 (1998) definition of usability involves several concepts such as effectiveness, efficiency, satisfaction, context of use and specific use; but other concepts such as ease of use (Brink, Gergle and Wood 2002) or learning (Belda-Lois et al. 2010) should also be taken into account. Usability is highly dependent on



user profile and context (Newman and Taylor 1999), in contrast to accessibility, which is intended to satisfy all users. It is difficult to arrive at a single definition of usability, and it is more difficult still to find a common approach or methodology to measure it. Usability cannot be directly measured (Hornbaek 2006), but it is possible to assess some aspects of usability that provide indirect forms of measurement. In recent years, a wide variety of approaches and methodologies have been used to assess the usability of products and services (Hornbaek 2006). These studies can generally be classified into three groups: heuristic (or inspection), model-based, and user-based (Sears and Jacko 2003). Heuristic or inspection-based evaluations (Nielsen 1994) are carried out by experts who apply their knowledge to developing recommendations that can be taken into account by developers during the design process (Belda-Lois et al. 2010). Model-based evaluations create cognitive models in which user behaviour is simulated, making it possible to observe the virtual relationship between users and their environment (Lo and Helander 2004). Finally, user-based methods require user participation, using questionnaires or empirical testing, for example (Lin, Omata and Imamiya 2005, Nielsen 1995). More recently, in a review of current practice in usability measurement, Hornbaek (2006) classified studies and research according to their focus on effectiveness, efficiency and satisfaction measures, following the ISO 9241-11 (1998) definition. Effectiveness measures included, among others, task completion, accuracy measures (number of errors) and quality of outcome. Efficiency measures include usage patterns, mental effort and learning measurements. It is easy to find good practices and guidelines for designing user-friendly web sites (Pearson and Schaik 2003). Yet, despite the variety of approaches to web usability studies, recommendations based on expert opinion are the most widespread. However, it is not clear whether these recommendations have been validated, or how they have been validated (Belda-Lois et al. 2010, Pearson and Schaik 2003). Therefore, the involvement of users with different profiles, especially people with disabilities, older persons or people with low digital literacy skills, is key to validating the fit of usability recommendations to the real needs of each group of users, because usability recommendations are highly dependent on user profile. Moreover, most usability tests that involve users rely only on subjective techniques such as questionnaires or thinking aloud, or on basic



and traditional metrics such as task completion, time spent or number of errors (Hornbaek 2006), yet these metrics are insufficient to detect more subtle differences (Laparra-Hernández et al. 2008). Innovative approaches such as the analysis of physiological response or eye-tracking can provide objective and quantitative information about the behaviour and emotional state of the user during web interaction in real time without interfering in their interaction.

Physiological response

Emotion theories have evolved over history. For example, Aristotle classified emotions into antagonistic pairs, Descartes talked about emotional behaviour, and Darwin postulated that emotions are linked to survival (see Fragopanagos and Taylor, 2005, for a review of classical approaches). In fact, there are more than 500 theories of emotion (Strongman 1996). On the one hand, some researchers have defended the idea that emotions are innate (Duclos et al. 1989), while others have believed that they depend on social influence (Ortony and Turner 1990). However, it is necessary to take into account both approaches (Ekman 1993), which are not necessarily in contradiction but, rather, can be complementary. On the other hand, some authors have defined emotions as physiological changes (James 1890, Lange and Kurella 1887), while others have related emotions to cognitive processes (Cannon 1927). In fact, emotions are related to some of the most important cognitive processes, such as perception, memory, preferences, decision-making, strategic planning, attention, motivation, intention, communication and learning (Nasoz et al. 2003). However, both of these definitions – physiological and cognitive – are also complementary. Different physiological signals, such as galvanic skin response (GSR) (Ward and Marsden 2003), facial electromyography (EMG), electrocardiography (ECG) (Lin and Imamiya 2006), and electroencephalography (EEG) (Zhang, Zheng and Yu 2009), have been used to study human-computer interaction. Scheirer et al. (2002) used Hidden Markov Models based on GSR and blood volume pressure to discriminate between frustration episodes while users solved tasks. These variables, combined with heart rate variability (HRV), are influenced by web design (Ward and Marsden 2003). Moreover, Hazlett (2003) confirmed that activation of the EMG signal on the corrugator supercilii, which is related to experiencing frustration, depends on web complexity. However, Ward and Marsden (2003) mention the difficulty of finding


statistical differences in psycho-physiological responses during HCI due, among other factors, to the lack of controlled experimental settings.

Eye tracking

Eye movement is one example of an external reflection of cognitive processes. In fact, much human interaction with the external world is based on the visual senses (Sibert and Jacob 2000). Humans are able to focus their attention independently of their eye movements, but eye movements are functionally coupled to shifts in attention (Deubel and Schneider 1996). Different parameters, related to displacement and time measurements, are extracted by evaluating eye movement (Poole, Ball and Phillips 2004). The combination of position, displacement and times is called the gaze pattern, and it is defined by two principal parameters: fixations and saccades. Fixations are defined as the permanence of the gaze in a spatial area of 2.5º to 3.5º (Goldberg and Kotval 1999) for a certain period of time that varies depending on the author: for example, Granka, Joachims and Gay (2004) establish this duration as being from 200 to 300 ms, while Duchowski (2007) establishes it as 150 to 600 ms. Fixations have been related to cognitive processes in which users internalize information (Granka, Joachims and Gay 2004). On the other hand, saccades are fast eye movements between two consecutive fixations (Duchowski 2007). Saccades are related to search processes, and it can be considered that the brain is blind during these movements and does not process what the “eye sees” (Duchowski 2007). Moreover, there are small eye movements called microsaccades, which are differentiated from saccades by the fixation radius: all consecutive movements inside this radius are considered as a single fixation (Duchowski 2007). There are other parameters that can give important information about eye behaviour, but most of them can be defined by combinations of fixations and saccades; some examples are the scanpath, eye trajectory, and transition matrix (Poole and Ball 2005). Thus, fixations and saccades are the basic features of eye movements, but the really useful information is obtained by converting this basic data into different metrics or variables, which can be interpreted in order to draw conclusions about the relationship between user and environment. Some of these variables are: the number of fixations (Goldberg and Kotval 1999), fixations per area of interest (Poole, Ball and



Phillips 2004), fixation duration (Just and Carpenter 1976), fixation spatial density (Cowen, Ball and Delin 2002), number of saccades (Goldberg and Kotval 1999), saccade amplitudes (Goldberg et al. 2002), saccade regressions (Sibert, Gokturk and Lavine 2000), and scanpath duration and regularity (Goldberg and Kotval 1999). However, not all of these are suitable for evaluating web usability. The number of fixations, the saccade/fixation duration ratio, and saccade amplitudes are particularly useful for this purpose. Goldberg and Kotval (1999) related a high number of fixations with low search efficiency, long duration ratios with high processing or low search times, and large saccades with good organization of the information. It is important to recognise that these interpretations could vary depending on the context and type of task, presenting inverse duality in their meaning: for example, a higher fixation/saccade ratio indicates more processing or less searching, depending on whether the task has a high cognitive workload or is a data-searching task (Poole, Ball and Phillips 2004).
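As an illustration, the metrics singled out above can be derived from a fixation sequence as follows; treating all non-fixation time as saccade time is a simplifying assumption of this sketch.

# Derive the usability-relevant metrics from a fixation sequence.
# Assumes at least two fixations and that task time not spent in
# fixations counts as saccade (search) time.
import math

def eye_metrics(fixations, task_duration_ms):
    """fixations: list of (x, y, duration_ms) in temporal order."""
    n = len(fixations)
    fixation_time = sum(d for _, _, d in fixations)
    saccade_time = task_duration_ms - fixation_time
    amplitudes = [math.dist(a[:2], b[:2])
                  for a, b in zip(fixations, fixations[1:])]
    return {
        "n_fixations": n,
        "fixations_per_second": n / (task_duration_ms / 1000.0),
        "saccade_fixation_ratio": saccade_time / fixation_time,
        "mean_saccade_amplitude_px": sum(amplitudes) / len(amplitudes),
    }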

Objectives

The aim of this study was to explore the usefulness of innovative approaches such as physiological response and eye-tracking for assessing web usability. The study assessed whether web usability recommendations provided by experts improve usability for users with upper limb disorders, in contrast to users without functional limitations. These innovative approaches can provide information about the effect of web parameters during web-user interaction that is complementary to traditional approaches such as time spent or questionnaires.

Methods

Sample

Two different user profiles were selected to evaluate the effects of the web usability parameters depending on user capabilities: 10 control users without functional limitations, and 10 users with different upper limb disorders. All users, selected from the IBV user database, were from Valencia (Spain) and their first language was Spanish. Both groups were gender-balanced (5 men and 5 women), with a mean age of 36 ± 8.4 years and with a minimal level of primary education. All users had experience in using computers (a mean of 12.6 years with a standard


deviation of 5.1 years), used computers frequently (a mean of 4.3 hours per day with a standard deviation of 3.1 hours per day) and accessed the Internet daily. All users were able to surf with a non-adapted mouse.

Stimulus material and experimental design

Current web usability guidelines and Spanish government web sites were reviewed to find the most widespread web usability parameters. The selected parameters were: “Go Home” (P1), “Go Up” (P2), “Web site Map” (P3), “Hover and Click” (P4), “Background Image” (P5), “Breadcrumbs” (P6) and “Menu Type” (P7). Commercial web sites were not evaluated, in order to retain full control over the influence of each web parameter. Therefore, a factorial design with an orthogonal distribution was built using the previous parameters, generating 16 web styles (Table 7-2) with the same web content in the Spanish language. All parameters could be present or absent, except for “Menu Type”, which had five options depending on the levels of the upper and left-hand menus (Table 7-1).

Table 7-1: Menu Types

Menu Code    Upper Menu    Left-hand Menu
M1           2 levels      –
M2           1 level       1 level
M3           –             1 level
M4           –             2 levels
M5           2 levels      2 levels

The result was 16 Cascading Style Sheets (CSS) that were distributed among the users by the Fedorov algorithm (Fedorov, 1972). This distribution ensured that all user profiles interacted with all web parameters, making a balanced test (Table 7-2).



Table 7-2: Web style distribution. Presence (“X”) or absence (–) of the usability parameters “Go Home” (P1), “Go Up” (P2), “Web site Map” (P3), “Hover and Click” (P4), “Background Image” (P5), “Breadcrumbs” (P6) and “Menu Type” (P7). Each of P1–P6 was present in eight of the 16 styles; the menu type (P7) assigned to web styles 1–16 was, respectively: M1, M4, M4, M3, M2, M2, M2, M5, M4, M3, M3, M3, M1, M4, M5, M2.

Protocol

The user was taken to a comfortable room with soft lighting that allowed suitable reading but avoided light reflections and environmental distractions. The user was seated in front of the computer and received a clear explanation of the test tasks. The experimenter was always seated behind the user, to avoid any interference during user interaction. The experimenter explained the project and the test before the user signed the informed consent. The experimenter then attached the electrodes to record the physiological signals and conducted a gaze calibration to adapt the system to each user and improve eye-tracking precision. ECG electrodes were attached following Einthoven’s triangle, surface EMG electrodes were attached over the corrugator supercilii and the zygomaticus major on the right-hand side of the face (Figure 7-1), and GSR electrodes were attached to the palm of the left hand between the fourth and fifth fingers (Laparra-Hernández et al. 2009).


Fig. 7-1: Attachment of GSR electrodes on the palm (left), EMG electrodes on the corrugator supercilii (centre) and on the zygomaticus major (right)

Next, the participant (user) was asked to make him/herself comfortable and rest for 30 seconds in order to relax. During this time, the screen was totally black and there was no kind of visual or auditory stimulus, so as to correctly record the baseline of the physiological response. Each user participated in 3 sessions over 3 different days, performing 3 repetitions of each task during each session. Therefore, users performed each task 9 times over the full test. During each session, users performed three information-searching tasks, each one with a different web style. There were two types of search tasks: text and images. Users had two minutes to finish each task. If users did not achieve the objective, the experimenter explained how to achieve the objective and the next objective was started. At the end of each task, the user completed a questionnaire to subjectively assess different items related to web usability.

Data analysis

Eye movement was recorded using the Tobii T60 eye tracker system with a sampling frequency of 60 Hz. The EMG, GSR and HR signals were



acquired using the Varioport™ system (Becker Meditec™, Karlsruhe, Germany) with sampling frequencies of 250 Hz, 64 Hz and 120 Hz, respectively. The time spent achieving each objective was also recorded. User opinion was gathered using a questionnaire with 20 specific questions and two global questions about usability and general perception.

Signal processing

The Tobii T60 system provides the raw coordinates (x, y) of gaze at each instant. During testing, some of these positions are lost or rendered invalid due to user movements or because the user’s gaze was away from the screen, so these values were excluded before data processing. Two different values were selected for fixation radius and duration depending on the size of the target: a radius of 50 pixels and a duration of 200 ms for searching for images, and a radius of 20 pixels and a duration of 40 ms for searching for text. The extracted eye-tracking parameters were the saccade/fixation ratio (i.e. the ratio between the time spent searching and the time spent processing), the saccade amplitude, the number of fixations and the number of fixations per second. Moreover, the time spent performing each task was also recorded, and a logarithmic normalization was applied to transform it into a normal distribution. The EMG and ECG signals were filtered with 50 Hz and 100 Hz notch filters to avoid first- and second-order interference from power lines. Moreover, the EMG signals were filtered with a 0.2 Hz high-pass filter to reject the baseline effect, and were fully rectified and filtered with a 4 Hz low-pass filter to obtain the EMG envelope. The ECG signals were filtered with a 0.1 Hz high-pass filter to diminish the baseline effect and artefacts due to body movement (e.g. breathing), and a modified version of the Pan-Tompkins algorithm (Pan and Tompkins 1985) was applied to detect “R” peaks and calculate the heart rate. The GSR signal has two main components: a tonic response, or baseline, and a phasic response, which represents fast fluctuations resulting from external stimuli (Heino, Molen and Wilde 1990). The phasic response was extracted using a 0.1 Hz high-pass filter; the signal was rectified for analysis. Normalization of the physiological data is critical. In fact, many researchers cannot obtain results due to the high variability of physiological signals, especially EMG and GSR, between users and over time. The method for normalizing the EMG and GSR signals is shown in Equation 7-1, where X is the 75th percentile of the signal during the performance of each task, and B is the 75th percentile of the signal during the initial 30 seconds


of black screen (relaxation state). This method allows the data to be normalized, reducing intra- and inter-subject variability. The HR signal was reviewed to avoid missed “R” peaks or false detections. Heart rate variability (HRV) was calculated using the axes of Poincaré plots.

N = (X − B) / B    (Eq. 7-1)
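A minimal sketch of this normalization, together with one possible implementation of the 0.1 Hz phasic GSR filter, is given below; the chapter does not specify the filter design, so the second-order Butterworth choice is an assumption.

# Baseline normalization (Equation 7-1): X and B are 75th percentiles of
# the task window and the 30 s baseline window, respectively.
import numpy as np
from scipy import signal

def normalize(signal_task, signal_baseline):
    X = np.percentile(signal_task, 75)
    B = np.percentile(signal_baseline, 75)
    return (X - B) / B

def phasic_gsr(gsr, fs=64.0):
    """Isolate the phasic GSR component with a 0.1 Hz high-pass filter
    (Butterworth design assumed), then rectify for analysis."""
    b, a = signal.butter(2, 0.1, btype="highpass", fs=fs)
    return np.abs(signal.filtfilt(b, a, gsr))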

Statistical analysis

Principal Component Analysis (Pearson and Schaik 2003), with a Varimax rotation and Kaiser normalization, was used to transform the 20 subjective variables (items in the questionnaire) into a reduced set of independent components that contain the same information. Only components with eigenvalues greater than one were taken into account. The same univariate Analysis of Variance (ANOVA) model was used for each variable: task time, principal components of the questionnaire, physiological variables (HRV, EMGz, EMGc, GSRf) and eye-tracking variables (saccade/fixation ratio, saccade amplitude, number of fixations and number of fixations per second). The factors in the ANOVA model were the seven web parameters, session, repetition and user profile (with versus without motor disorder of the upper limbs). The ANOVA model was defined considering principal factor effects and the first-order interactions between the motor disorder factor and the remaining model factors (Equation 7-2):

Variable ~ (1 + Motor Disorder) × (1 + P1 + P2 + P3 + P4 + P6 + Session + Repetition)    (Eq. 7-2)

Post-Hoc analysis with Bonferroni criteria was used for the analysis of multilevel factors (menus, session and repetition).
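As a hedged illustration of Equation 7-2 (the chapter's analysis was run in SPSS and Matlab, not Python), the model can be expressed in statsmodels formula syntax; the data frame and column names below are hypothetical, with synthetic values standing in for the real measurements.

# Illustrative fit of the Equation 7-2 model with statsmodels; all data
# and column names are synthetic placeholders.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 120
df = pd.DataFrame({
    "y": rng.normal(size=n),                    # e.g. log task time
    "motor": rng.integers(0, 2, n),             # upper-limb disorder (0/1)
    **{p: rng.integers(0, 2, n) for p in ["p1", "p2", "p3", "p4", "p6"]},
    "session": rng.integers(1, 4, n),
    "repetition": rng.integers(1, 4, n),
})

model = smf.ols(
    "y ~ motor * (p1 + p2 + p3 + p4 + p6 + C(session) + C(repetition))",
    data=df).fit()
print(sm.stats.anova_lm(model, typ=2))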

Software tools

Matlab and SPSS were used for signal processing and statistical analysis, respectively. In addition, some self-developed Matlab statistical functions were used for the post-hoc analysis, in order to detect differences inside first-order interactions, which cannot be done with SPSS.



Results

Principal Components Analysis

The usability questionnaire was composed of 21 questions (see Annex 1 for a translated version of the questionnaire). The questions used a 5-level scale (“Completely disagree”, “Somewhat disagree”, “Indifferent”, “Somewhat agree” and “Completely agree”), except the final question, which asks about global usability (Question 21) and is rated from 0 to 10. The questions were highly correlated, so a Principal Component Analysis (PCA) was performed. On the one hand, Questions 13 (“Easy to use”) and 21 (“General usability score”) were not included in the analysis because they provide a global view of usability and could be used in the future as output variables of a linear regression model. On the other hand, after applying PCA, Question 15 (“Need to learn”) was excluded because its correlations with the other questions were low and it could mask the results; Question 15 was therefore considered an independent component in itself. PCA reduced the original set of 18 questions to four uncorrelated components that explain more than 60% of the variance. The threshold for belonging to a principal component was 0.7. Table 7-3 shows the selected components and their correlations with the original questions. The subjective assessment was thus divided into five components:

1. “Aesthetics” (PC1): Questions 12, 17, 18 and 19.
2. “Orientation and Clear Layout” (PC2): Questions 10 and 7.
3. “Easy to Find” (PC3): Questions 11, 4 and 1.
4. “Simplicity and Order” (PC4): Questions 16 and 20.
5. “Need to Learn” (PC5): Question 15 (independent of the other questions).
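For illustration, a similar reduction can be sketched in Python; SPSS-style “PCA with Varimax rotation” is approximated here with scikit-learn's FactorAnalysis and a varimax rotation, applied to synthetic stand-in responses rather than the study's questionnaire data.

# Approximate the questionnaire reduction: factor analysis with varimax
# rotation over synthetic 5-point responses for 18 items.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(200, 18)).astype(float)

fa = FactorAnalysis(n_components=4, rotation="varimax")
fa.fit(responses)

# Loadings above 0.7 would assign each question to a component,
# mirroring the threshold used for Table 7-3.
loadings = fa.components_.T          # shape: (18 items, 4 components)
assignments = np.abs(loadings) > 0.7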


Table 7-3: Correlation matrix of rotated components and questions (loadings above the 0.7 component threshold)

Questions                            PC1      PC2      PC3      PC4
Q_18: Suitable Aesthetic            0.892
Q_17: Like Aesthetic                0.881
Q_19: Good Design                   0.875
Q_12: Nice to See and Use           0.827
Q_10: Section Information                    0.750
Q_07: Descriptive Links                      0.734
Q_11: Tasks without Errors                            0.878
Q_04: Easy to Find Information                        0.839
Q_01: First Glance Key Elements                       0.831
Q_16: Easy to Read Texts                                       0.789
Q_20: Simple                                                   0.741

Key: Rotation method: Varimax with Kaiser normalization. Question labels (Annex 1) were translated from Spanish. Questions with no loading above the 0.7 threshold (Q_02, Q_03, Q_05, Q_06, Q_08, Q_09, Q_14) are omitted.

Comparison between traditional and innovative approaches

The significance (“-” p > 0.05, “+” 0.01